Thank you for the answer!
So, both options are server-side only.
The first one won't work for us, because it allows only one watermark (we need dynamic captions, several per video, added on the fly depending on the live stream content).
The second option could work in theory, but, again, the server would need to receive the caption text promptly while the stream is running, which doesn't sound trivial.
And just to be sure: is there any client-side way to add dynamic captions to the stream? I was thinking about a canvas-based solution, where the camera stream is first drawn onto a canvas, but that won't work in iOS Safari, as it relies on
which isn't supported there...
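
For reference, the drawing step of the canvas idea would look roughly like the sketch below. This is only an illustration of what I have in mind, not working production code: `drawCaptionedFrame`, the `Ctx2D` interface, and the font and layout values are all names and numbers I made up. `Ctx2D` lists just the 2D-context members the sketch uses, so it type-checks without DOM typings. It covers only drawing a frame plus captions, not turning the canvas back into a stream.

```typescript
// Hypothetical minimal subset of the canvas 2D context used by this sketch.
// A browser 2D context provides all of these members.
interface Ctx2D {
  font: string;
  fillStyle: unknown;
  textAlign: string;
  drawImage(source: unknown, dx: number, dy: number, dw: number, dh: number): void;
  fillText(text: string, x: number, y: number): void;
}

// Draws one camera frame and then the currently active captions on top of it.
// `frame` would be the <video> element showing the camera stream (assumption).
// Captions are centered horizontally and stacked upwards from the bottom edge.
function drawCaptionedFrame(
  ctx: Ctx2D,
  frame: unknown,
  width: number,
  height: number,
  captions: string[],
): void {
  const bottomMargin = 20; // illustrative layout values
  const lineHeight = 30;

  // Paint the video frame first so the text ends up above it.
  ctx.drawImage(frame, 0, 0, width, height);

  ctx.font = "24px sans-serif";
  ctx.fillStyle = "white";
  ctx.textAlign = "center";

  captions.forEach((text, i) => {
    ctx.fillText(text, width / 2, height - bottomMargin - i * lineHeight);
  });
}
```

In the browser this would be called once per frame, for example from a `requestAnimationFrame` loop, with the context obtained from `canvas.getContext("2d")` and the caption list updated whenever the captions change.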