We all read the news recently about YouTube opening the doors to WebRTC as a way to start a live stream. The webrtcHacks guys, and the amazing Philipp Hancke in particular, immediately did their usual great job in studying how that works. Among other things, they found out that, as too often happens (and without any valid reason at all, really), this only works if you’re using Chrome. Firefox? No luck. Edge? Get out of here! Safari? Well, I don’t care much about Safari, to be honest… 🙂
Anyway, this got me thinking. What about playing with this myself, to get Firefox to send something via WebRTC, and see it published live on YouTube instead? Maybe with some HTML5 canvas stuff to add some spice. With the Kamailio World Dangerous Demos season opening (and if you’ve ever attended or participated in the Dangerous Demos, you know how James Body will hunt you down until you agree to participate! 😀 ), this became the perfect opportunity to tinker with this, which is exactly what I did! Well, kind of…
What I needed was:
- a way to capture video in the browser, edit it somehow, and use it in a WebRTC PeerConnection;
- a WebRTC server to receive the stream from the browser;
- something to translate that stream to whatever makes YouTube Live happy.
The first part was interesting, as I had never done that before. Or to be more precise, I have obviously captured and published a ton of WebRTC streams in these past few years, but I had never tried mangling the captured video on the browser side. I knew you could kinda use HTML5 canvas elements for this, but I had never played with it, so I decided to do that now. And boy, what fun it was!
It basically sums up to a few different steps:
- create an HTML5 canvas element to draw on;
- get a MediaStream via the usual getUserMedia() call;
- put that MediaStream in an HTML5 video element;
- start painting the video frames on the canvas, plus other stuff that may be nice (text overlays, images, etc.);
- get a new MediaStream from the canvas (via captureStream());
- use that new MediaStream as a source for a new PeerConnection;
- keep on drawing on the canvas as if there’s no tomorrow!
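In case it helps, the steps above translate to surprisingly little JavaScript. This is just a minimal sketch of the idea (browser-only code; the element sizes, frame rate and names are arbitrary examples of mine, not the exact code I used):

```javascript
// Minimal sketch of the canvas capture pipeline (runs in a browser).
// Sizes, frame rate and variable names are made-up examples.
async function startCanvasPublishing() {
  // 1. Create a canvas element to draw on
  const canvas = document.createElement('canvas');
  canvas.width = 640;
  canvas.height = 480;
  const ctx = canvas.getContext('2d');

  // 2. Get a MediaStream via the usual getUserMedia()
  const webcam = await navigator.mediaDevices.getUserMedia({ video: true });

  // 3. Put that MediaStream in a (hidden) HTML5 video element
  const video = document.createElement('video');
  video.srcObject = webcam;
  await video.play();

  // 4. Keep painting the video frames (and any overlays) on the canvas
  function draw() {
    ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
    // ...text overlays, logos, etc. would be drawn here...
    requestAnimationFrame(draw);
  }
  draw();

  // 5. Get a new MediaStream from the canvas (30fps here)...
  const stream = canvas.captureStream(30);

  // 6. ...and use it as the source for a new PeerConnection
  const pc = new RTCPeerConnection();
  stream.getTracks().forEach(track => pc.addTrack(track, stream));
  return pc;
}
```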
Sounds like a lot of steps, but they were actually very easy to set up and go through. In just a few minutes, I had some basic code that allowed me to capture my webcam and add some overlays to it: a logo in the upper right, a semitransparent bar at the bottom, and some text over that bar. Tinkering with the code some more, I made all of those updatable on the fly as well. I’m sure many of you who played with canvas before are laughing at how ridiculously simple this is for you, but for poor old me it was a big accomplishment! 😀
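For the sake of example, the overlay part of a draw loop like that boils down to a few 2D context calls; something along these lines (the image path, sizes, colours and the drawOverlays name are all placeholders I just made up):

```javascript
// Hypothetical overlay drawing: a logo in the upper right, a
// semitransparent bar at the bottom, and some text over the bar.
const logo = new Image();
logo.src = 'logo.png';  // placeholder path

function drawOverlays(ctx, canvas, text) {
  // Logo in the upper right corner
  if (logo.complete) {
    ctx.drawImage(logo, canvas.width - 110, 10, 100, 50);
  }
  // Semitransparent bar at the bottom
  ctx.fillStyle = 'rgba(0, 0, 0, 0.5)';
  ctx.fillRect(0, canvas.height - 60, canvas.width, 60);
  // Text overlay on the bar (can be changed at any time,
  // which is what makes the overlays dynamic)
  ctx.fillStyle = 'white';
  ctx.font = '24px sans-serif';
  ctx.fillText(text, 10, canvas.height - 25);
}
```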
Anyway, the cool part was that I had some basic video editing working in a test web page, and a way to use that as a source for a PeerConnection. The next step was getting this WebRTC stream to a server where I could play with it some more. Unsurprisingly enough, I used Janus for the purpose… The idea was simple: I needed something that would allow me to receive the WebRTC stream, and then use it somewhere else. Considering this is one of the key reasons I started working on Janus in the first place, a few years ago, it was obviously the perfect choice! Specifically, I decided to use the Janus VideoRoom plugin for that. In fact, as anticipated, I needed a way to make the incoming WebRTC stream available to an external component for processing, in this case for translating it to the format YouTube Live expects for publishing purposes. The VideoRoom plugin, besides its integrated SFU functionality, has a nice feature called “RTP Forwarders”, which allows for exactly that. I’ll explain later why and how it helped in this context.
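Just to give an idea, creating such a forwarder boils down to a single VideoRoom plugin request, along these lines (the room, publisher ID, host and ports here are made-up examples; check the VideoRoom plugin documentation for the exact syntax and the full list of properties):

```
{
  "request": "rtp_forward",
  "room": 1234,
  "publisher_id": 5678,
  "host": "127.0.0.1",
  "audio_port": 5002,
  "video_port": 5004
}
```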
Finally, I needed something to translate the WebRTC stream to the format YouTube Live expects. As you may know, the traditional way to do that is to use RTMP. There are several different tools that can help with that, but I chose to go simple, and use FFmpeg for the job: in fact, I didn’t really need any editing or publishing functionality (that I had done already), but only something that could translate to the right protocol and codecs, which is what FFmpeg is very good at. Obviously, in order for this to work, I first needed to get the WebRTC stream to FFmpeg, which is where the aforementioned “RTP Forwarders” can help. Specifically, as the name suggests, “RTP Forwarders” simply relay/forward RTP packets somewhere: in the context of the Janus VideoRoom, they provide a way to relay media packets coming from a WebRTC publisher to one or more remote addresses using plain (or encrypted, if needed) RTP. Since FFmpeg does support plain RTP as an input format (using an SDP description to bind on the right ports and specify the audio/video codecs in use), this was the best way to feed it with the WebRTC media streams!
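As a concrete example, the SDP file to pass FFmpeg might look like this (addresses, ports and payload types are made-up, and obviously need to match what the RTP Forwarders were configured with; Opus and VP8 being the codecs a browser will typically send):

```
v=0
o=- 0 0 IN IP4 127.0.0.1
s=Janus RTP Forwarders
c=IN IP4 127.0.0.1
t=0 0
m=audio 5002 RTP/AVP 111
a=rtpmap:111 opus/48000/2
m=video 5004 RTP/AVP 96
a=rtpmap:96 VP8/90000
```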
At this point, I had all I needed:
- the browser as an editing/publishing software (canvas+WebRTC);
- Janus as an intermediary (WebRTC-to-RTP);
- FFmpeg as a transcoder (RTP-to-RTMP).
That said, the last step was testing all this. It all worked as expected in a local test with good old red5 as an open source RTMP server, but clearly the real challenge was getting it to work with YouTube Live itself. So I went to the dashboard for the Meetecho YouTube account, verified it, waited the usual 24 hours sipping a soda, and then got the required info to publish a stream. This basically consists of the RTMP server to connect to, and a unique (and secret) key to identify the stream.
Digging around, I found some nice snippets that showed how to stream to YouTube Live with FFmpeg, adapted one of them to use my own source and target info, in order to publish there instead of my local RTMP server, and voilà! I got it working! Well, it didn’t always work perfectly, and there were some issues here and there, but for a demo it worked quite nicely 🙂
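In case you want to try something similar, the FFmpeg incantation looked more or less like this (the SDP filename is a placeholder, the encoding settings are just reasonable examples, and you’ll obviously have to use the RTMP URL and the stream key YouTube gives you in the dashboard):

```shell
# Read the plain RTP streams described in the SDP file, transcode
# to H.264/AAC, and push the result to YouTube Live via RTMP.
ffmpeg -protocol_whitelist file,udp,rtp -i /path/to/janus.sdp \
  -c:v libx264 -preset veryfast -g 60 -b:v 2500k \
  -c:a aac -b:a 128k -ar 44100 \
  -f flv rtmp://a.rtmp.youtube.com/live2/YOUR-STREAM-KEY
```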
That’s it, really: no other “magic” needed. This could easily be turned into a service of sorts, by improving the editing part with some serious canvas work (what I did was really basic) and making the “RTP Forwarding + FFmpeg + YouTube Live credentials” part dynamic (e.g., in terms of ports and accounts to use), in order to support multiple streamers and multiple events, but the nuts and bolts are there.
Yes, I know what you’re thinking: this is kind of smoke and mirrors, really. Meaning that yes, I’m using WebRTC to publish, and yes, it’s getting to YouTube Live eventually, but it’s not a direct step. What I did was basically take advantage of Janus’s flexibility to handle and process a WebRTC stream, and have an FFmpeg helper then do the actual broadcasting to YouTube “Ye Olde” way. Anyway, it’s still quite cool! The usage of an HTML5 canvas on the client side made it easy to somehow “edit” the published video, giving me quite some creative freedom. Besides, it still felt good to use WebRTC for that!
Looking forward to your thoughts!
PS: If you’re interested in seeing this in action, as anticipated I plan to try and show that (possibly with even more dangerous tweaks and approaches) during the Dangerous Demos session at the upcoming Kamailio World, which will happen on Day Two. I have a presentation on the same day on Security, Authentication and Privacy in WebRTC, so that may be one more reason to be there 😉
I’ll also attend the OpenSIPS Summit in Amsterdam in a couple of weeks, to present a work done together with the amazing QXIP guys, so that might be a good opportunity to see this live as well.