What’s cooking in Janus?

April 30, 2020 Lorenzo Miniero

It’s been a while since I last wrote a blog post here: more precisely, it was more than 4 months ago, when I documented our efforts in bringing Real-Time Text support to WebRTC and the SIP plugin. I realized that, for the amount of things we work on every day, we do a pretty bad job at telling people about them, which means many may not actually be aware of cool new features or experiments we’ve started, or even completed. This is even more important now, in a world that has so suddenly and dramatically changed in these past few weeks: real-time communications have become increasingly important, probably more than they ever were, and more and more people and companies have indeed started looking at Janus for their WebRTC needs.

As such, rather than write another post on some specific topic, I thought I’d do a little summary of some of the things that have happened in Janus these past few months, and what’s actually cooking right now. Hopefully that will intrigue you enough to start tinkering with them, possibly surprise you with things you didn’t know were possible already, and why not, help you come up with new ideas for that cool WebRTC project you always wanted to make!

We’ll begin with what’s already here, and then move to what we’re working on, so buckle up and let’s get started…

I (kinda) learned how to make a pizza at home!

Is this important? Well, that depends on how often you eat pizza, and how much you enjoy it… Living in the land where it was born and made amazing, I used to eat one AT LEAST once a week, and with the whole country being on lockdown (which included delivery services being shut down too) that was a huge blow! So yes, it was important for me…

Is this relevant to Janus, or this blog post for that matter? Well, it’s “cooking”, but apart from that, no, not at all!

So why am I still typing this section? Not sure… let’s move on to some actually useful info instead.

A ton of fixes!

Don’t find this exciting? Well, you should! It may be an anticlimactic way to start this roll, but I actually think this is a big deal. While new features are always exciting (and I definitely like jumping on new stuff!), stability and reliability are actually very important when it comes to deploying a WebRTC service in production. We worked a lot on that these past few months, and our recent integration of fuzzing techniques, static analysis via Coverity and continuous integration via Travis CI definitely helped with that.

One of the most important fixes was related to occasional audio/video desyncs that could happen from time to time: after investigating the issue for quite some time, we eventually came up with a fix on our RTCP code that finally put a stop to that!

But we actually fixed a lot of stuff pretty much everywhere, ranging from ICE-related issues to random crashes or deadlocks that could happen in different plugins for different reasons. While the journey to bug-free software is one that admittedly will never end, I think it’s safe to say that Janus has never been as stable as it is now!

HTTP transport is now single threaded

This took a while (the pull request had been open for a couple of years before we merged it), but it was an important change. All our transport plugins typically use a single thread, or a handful of them, to handle incoming traffic for the Janus and Admin APIs. Up to some time ago, this was NOT the case for the HTTP transport, in part due to the different nature of HTTP when used as a bidirectional application-level protocol, and in part due to how we were using the features provided by libmicrohttpd, the library we use for the HTTP server functionality.

With HTTP, every request may or may not use new connections; besides, since HTTP needs approaches like long-polling to perform asynchronous notifications, this typically requires some of those connections to be kept open and idle for some time, and unavailable for other requests. For this reason, for a long time we actually configured libmicrohttpd to use a separate thread to notify us about each request: this ensured that we’d never end up not being notified about incoming messages because we were using, e.g., a smaller pool of threads all filled up with pending long-polls. Needless to say, a thread per connection could end up resulting in A LOT of short-lived threads spawning in Janus, especially when dealing with many users; besides, it could result in sometimes processing requests out of order, depending on which request we’d be notified about first. As such, it was way less than ideal.

After some enhancements were made to the libmicrohttpd library to properly support single threading (which was possible already, but apparently not always reliable), we started investigating how to take advantage of the feature, which is how we ended up refactoring the whole transport plugin to do exactly that. It took a few iterations to get it right (some bugs were opened and promptly fixed), but now it’s working as expected. This provided a considerable boost in performance, as it greatly reduced the number of threads in Janus.
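To give a rough idea of why a single thread can be enough, here is a minimal sketch in Python. It does not use libmicrohttpd and is not how the Janus HTTP transport is written: it swaps in `asyncio` to show the general idea of parking many idle long-polls on one thread. The `LongPollHub` class, its method names and the shape of the events are all made up for the example.

```python
import asyncio


class LongPollHub:
    """Toy event hub: a single thread serves every pending long-poll."""

    def __init__(self):
        self.queues = {}  # session id -> asyncio.Queue of pending events

    def _queue(self, session_id):
        return self.queues.setdefault(session_id, asyncio.Queue())

    def push_event(self, session_id, event):
        # Called when there is something to notify to a session
        self._queue(session_id).put_nowait(event)

    async def long_poll(self, session_id, timeout=30.0):
        # Park the request without blocking a thread: while this coroutine
        # waits, the same thread keeps serving all the other requests
        try:
            return await asyncio.wait_for(self._queue(session_id).get(), timeout)
        except asyncio.TimeoutError:
            # Nothing happened in time: answer with a placeholder event
            return {"type": "keepalive"}


async def demo():
    hub = LongPollHub()
    # 100 idle long-polls, all parked on the same thread
    polls = [asyncio.create_task(hub.long_poll(i, timeout=0.2)) for i in range(100)]
    await asyncio.sleep(0)  # let the polls start waiting
    hub.push_event(42, {"type": "event", "data": "hello"})
    return await asyncio.gather(*polls)
```

Running `asyncio.run(demo())` returns one result per poll: session 42 gets its event, all the others time out with the placeholder, and no extra thread is ever spawned.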

String UUIDs in several plugins

While this may not look like an important change, it’s actually something many developers were interested in since the very beginning of Janus.

Historically, Janus has always used large integers for most of its unique identifiers: this included, for instance, the identifiers used to address conference rooms, participants, streaming mountpoints and so on. Initially this was mostly out of concerns about lookup performance, but with the hashing functionality that glib provides this is actually a non-issue. As such, it’s really just a convention right now, and most identifiers are still numeric just to ensure backwards compatibility (as changing the format of identifiers may have a considerable impact on the Janus API, for instance).

That said, there are cases where using strings rather than numbers may actually be pretty advantageous: e.g., Janus may be part of a service that uses UUIDs for identifying users, and so being able to use the same address space in the involved plugins too may simplify the process (which would otherwise require keeping a mapping between UUIDs and unique numeric identifiers somewhere).

To cover that requirement, we added support for string-based identifiers to a selected group of plugins, namely the VideoRoom, AudioBridge, TextRoom and Streaming plugins. Which kind of identifier to use can be chosen at startup, which makes this both backwards compatible (in case you were fine with numbers and want to keep on using those) and forward looking (in case you needed strings instead).
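As a toy illustration of the idea (not the plugins’ actual code), the sketch below keeps rooms in a hash table whose key type is decided once, at startup. `RoomRegistry` and the `string_ids` flag are names chosen for this example only.

```python
import uuid


class RoomRegistry:
    """Toy room table whose identifier type is fixed at startup."""

    def __init__(self, string_ids=False):
        self.string_ids = string_ids
        self.rooms = {}  # hash table: int and str keys are looked up the same way

    def _check(self, room_id):
        expected = str if self.string_ids else int
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(room_id, bool) or not isinstance(room_id, expected):
            raise TypeError(f"room id must be of type {expected.__name__}")
        return room_id

    def create(self, room_id, description):
        self.rooms[self._check(room_id)] = description

    def lookup(self, room_id):
        return self.rooms.get(self._check(room_id))


# Default behaviour: numeric identifiers, as before
numeric = RoomRegistry()
numeric.create(1234, "Demo room")

# Opt-in behaviour: reuse the application's own UUIDs, no mapping table needed
room_uuid = str(uuid.uuid4())
strings = RoomRegistry(string_ids=True)
strings.create(room_uuid, "Room addressed by UUID")
```

With strings enabled, the application’s UUID is the room identifier itself, which is exactly the mapping step that goes away.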

DSCP configuration for WebRTC traffic

Being able to give different priorities to WebRTC and other traffic can be quite important from a network perspective, and a good overview of why was given a couple of years ago in this informative blog post by the always knowledgeable Gustavo Garcia. In that post, Gustavo showed WebRTC developers how DSCP (Differentiated Services Code Point) could be used for the purpose, and a possible way to tell the browser to take advantage of it.

In the meantime, work has started to provide a proper API to tell browsers how to configure a custom priority for WebRTC traffic. That said, that only covers WebRTC traffic going out of browsers, while we were also particularly interested in figuring out how we could do the same for traffic going out of Janus. Luckily, the library we use for exchanging media packets, libnice, provides a simple API to configure a ToS (Type of Service) value, which is what we used to enforce a configurable DSCP setting. While at the moment this is a global value that impacts all PeerConnections, the idea is to make this actually configurable per-PeerConnection too in the future.
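For those not familiar with how DSCP relates to ToS, here is a small sketch using the plain socket option in Python. This is not the libnice API and not Janus code: the function names are invented for the example. The only fact it relies on is that the 6-bit DSCP value sits in the upper bits of the ToS byte, with the two lowest bits left to ECN.

```python
import socket


def dscp_to_tos(dscp):
    """Place a 6-bit DSCP value in the upper bits of the ToS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits (0-63)")
    return dscp << 2  # the two lowest bits are left to ECN


def open_marked_udp_socket(dscp):
    """Return an IPv4 UDP socket asking the OS to mark outgoing packets."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock
```

As an example, Expedited Forwarding is DSCP 46, so `dscp_to_tos(46)` gives 184 (0xB8) as the ToS value. Whether the marking is actually applied and honoured depends on the operating system and on the networks the packets cross.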

Multiple lines and call transfer in SIP plugin

SIP news usually doesn’t warm many hearts, especially those of people who look at WebRTC for what it can bring in terms of innovation to the world of communications. That said, SIP is still a widespread protocol, and there are A LOT of people using Janus as an easy and quick way to turn their “legacy” SIP infrastructure into something that is WebRTC compliant right away.

For this reason, we actually try and improve the SIP plugin all the time, in order to support as many features as are usually needed, while still keeping it easy to use. A few months ago, a company interested in additional functionality sponsored the development of several new features, including:

support for SIP SUBSCRIBE/NOTIFY;
support for multiple call lines;
support for call transfer (blind and attended).

While support for SIP events was relatively easy to do, the other two features required some more work, and ended up making the SIP plugin much more powerful than it was originally. The ability to have multiple lines associated with the same SIP account made implementing web-based applications for replacing desktop phones (e.g., for agents) much easier, and the new support for call transfer filled a long-standing gap as well.

This was actually supposed to be the main topic of my presentation at this year’s OpenSIPS Summit, which has unfortunately been canceled due to the ongoing epidemic. I still want to share some more information about this, though, so I may end up writing a dedicated blog post sooner or later.

Playback of static Opus files

With WebRTC so focused on streaming live content in real-time, we sometimes forget that there are cases where being able to stream or inject pre-recorded content can actually be quite helpful as well. In the vast majority of cases where this is needed, in Janus we typically rely on the Streaming plugin, and then leave the playout of pre-recorded content up to some external application (e.g., GStreamer or FFmpeg), whose responsibility is packetizing, and possibly transcoding, some existing file to then stream via RTP to Janus. That said, for audio streaming there are easier approaches that don’t require these external sources. Specifically, since Opus is mandated in WebRTC, it’s sometimes easier to just go through an Opus file manually, and have one of the Janus plugins generate RTP packets accordingly.

This is exactly what we ended up doing for two different plugins. Specifically, we first added support for this feature to the Streaming plugin (to allow the setup of mountpoints associated to high quality pre-recorded audio content), and then did the same for the AudioBridge plugin as well (to facilitate use cases like announcements and music-on-hold).
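To make the “generate RTP packets accordingly” part a bit more concrete, here is a rough Python sketch of the packetization step alone. It is not the code of either plugin: it assumes the Opus frames have already been extracted from the file, that each frame lasts 20 ms, and that the payload type is 111 (all assumptions made for the example). What it does rely on is the standard 12-byte RTP header and the fact that Opus always uses a 48 kHz RTP clock.

```python
import struct

OPUS_CLOCK_RATE = 48000  # the RTP clock rate for Opus is always 48 kHz


def rtp_packets(opus_frames, ssrc, payload_type=111, frame_ms=20, seq=0, timestamp=0):
    """Wrap already-encoded Opus frames in minimal 12-byte RTP headers."""
    ts_step = OPUS_CLOCK_RATE * frame_ms // 1000  # 960 ticks for a 20 ms frame
    for index, frame in enumerate(opus_frames):
        first_byte = 0x80  # version 2, no padding, no extension, no CSRC
        marker = 0x80 if index == 0 else 0  # mark the start of the stream
        second_byte = marker | (payload_type & 0x7F)
        header = struct.pack(
            "!BBHII",
            first_byte,
            second_byte,
            seq & 0xFFFF,
            timestamp & 0xFFFFFFFF,
            ssrc,
        )
        yield header + frame
        seq += 1
        timestamp += ts_step
```

A real sender would also need to read the frames out of the container and pace the packets in real time (one every 20 ms here), which this sketch leaves out.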

In order to showcase how this works in practice, we configured a couple of Streaming plugin mountpoints in the online demo to use that functionality (where I shamelessly couldn’t resist the temptation of streaming one of my music efforts).

Data channel improvements

While data channels are typically not as used as audio and video, in WebRTC, they’re not second-class citizens in Janus at all, as we also explained when delving a bit more in detail in this very blog. As such, it should come as no surprise that we regularly improve the support we have for data channels in Janus.

One enhancement we made a few weeks ago was the support for binary data. Since the very beginning, data channel support in Janus had been limited to text only: this meant that, in order to support binary data, you were basically required to use text-based encodings like Base64 to be able to relay the data among WebRTC participants via Janus. Now binary data is explicitly supported, which makes it much easier to accommodate some custom use cases as well.

We already anticipated, in another blog post, some new data channel functionality we’re working on as part of the Real-Time Text effort: in fact, the specification for using RTT over WebRTC requires support for data channel subprotocols. Since in Janus we didn’t support that yet, we implemented it as part of that effort. Considering that support for subprotocols may actually be beneficial to use cases other than just RTT, we’re planning to move that part to a separate pull request as well, in order to have it merged sooner.

Finally, there’s another branch that is just sitting there and waiting to be merged, which is aimed at making the delivery of data channel messages more reliable. More precisely, this patch adds a new callback that Janus plugins can implement, which notifies them any time a data channel becomes writable. This is actually useful for two different reasons:

it lets the plugin be aware of exactly when the data channel has really been established (success in establishing a PeerConnection is not a valid trigger for that, since it precedes the SCTP handshake), and so when data can be first sent to a user;
it tells the plugin when the internal SCTP buffers are empty, and so data can be queued.

Both are actually quite important events, as they allow plugins to properly decide when they can start sending data, and possibly also how to pace the delivery of larger amounts of it.
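To show how a plugin might use such a notification for both purposes, here is a small Python sketch. It is not the Janus plugin API: `PacedDataSender`, `on_writable` and the `send` function it wraps are hypothetical names, and the burst size is an arbitrary value picked for the example.

```python
from collections import deque


class PacedDataSender:
    """Queue outgoing messages and release them only when the channel is writable."""

    def __init__(self, send, burst=16):
        self.send = send      # hypothetical function handing data to the SCTP stack
        self.burst = burst    # how many messages to release per writable event
        self.pending = deque()
        self.ready = False    # becomes True at the first writable notification

    def queue(self, message):
        # Never write directly: the channel may not be established yet
        self.pending.append(message)

    def on_writable(self):
        # Fired when the channel is first usable, then again when buffers drain
        self.ready = True
        for _ in range(min(self.burst, len(self.pending))):
            self.send(self.pending.popleft())
```

Nothing is sent before the first notification, and a large backlog is then drained one burst at a time, each burst being triggered by the buffers becoming empty again.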

Insertable Streams and E2EE

I’m sure you’ve heard a lot about end-to-end encryption and real-time communications, these past few weeks, especially when it turned out companies like Zoom were abusing the term for something that wasn’t really E2EE at all.

While WebRTC does support E2EE by design, that’s only true for sessions that are truly peer-to-peer: any time you go through a server like Janus, what you end up with are still secure sessions, but only hop-by-hop. This means that you’ll have secure channels with the server, and so will the people you communicate with, but the server will have access to the unencrypted media. This is usually fine, as the server often needs the unencrypted media in order to provide some additional feature, but such access is not always required. This is what led to different efforts in the standardization community to come up with ways to provide actual E2EE even through a media server.

A few years ago an approach called PERC Lite was proposed, which we did implement as a branch in Janus, but it was eventually discarded. One of the newest efforts in that direction is Insertable Streams, which basically allows the JavaScript developer to “transform” the encoded streams on the way in and out. If such transforms actually implement encryption, the end result is that the streams that travel via WebRTC will still be routable by the server, but without giving the server access to the media content. A good example of such a crypto transform is SFrame.

Since this is obviously of interest to the community, we added initial experimental support for Insertable Streams in Janus as well. The branch also includes a modified version of the EchoTest demo as a simple proof of concept: the demo uses some simple transforms to manipulate the traffic, to show how the resulting media is accessible to the user but not to Janus itself. We tested this briefly with the VideoRoom as well, just to confirm that even in an SFU scenario only people with the right transforms could consume the media, while others (potentially malicious participants) could not (which is where the silly screenshot above comes from).
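To give a feel for what a “simple transform” can look like, here is a toy version written in Python rather than browser JavaScript. It is not the code of the demo and it is definitely NOT real encryption: it just XORs the encoded frame with a repeating key. The function name, and the choice of leaving the first 10 bytes of each frame untouched, are assumptions made for this sketch.

```python
def xor_transform(frame, key, clear_bytes=10):
    """Scramble an encoded frame, leaving its first bytes untouched."""
    if not key:
        raise ValueError("key must not be empty")
    head, body = frame[:clear_bytes], frame[clear_bytes:]
    # XOR is symmetric: applying the same key twice restores the original
    scrambled = bytes(b ^ key[i % len(key)] for i, b in enumerate(body))
    return head + scrambled


frame = bytes(range(32))                 # stand-in for an encoded video frame
sent = xor_transform(frame, b"secret")   # sender side, before packetization
seen_by_server = sent[10:]               # scrambled: no usable media here
received = xor_transform(sent, b"secret")  # receiver with the right key
garbage = xor_transform(sent, b"wrong!")   # receiver with the wrong key
```

The server in the middle can still forward the frames, but only a receiver applying the matching transform gets the original frame back, which is the property a real crypto transform would provide with actual security guarantees.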

Surround audio via Opus

Audio in WebRTC is quite great already, since Opus provides excellent quality. Besides, since it seamlessly supports different sampling rates, it’s quite flexible as well. The only limitation it had so far was that it was stuck to two channels tops: that said, in the vast majority of cases, you’ll just see mono streams anyway, with a few stereo streams for some more specific use cases.

Recently, though, our good friends at CoSMo Software found a way to enable support for multiple audio channels in libwebrtc, which of course opened the door to a ton of different opportunities. In fact, this basically allows having surround audio (5.1 and 7.1) in WebRTC sessions, which is quite exciting in a range of different application scenarios. To give a simple example, it’s basically the feature that Stadia itself is using to provide surround audio in the games they stream.

As soon as we had enough information on how to proceed, we added support for this additional flavour of Opus in Janus too. At the moment, the support is still experimental, and is currently hardcoded to 5.1 (no 7.1 support yet); that said, it works already, and the branch comes with a modified version of the EchoTest demo to show how it works in practice.

AV1 and H.265 are coming!

The codec war between VP8 and H.264 hasn’t even cooled down yet, and we may already see a new one on the horizon…

As you probably know already, AV1 is an exciting new royalty-free codec that many big companies started working on together, and it is full of interesting functionality like native SVC support; thanks to the efforts of our pals at CoSMo Software (them again!) it will soon be part of the codecs supported by Chrome in WebRTC, so we’d definitely love to have it in Janus soon.

On the other hand, Safari recently started experimenting with H.265/HEVC as a new codec in their WebRTC stack. Just like H.264, H.265 is a nightmarish mess when it comes to licensing, so you can easily figure out where I stand here… that said, it may make sense to at least have some form of support for that codec too, in case it’s useful to some people in some scenarios.

This is what led me to start working on adding preliminary support for both, in this new branch. It’s still very much WIP, and far from done, but it should already allow you to use those codecs for some tests, so I encourage you to give it a try and provide feedback!

Hey, what about JanusCon?!

Good question! We hosted our first JanusCon here in Naples last September, and we had an amazing time; we chatted with interesting people, ate great food and saw some great presentations from people all over the world. It goes without saying that we definitely want that to happen again!

That said, we all know how the world is a very different place than the one we lived in last September… that’s unlikely to change in the few months that separate us from the Fall, or at least change enough to make travel really safe again, so we’ll most definitely not host a second edition in person this year.

BUT! Hey, we’re supposed to be experts in real-time communications, and we’ve actually provided streaming and remote participation services for a long time too. As such, we’ll most definitely organize this new JanusCon² edition as an online event, so stay tuned for more updates in the next few weeks! Will it be the same thing? Probably not: one of the best things in conferences like this is meeting amazing people and spending quality time together. Even worse: it means NO Napoli pizza for you… But what we can promise is that we’ll try to have some amazing content to share!

That’s all, folks!

Hope you enjoyed this post! Please stay safe out there, as I want to meet you all (again or for the first time) in person in the real world, sooner or later.

Happy hacking!

Lorenzo Miniero
