In the last blog post we introduced the Janus Event Handlers, and how they can be used to be asynchronously notified about whatever is happening within a Janus instance. We also presented a real example, where we saved the information associated with an EchoTest plugin session and the related WebRTC PeerConnection to a database for some analysis and evaluations.

While a useful introduction, that article was not a comprehensive view of what can be done with events, especially in a complex application that may involve multiple multimedia streams, or even multiple Janus plugins at the same time. In fact, when this happens another question comes to mind: is it possible to correlate information coming from different sources? Is this automatic, or is there work to do? The answer is that, yes, you can indeed correlate information coming from different events in Janus.

First of all, we should define what we mean by correlation. Do we want to know which handles/PeerConnections belong to the same person? Or simply know what transport that person is using to talk to Janus? Figure out the topology of an application in general, e.g., who’s publishing and who’s subscribed to who in a videoconference? All of the above? Whatever it is you’re interested in, Janus events do provide you with ways to implement some correlation. That said, different pieces of information may be in different events, which means you may have to combine them in order to reconstruct what you aim to find out. Now, is that easy? Well, it’s not super-hard, but it does require some work, which is what this post tries to address, again with a practical approach.

 

Correlating handles

Let’s start from the first example. How can we figure out which handles/PeerConnections belong to the same person? In general, this sounds like a straightforward enough question. In fact, the Janus demos themselves present scenarios where a single user establishes multiple PeerConnections in the same context: a simple example is the VideoRoom demo, where the same user creates a PeerConnection to publish their contribution, and possibly more to subscribe to the other participants in the room. Intuitively speaking, then, we can assume that, in order to figure out which handles/PeerConnections belong to the same person, we simply need to look at the session ID: in fact, the VideoRoom demo (like the other demos, for that matter) creates a session for the user every time the page is loaded, and all handles are created within that specific session.

In general, this works, but unfortunately it’s not something you can safely rely upon. In fact, in many applications where the Janus API is wrapped on the server side by an application server, it’s not uncommon to create a single session that is shared by handles associated with different users. In that case, relying on the session does not provide the correlation that is needed, as all users would be associated with the same session. In order to overcome this, we recently introduced a new feature: an opaque identifier users/applications can provide when attaching a handle. If the user/application makes sure to set the same value for all the handles associated with a specific user, correlation can be done on that. In terms of the dumb and trivial database example we created for the previous blog post (you can find the updated version of both database and node.js application that covers this blog post here), it would just be a matter of modifying the handles table:

ALTER TABLE handles ADD COLUMN opaque VARCHAR(100);

and modifying the node.js code so that it stores the opaque ID found in the event too:

	} else if(json.type === 2) {
		// Handle event
		[..]
		var opaqueId = json["event"]["opaque_id"];
		// Write to DB
		var insert = { session: sessionId, handle: handleId, event: event, plugin: plugin, timestamp: when, opaque: opaqueId };
		[..]

Should opaque identifiers be available, you can use something like this to identify unique “user sessions”:

MariaDB [janusevents]> select distinct opaque from handles where opaque is not null;
+------------------+
| opaque           |
+------------------+
| XYEhmVTKMVqbnCBF |
| AABXYn3PgTTvkcHu |
+------------------+

Of course, this only works if opaque identifiers are provided when the handles are created, but if you have control over the application that is easy to ensure. Besides, since these identifiers are completely opaque as far as Janus is concerned, you can put whatever you want there, e.g., something that makes sense to your application in general, which might help even more for correlation purposes.
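As a minimal sketch of the same idea without a database, handle events could also be grouped in memory by their opaque ID. The groupHandlesByOpaqueId helper and the sample events are made up for illustration, and the sketch assumes that handle events carry session_id and handle_id at the top level, as the plugin events shown later in this post do:

```javascript
// Hypothetical helper: group handle events (type 2) by the opaque ID
// the application provided when attaching each handle.
function groupHandlesByOpaqueId(events) {
	const groups = new Map();
	for (const json of events) {
		if (json.type !== 2) continue;
		const opaqueId = json.event ? json.event.opaque_id : undefined;
		if (!opaqueId) continue;	// handle attached without an opaque ID
		if (!groups.has(opaqueId)) groups.set(opaqueId, new Set());
		groups.get(opaqueId).add(json.handle_id);
	}
	return groups;
}

// Sample events: all IDs and opaque values are illustrative only
const sampleHandleEvents = [
	{ type: 2, session_id: 1, handle_id: 11, event: { name: "attached", plugin: "janus.plugin.videoroom", opaque_id: "user-A" } },
	{ type: 2, session_id: 1, handle_id: 12, event: { name: "attached", plugin: "janus.plugin.videoroom", opaque_id: "user-A" } },
	{ type: 2, session_id: 1, handle_id: 13, event: { name: "attached", plugin: "janus.plugin.videoroom", opaque_id: "user-B" } },
	{ type: 2, session_id: 1, handle_id: 14, event: { name: "attached", plugin: "janus.plugin.echotest" } }
];
const handleGroups = groupHandlesByOpaqueId(sampleHandleEvents);
for (const [opaqueId, handles] of handleGroups)
	console.log(opaqueId, [...handles]);
```

Note how all four handles live in the same session here, which is exactly the application-server case where the session ID alone would not be enough.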

 

Associating transport information

Now that we’ve seen how we can correlate multiple handles and associate them with a user, we might want to know how the user is connected to Janus. For this, we can use the Transport events in Janus, another feature we only recently merged. Every time a session is created, Janus now adds some transport-related information, specifically by telling us which transport “instance” of the user originated the session in the first place: e.g., whether it was over HTTP, WebSockets, or something else, and which one in terms of the internals of the transport module.

To make a simple example, if an HTTP transport event looks like this:

   {
      "type": 128,
      "timestamp": 1484656548988936,
      "event": {
         "transport": "janus.transport.http",
         "id": "0x7f1fc00008c0",
         "data": {
            "event": "request",
            "admin_api": false,
            "ip": "::1",
            "port": 59294
         }
      }
   }

and then we get a “session created” event that looks like this:

   {
      "type": 1,
      "timestamp": 1484656548989422,
      "session_id": 4723806468072828,
      "event": {
         "name": "created",
         "transport": {
            "transport": "janus.transport.http",
            "id": "0x7f1fc00008c0"
         }
      }
   }

then we know that session 4723806468072828 was created via Janus API transported over the HTTP plugin, and the previously referenced 0x7f1fc00008c0 transport event gives us more information about that. Considering different transport plugins would provide different transport-specific information (e.g., for WebSockets we could be notified whenever a new WebSocket connection has been accepted or when it is closed), knowing how to interpret those events gives us more information (e.g., the source IP of the user).
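The lookup just described can be sketched in a few lines of node.js, using the two events above as input. The trackTransport function and the two maps are hypothetical names, and the sketch only remembers the latest data seen for each transport instance, which is an assumption that may be too simplistic for transports that send several events per connection:

```javascript
// Hypothetical in-memory index: "<transport>/<id>" -> latest transport data seen
const transportsSeen = new Map();
// Session ID -> transport information of whoever created the session
const sessionTransports = new Map();

function trackTransport(json) {
	if (json.type === 128) {
		// Transport event: remember what this transport instance told us
		const key = json.event.transport + "/" + json.event.id;
		transportsSeen.set(key, json.event.data);
	} else if (json.type === 1 && json.event.name === "created" && json.event.transport) {
		// Session created: look up the transport instance that originated it
		const key = json.event.transport.transport + "/" + json.event.transport.id;
		sessionTransports.set(json.session_id, {
			transport: json.event.transport.transport,
			data: transportsSeen.get(key) || null
		});
	}
}

// The two events shown above, in the order they were received
trackTransport({ type: 128, timestamp: 1484656548988936, event: { transport: "janus.transport.http", id: "0x7f1fc00008c0", data: { event: "request", admin_api: false, ip: "::1", port: 59294 } } });
trackTransport({ type: 1, timestamp: 1484656548989422, session_id: 4723806468072828, event: { name: "created", transport: { transport: "janus.transport.http", id: "0x7f1fc00008c0" } } });

console.log(sessionTransports.get(4723806468072828));
```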

A simple way to enrich our existing node.js/DB example is simply extending the sessions table to add a transportId column:

ALTER TABLE sessions ADD COLUMN transportId VARCHAR(100);

and again modify the node.js code to save the additional piece of information:

	if(json.type === 1) {
		// Session event
		[..]
		var transportId = null;
		if(json["event"]["transport"])
			transportId = json["event"]["transport"]["id"];
		// Write to DB
		var insert = { session: sessionId, event: event, transportId: transportId, timestamp: when };
		[..]

In order to map this transportId to previous transport events, we need to modify the transports table as well to save events associated to those. Since the table we created at the time assumed a syntax that is actually different from the one really used (there were no transport events at the time), we need to drop that and create a new one:

DROP TABLE transports;
CREATE TABLE transports (id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, transport VARCHAR(100) NOT NULL, transportId VARCHAR(100) NOT NULL, event VARCHAR(300) NOT NULL, timestamp datetime NOT NULL);

We omit the code for saving transport events to the database in our silly example for brevity: if you’re interested in seeing it, please refer to the attached code.

If we now try one of the demos again (e.g., the same EchoTest scenario we tried last time), we’ll find the database has been enriched with the additional information we needed. Assuming the session ID is 255404219017016:

MariaDB [janusevents]> select * from sessions where session=255404219017016;
+----+-----------------+---------+---------------------+----------------+
| id | session         | event   | timestamp           | transportId    |
+----+-----------------+---------+---------------------+----------------+
| 16 | 255404219017016 | created | 2017-01-20 17:00:10 | 0x60e00003ee80 |
+----+-----------------+---------+---------------------+----------------+

we find out it was created by transport ID 0x60e00003ee80, which in the associated table gives us this info:

MariaDB [janusevents]> select transportId,transport,event from transports where transportId='0x60e00003ee80';
+----------------+----------------------+---------------------------------------------------------------+
| transportId    | transport            | event                                                         |
+----------------+----------------------+---------------------------------------------------------------+
| 0x60e00003ee80 | janus.transport.http | {"event":"request","admin_api":false,"ip":"::1","port":37630} |
+----------------+----------------------+---------------------------------------------------------------+

At this point, you may be wondering: why associate the transport information with the session, if before we said that a session itself cannot ensure we’re talking about the same user? The answer is that here we’re talking about the Janus API: if users talk to Janus directly, it makes sense to assume a single session identifies a specific user, and that the transport somehow refers to them. Should an application wrap the Janus API on the server side, not only might a session lose that correlation power as explained before, but the transport information itself would be less meaningful, as basically all connections would come from the same place, no matter which user they refer to. In that case, the transport-related information should be tracked by the application itself, as it would be responsible for talking to users in its own way.

 

What about correlations with other users?

This is where things get more complicated. In fact, how your streams interact with other users and vice versa is something that Janus itself (as in, the Janus core) cannot tell you, as that falls within the application logic of the plugins. To make a simple example, Janus has no idea whether the video you’re sending goes back to you (Echo Test) or is sent to a single (VideoCall) or multiple (VideoRoom) participants: it only takes care of making sure your media gets to the plugin you’re attached to, and then whatever happens in the plugin is something the core is unaware of. The fact the media might be passed to a different handle and thus handled by the core again is something the core is unaware of: it just sees media flowing, and makes sure the user gets it.

This means that, in order to work out what the correlation with other users is, or more in general to have a better idea of the media topology of an application, you cannot do without the information plugins notify you about, which is necessarily plugin-specific. As such, there is no common or plugin-agnostic way to figure this out. Translated: you need to know the syntax of the events the plugins send you and their meaning, in order to understand what information you can extract from them for correlation purposes.

While unfortunate, this is an understandable requirement within Janus. In fact, its power resides mainly in the fact you can at any time implement new plugins to handle media in a completely novel way. This means that each plugin may have a completely different way of handling media, and the fact that Janus doesn’t need to know about that makes it easily adaptable and extensible in that regard.

That said, let’s look at a practical example. Let’s try to track, for instance, a specific conference implemented with the VideoRoom plugin. For the sake of simplicity, let’s assume we want to see who is in the default VideoRoom 1234, and who’s subscribed to who. As we said, there’s no way to figure that out by just looking at the events we already saw: looking at the opaque identifier or the session, we can say that two handles belong together, but they don’t tell us whether one of them is used to publish into a conference, or that they both belong to Bob. This is something that the plugin-specific events can tell us instead, as the VideoRoom will inform us when rooms are created, somebody starts publishing and subscriptions start, and so on.

To make a simple example, this is what an event about a user joining as a participant looks like:

{
   "type": 64,
   "timestamp": 1484849916135532,
   "session_id": 5110255302092189,
   "handle_id": 2467253471689504,
   "event": {
      "plugin": "janus.plugin.videoroom",
      "data": {
         "event": "joined",
         "room": 1234,
         "id": 8359209968968477,
         "private_id": 6762407635026307,
         "display": "ciccio"
      }
   }
}

In this example, the user called ciccio and with participant ID 8359209968968477 has successfully joined the videoroom 1234. This is application-level information we were missing: everything else (which handle is being used for that, the opaque ID to correlate it to other handles, the transport, etc.) we already know. Should an event like this arrive later on:

{
   "type": 64,
   "timestamp": 1484849916393761,
   "session_id": 5110255302092189,
   "handle_id": 2467253471689504,
   "event": {
      "plugin": "janus.plugin.videoroom",
      "data": {
         "event": "published",
         "room": 1234,
         "id": 8359209968968477
      }
   }
}

we’d now also know that ciccio started publishing media to the room, which means other participants may subscribe to his feed(s). We know this happens when we see an event like this appearing:

{
   "type": 64,
   "timestamp": 1484849934290342,
   "session_id": 5411590629247595,
   "handle_id": 5289018133575711,
   "event": {
      "plugin": "janus.plugin.videoroom",
      "data": {
         "event": "subscribing",
         "room": 1234,
         "feed": 8359209968968477,
         "private_id": 8674238615971311
      }
   }
}

as there’s somebody subscribing to feed 8359209968968477, which as we saw before is ciccio’s participant ID. However, we don’t know who just subscribed, as the event doesn’t tell us that. This is normal and expected in the VideoRoom plugin, as the publish/subscribe mechanism allows you to completely decouple the two in order to implement flexible application scenarios.

Should we be interested in knowing who, among the other participants, exactly subscribed to ciccio, all we need to do is combine correlation at the two different levels. In fact, we see the subscription comes from handle 5289018133575711. Looking at the opaque ID associated with that handle, we may find another handle that was used to join the room, pretty much as ciccio did before. For the sake of brevity, let’s assume we did find the right handle, which might have originated this join event:

{
   "type": 64,
   "timestamp": 1484849934215652,
   "session_id": 5411590629247595,
   "handle_id": 7692179108141921,
   "event": {
      "plugin": "janus.plugin.videoroom",
      "data": {
         "event": "joined",
         "room": 1234,
         "id": 2250550417606982,
         "private_id": 8674238615971311,
         "display": "pippo"
      }
   }
}

We just found out that the user who just subscribed to ciccio is the participant called pippo!
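To make the two levels more explicit, here is a rough node.js sketch of the reasoning we just did by hand. The handle and participant IDs come from the events above, while the opaque ID value ("app-user-42") and the names opaqueByHandle, joinedEvents and resolveSubscription are placeholders invented for this example, since the events above don’t show which opaque ID those handles actually had:

```javascript
// Hypothetical handle -> opaque ID map, as it would be filled from handle events.
// "app-user-42" is a placeholder value, not taken from a real capture.
const opaqueByHandle = new Map([
	[7692179108141921, "app-user-42"],	// handle pippo used to join
	[5289018133575711, "app-user-42"]	// handle used to subscribe
]);

// VideoRoom "joined" events seen so far, reduced to the fields we need
const joinedEvents = [
	{ handle_id: 2467253471689504, id: 8359209968968477, display: "ciccio" },
	{ handle_id: 7692179108141921, id: 2250550417606982, display: "pippo" }
];

// Given a "subscribing" event, figure out who subscribed and to whom
function resolveSubscription(subscribing) {
	const opaqueId = opaqueByHandle.get(subscribing.handle_id);
	// Level 1 (core): find a handle with the same opaque ID that joined the room
	const subscriber = joinedEvents.find(j =>
		opaqueId !== undefined && opaqueByHandle.get(j.handle_id) === opaqueId);
	// Level 2 (plugin): the feed is the participant ID of the publisher
	const publisher = joinedEvents.find(j => j.id === subscribing.feed);
	return {
		subscriber: subscriber ? subscriber.display : null,
		publisher: publisher ? publisher.display : null
	};
}

const resolved = resolveSubscription({ handle_id: 5289018133575711, feed: 8359209968968477 });
console.log(resolved.subscriber + " subscribed to " + resolved.publisher);
```

If the subscribing handle has no opaque ID at all, the sketch simply reports a null subscriber rather than guessing.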

A small note.

If you looked closely enough, you might have also noticed that the events associated with pippo both contained a property called private_id, which is set to the same value. This is a plugin-specific way of doing correlation based solely on the information the plugin provides: in fact, the VideoRoom allows you to pass that identifier in publishing and subscribing requests originated by the same participant, which basically means it acts exactly as the opaque_id does for handles, but at the plugin level. We could use that value for the correlation and get the same results (assuming the private ID info is correctly set).

In general, though, considering that not all plugins may provide correlation info on their own, the safest and easiest way to accomplish that is following the two-level correlation we briefly described before.
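For plugins that do expose such an identifier, the shortcut from the note above could look more or less like this in node.js. The trackPrivateId function and the participantByPrivateId map are hypothetical names, and the sketch assumes the subscriber actually passed its private_id when subscribing, as pippo did in the events shown earlier:

```javascript
// Map a VideoRoom private_id to the participant that got it when joining
const participantByPrivateId = new Map();

// Feed VideoRoom events in; returns the subscriber for "subscribing" events
// whose private_id we know, and null in every other case.
function trackPrivateId(json) {
	if (json.type !== 64 || json.event.plugin !== "janus.plugin.videoroom")
		return null;
	const data = json.event.data;
	if (data.event === "joined" && data.private_id) {
		participantByPrivateId.set(data.private_id, { id: data.id, display: data.display });
	} else if (data.event === "subscribing" && data.private_id) {
		return participantByPrivateId.get(data.private_id) || null;
	}
	return null;
}

// pippo joins, then a subscription with the same private_id shows up
trackPrivateId({ type: 64, timestamp: 1484849934215652, session_id: 5411590629247595, handle_id: 7692179108141921, event: { plugin: "janus.plugin.videoroom", data: { event: "joined", room: 1234, id: 2250550417606982, private_id: 8674238615971311, display: "pippo" } } });
const whoSubscribed = trackPrivateId({ type: 64, timestamp: 1484849934290342, session_id: 5411590629247595, handle_id: 5289018133575711, event: { plugin: "janus.plugin.videoroom", data: { event: "subscribing", room: 1234, feed: 8359209968968477, private_id: 8674238615971311 } } });
console.log(whoSubscribed ? whoSubscribed.display : "unknown subscriber");
```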

Now, let’s see how we can modify the silly example we created to handle events coming from the VideoRoom. We’ll definitely need a table that allows us to store information about participants, which will need to contain session/handle IDs, room and participant IDs, and display name. We’ll also need tables to track participants who start publishing and/or subscribing. Translated into SQL tables:

CREATE TABLE participants (id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, session BIGINT(30) NOT NULL, handle BIGINT(30) NOT NULL, roomid BIGINT(30) NOT NULL, userid BIGINT(30) NOT NULL, displayname VARCHAR(100), timestamp datetime NOT NULL);
CREATE TABLE publishers (id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, session BIGINT(30) NOT NULL, handle BIGINT(30) NOT NULL, roomid BIGINT(30) NOT NULL, userid BIGINT(30) NOT NULL, timestamp datetime NOT NULL);
CREATE TABLE subscriptions (id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, session BIGINT(30) NOT NULL, handle BIGINT(30) NOT NULL, roomid BIGINT(30) NOT NULL, feed BIGINT(30) NOT NULL, timestamp datetime NOT NULL);

Once that’s done, we can update the node.js code to track those events. We already have a branch that handles plugin-originated events:

	[..]
	} else if(json.type === 64) {
		// Plugin event
		[..]

What we need to do is add a further check that verifies whether or not it’s an event coming from the VideoRoom, and if so parse the ones we care about and store them in the new tables we just created:

		[..]
		// If this is a VideoRoom event, track participants, publishers and subscriptions
		if(plugin === "janus.plugin.videoroom") {
			[..]
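Just to give a rough idea of what such a branch might do, the following is a hypothetical sketch, and not the code from the attached archive: a videoRoomRow helper (a made-up name) that maps the three VideoRoom events we saw to rows for the three tables we just created. It deliberately stops short of the actual database insert, which depends on the client library you use:

```javascript
// Hypothetical helper: turn a VideoRoom event into { table, row } for the
// participants/publishers/subscriptions tables, or null for anything else.
function videoRoomRow(json) {
	if (json.type !== 64 || json.event.plugin !== "janus.plugin.videoroom")
		return null;
	const data = json.event.data;
	const base = {
		session: json.session_id,
		handle: json.handle_id,
		roomid: data.room,
		// Event timestamps are in microseconds, Date wants milliseconds
		timestamp: new Date(json.timestamp / 1000)
	};
	if (data.event === "joined")
		return { table: "participants", row: { ...base, userid: data.id, displayname: data.display } };
	if (data.event === "published")
		return { table: "publishers", row: { ...base, userid: data.id } };
	if (data.event === "subscribing")
		return { table: "subscriptions", row: { ...base, feed: data.feed } };
	return null;	// VideoRoom event we don't track in this example
}

// The "joined" event for ciccio shown earlier
const joinedRow = videoRoomRow({ type: 64, timestamp: 1484849916135532, session_id: 5110255302092189, handle_id: 2467253471689504, event: { plugin: "janus.plugin.videoroom", data: { event: "joined", room: 1234, id: 8359209968968477, private_id: 6762407635026307, display: "ciccio" } } });
console.log(joinedRow.table, joinedRow.row.displayname);
```

The returned table name and row could then be handed to whatever insert call the application already uses for the other tables.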

Again, we won’t delve into the specifics of how this is implemented: you can check the attached code to see the details and play with the events yourself. Let’s instead focus on the testing part, and try a simple VideoRoom scenario. We’ll assume two different users have joined the default VideoRoom demo, so that we can look at the database to see if we can indeed correlate the info as described before.

Let’s start by checking who the participants are:

MariaDB [janusevents]> select session,handle,userid,displayname from participants where roomid=1234;
+------------------+------------------+------------------+-------------+
| session          | handle           | userid           | displayname |
+------------------+------------------+------------------+-------------+
| 3897749713876117 | 5733258150533185 | 6260186926013924 | ciccio      |
|  963819663194400 | 2786219943281760 | 6261990734727767 | pippo       |
+------------------+------------------+------------------+-------------+

We see our two famous participants, ciccio and pippo, and the session/handle they used to join the room. If we want to know who’s publishing in the room, we can do that by combining the participants and publishers tables:

MariaDB [janusevents]> select displayname from participants u join publishers p where u.userid=p.userid;
+-------------+
| displayname |
+-------------+
| ciccio      |
| pippo       |
+-------------+

which basically tells us that both participants are contributing in the room. Now, let’s have a look at the subscriptions:

MariaDB [janusevents]> select session,handle,roomid,feed from subscriptions where roomid=1234;
+------------------+------------------+--------+------------------+
| session          | handle           | roomid | feed             |
+------------------+------------------+--------+------------------+
|  963819663194400 | 7852486861515809 |   1234 | 6260186926013924 |
| 3897749713876117 |  610753596127865 |   1234 | 6261990734727767 |
+------------------+------------------+--------+------------------+

We see that there are two subscriptions, one for feed 6260186926013924 and another for feed 6261990734727767. A quick matching of the feeds with the participants list tells us that the subscriptions are (big surprise!) for ciccio and pippo respectively:

MariaDB [janusevents]> select userid,displayname from participants u join subscriptions s where u.userid=s.feed;
+------------------+-------------+
| userid           | displayname |
+------------------+-------------+
| 6260186926013924 | ciccio      |
| 6261990734727767 | pippo       |
+------------------+-------------+

Anyway, as we discussed before, this doesn’t tell us who is subscribed to those feeds. We obviously know intuitively that one is subscribed to the other, but let’s confirm that with the actual data, starting with who’s subscribed to feed 6260186926013924, that is ciccio’s ID:

MariaDB [janusevents]> select session,handle from subscriptions where roomid=1234 and feed=6260186926013924;
+-----------------+------------------+
| session         | handle           |
+-----------------+------------------+
| 963819663194400 | 7852486861515809 |
+-----------------+------------------+

This is the session/handle pair that was used to subscribe. If we look at the opaque ID associated with that:

MariaDB [janusevents]> select session,handle,opaque from handles where session=963819663194400 and handle=7852486861515809 and opaque is not null;
+-----------------+------------------+----------------------------+
| session         | handle           | opaque                     |
+-----------------+------------------+----------------------------+
| 963819663194400 | 7852486861515809 | videoroomtest-Xf1xocSkgkzg |
+-----------------+------------------+----------------------------+

we can use it to find all handles that share it:

MariaDB [janusevents]> select session,handle from handles where opaque='videoroomtest-Xf1xocSkgkzg';
+-----------------+------------------+
| session         | handle           |
+-----------------+------------------+
| 963819663194400 | 2786219943281760 |
| 963819663194400 | 7852486861515809 |
+-----------------+------------------+

where we find out about a different handle, 2786219943281760. If we search the participants and/or publishers tables for that handle, this is what we find:

MariaDB [janusevents]> select userid,displayname from participants where session=963819663194400 and handle=2786219943281760;
+------------------+-------------+
| userid           | displayname |
+------------------+-------------+
| 6261990734727767 | pippo       |
+------------------+-------------+

which confirms our initial assumptions: it’s indeed pippo that subscribed to ciccio’s streams using the provided handle. Doing the same thing on the other subscription (which I won’t do for brevity and leave to you as an exercise) we’ll find out about the subscription the other way around.

Now, I’m sure there are more optimized ways of correlating the info in SQL than what I did in this dumb example in so many steps, but you get the gist: correlation is possible, if you know how to process those events and how they relate to one another.

 

To conclude…

If you got here, congratulations! That was a lot to type, and quite the same to read/digest as well, I’m sure…

Just as I explained in the last blog post, this is a very simple and dumb example of how you can process and correlate events. We again made use of a very basic node.js application that saves to a database that we can then query, but way more exciting things could be done with these events for effective real-time monitoring and troubleshooting of complex applications. If you want to play with events and follow this demo with a more practical approach, the code is available here.

I hope this introduction and tutorial at least helped you better grasp the concept we tried to introduce in Janus with the Event Handlers, and that you’ll start playing with them more and more in the near future.

As usual, feedback is most welcome!

Update from the 24th of January: I updated the contents of the demo archive, as I was made aware the SQL scripts were missing a couple of tables. If you downloaded the archive before that and wanted to play with the examples, make sure you get the new one!
