Today I'll show you how to implement a video call app in less than an hour using WebRTC. If you just want to see the code, here's the repo.
What's WebRTC?
WebRTC stands for Web Real-Time Communication. It is a set of APIs provided by the browser that allow you to create a peer-to-peer (P2P) connection: a direct connection between two devices, which means that your private data doesn't have to go through a third-party server.
What do I need?
In order to follow this lesson, you'll need some programming experience.
- For the front-end, some familiarity with JavaScript and HTML is required.
- For the server, you'll benefit from any experience using express and socket.io.
Set up the initial project structure
To save us some time, I've created a repo with an express + socket.io server, the HTML for our Zoom clone and some base styles.
Go ahead and clone it:
git clone -b initial git@github.com:ngonzalvez/aloha.git
and install the dependencies:
Here's what you'll find after cloning the repo:
- public/main.js: Front-end logic.
- public/styles.css: CSS styles for the app.
- views/room.ejs: The template for the main UI.
- server.ts: The server code.
Let's define some routes
For our web app to work, we need to add two routes to server.ts:
- / simply redirects you to a new room.
- /room/:roomId renders the meeting room.
To define these routes, add the following code to server.ts:
import { v4 as uuidv4 } from "uuid";

app.get("/", (_, res) => {
  // Redirect the user to a new room.
  res.redirect(`/room/${uuidv4()}`);
});

app.get("/room/:roomId", (req, res) =>
  // Render the room template.
  res.render("room", { roomId: req.params.roomId })
);
If you run npm start and navigate to localhost:3000, you should be redirected to a new room. Use the room link as an invitation link to our meeting.
The User Interface
If you check views/room.ejs, you'll see that the UI couldn't be simpler. It looks something like this:
<video id="active-participant" autoplay></video>
<div id="participants"></div>
Most of the UI is a big video player that shows the active participant. Set the active video stream using the setActiveParticipantStream(stream) function provided in public/main.js.
There's also a small container at the bottom. When a new participant joins the meeting, we will add their video stream to the participants container using the addParticipantStream(stream, peerId) function. A participant can be set as the active participant just by clicking on their video stream.
Lights, camera, action!
Before we implement the P2P connection, we need to get access to our camera and microphone.
When the DOM finishes loading, the init() function is called. In it, we will need to ask for permission to access the camera stream, show it in the main video player and add the video stream to the participants container.
The code looks something like this:
let localStream = null;

async function init() {
  // Get access to our local video stream.
  localStream = await navigator.mediaDevices.getUserMedia({
    video: true,
    audio: true,
  });
  // Show it in the main player.
  setActiveParticipantStream(localStream);
  // Add the stream to the `participants` container.
  addParticipantStream(localStream, "me");
}

window.onload = init;
Note that we store the local video stream in a global variable, so that it is accessible from other functions as well.
Go ahead and try it. Refresh the page and you should be able to see yourself.
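If you can't see yourself, the most likely reason is that getUserMedia() rejected its promise, for example because permission was denied. Here's a minimal sketch of how that case could be handled. The describeMediaError and getLocalStream helpers and the message wording are my own illustrative names, not part of the repo; the error names are the ones browsers use when rejecting getUserMedia().

```javascript
// Map the error names getUserMedia() can reject with to user-facing text.
// The message wording is illustrative only.
function describeMediaError(error) {
  switch (error && error.name) {
    case "NotAllowedError":
      return "Permission to use the camera or microphone was denied.";
    case "NotFoundError":
      return "No camera or microphone was found on this device.";
    case "NotReadableError":
      return "The camera or microphone could not be read (it may be in use by another application).";
    default:
      return "Could not access the camera or microphone.";
  }
}

// Hypothetical wrapper around the call made in init(). `mediaDevices` is
// passed in (navigator.mediaDevices in the browser) so the function can
// also be exercised with a fake object.
async function getLocalStream(mediaDevices) {
  try {
    const stream = await mediaDevices.getUserMedia({ video: true, audio: true });
    return { stream, error: null };
  } catch (error) {
    return { stream: null, error: describeMediaError(error) };
  }
}
```

In init() you could call getLocalStream(navigator.mediaDevices) and show the returned error text instead of leaving the player empty.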
How does WebRTC work?
Creating P2P connections with WebRTC is quite straightforward. One peer creates a connection offer and sends it, the other peer replies with an answer, they exchange ICE candidates (instructions on how to get from one peer to the other through the internet), and voilà, WebRTC does the rest.
There's just one issue: we need a way for the peers to send the offer, the answer and the ICE candidates to each other before the direct connection exists. That's where the signaling server comes into play. The signaling server is simply a server that both peers connect to and that relays messages (signals) between them.
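To make the message flow concrete, here's a toy simulation of the handshake that runs without a browser or a network. ToySignalingServer and the string payloads are purely illustrative stand-ins; the real offers, answers and candidates are objects produced by WebRTC.

```javascript
// A toy in-memory "signaling server": it only forwards messages between
// registered peers and never inspects them. All names here are illustrative.
class ToySignalingServer {
  constructor() {
    this.handlers = new Map(); // peerId -> callback(fromId, message)
  }
  register(peerId, onSignal) {
    this.handlers.set(peerId, onSignal);
  }
  relay(fromId, toId, message) {
    const handler = this.handlers.get(toId);
    if (handler) handler(fromId, message);
  }
}

const log = [];
const server = new ToySignalingServer();

// Peer B answers every offer it receives.
server.register("B", (fromId, message) => {
  log.push(`B received ${Object.keys(message)[0]} from ${fromId}`);
  if (message.offer) server.relay("B", fromId, { answer: "answer-from-B" });
});

// Peer A just records what it receives.
server.register("A", (fromId, message) => {
  log.push(`A received ${Object.keys(message)[0]} from ${fromId}`);
});

// A starts the handshake, then both sides trade ICE candidates.
server.relay("A", "B", { offer: "offer-from-A" });
server.relay("A", "B", { candidate: "candidate-from-A" });
server.relay("B", "A", { candidate: "candidate-from-B" });

console.log(log.join("\n"));
```

Note that the server never needs to understand the messages. It only needs to know who they are for.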
Signaling Server
We'll implement the signaling server using WebSocket. If you go to server.ts, you'll find that I've already included an empty WebSocket server. Now, let's add some logic to it.
Joining the room
When a user wants to join a room, they will send a JoinRoom message to the signaling server, including the roomId. The signaling server will then notify everyone in the room about the new user with a UserJoined message that includes the new user's peerId, and finally add the user to the room.
socket.on("JoinRoom", (roomId: string) => {
  // Notify everyone in the room.
  socket.to(roomId).emit("UserJoined", socket.id);
  // Add the user to the room.
  socket.join(roomId);
});
Leaving the room
Let's also notify the room when a user disconnects. We need to know the roomId for that, so make sure to add this handler inside the JoinRoom handler.
socket.on("disconnect", () => {
  socket.to(roomId).emit("UserLeft", socket.id);
});
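If it helps to see who gets notified and when, here's a small model of the room bookkeeping in plain JavaScript. socket.io does this for us, so this is only a sketch for reasoning about the behavior; joinRoom, leaveRoom and the rooms map are hypothetical names.

```javascript
// rooms: roomId -> Set of socket IDs currently in that room.
function joinRoom(rooms, roomId, socketId) {
  const members = rooms.get(roomId) ?? new Set();
  // Notify the existing members first, then add the newcomer,
  // mirroring the order used in the JoinRoom handler.
  const notified = [...members];
  members.add(socketId);
  rooms.set(roomId, members);
  return notified; // who would receive "UserJoined"
}

function leaveRoom(rooms, roomId, socketId) {
  const members = rooms.get(roomId);
  if (!members || !members.delete(socketId)) return [];
  // Drop empty rooms so the map doesn't grow forever.
  if (members.size === 0) rooms.delete(roomId);
  return [...members]; // who would receive "UserLeft"
}

const rooms = new Map();
const firstJoin = joinRoom(rooms, "room-1", "alice");
const secondJoin = joinRoom(rooms, "room-1", "bob");
const afterLeave = leaveRoom(rooms, "room-1", "alice");
console.log(firstJoin, secondJoin, afterLeave);
```

The first user to join notifies nobody, which is why only the users already in the room end up creating offers.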
Sending signals
Every time a peer wants to send an offer, answer or ICE candidate to the other peer, it will send a Signal message including the destination peerId and the message itself. The signaling server will then relay it by sending a Signal message to the destination peer, including the peerId of the sender.
socket.on("Signal", (peerId: string, message: any) => {
  // Relay the message to the other peer.
  socket.to(peerId).emit("Signal", socket.id, message);
});
And that's all there is to the signaling server.
Creating a WebRTC connection
In main.js we are going to define a createPeerConnection function that will encapsulate all the logic behind creating a P2P connection. It will receive up to three parameters:
- peerId: the ID of the peer it is connecting to.
- type: one of offer or answer.
- remoteOffer: the offer received from the other peer, only used when creating an answer.
Create the RTC connection
The first thing it will do is create an RTCPeerConnection instance. We need to specify which ICE servers (the ones helping us find a connection path between the peers) we are going to use. In this case, we are just going to use one of Google's public STUN servers.
// peerId ➡ connection map.
const peers = {};

async function createPeerConnection(peerId, type, remoteOffer) {
  const conn = (peers[peerId] = new RTCPeerConnection({
    iceServers: [{ urls: "stun:stun.l.google.com:19302" }],
  }));
  // Setup the connection here...
}
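A STUN server is enough for many networks, but some NATs and firewalls block direct connections altogether. In that case a TURN server is usually added as a fallback, and the media is then relayed through it instead of flowing directly between the peers. Here's a sketch of how the configuration could be built. buildRtcConfig is a hypothetical helper, and the TURN URL and credentials are placeholders, not a real service.

```javascript
// Build the configuration object passed to `new RTCPeerConnection(config)`.
// `turn` is optional; the URL and credentials used below are placeholders.
function buildRtcConfig(turn) {
  const iceServers = [{ urls: "stun:stun.l.google.com:19302" }];
  if (turn) {
    iceServers.push({
      urls: turn.urls,
      username: turn.username,
      credential: turn.credential,
    });
  }
  return { iceServers };
}

const config = buildRtcConfig({
  urls: "turn:turn.example.com:3478", // hypothetical TURN server
  username: "demo-user",
  credential: "demo-pass",
});
console.log(JSON.stringify(config, null, 2));
```

In the browser, the result would be passed straight to the constructor: new RTCPeerConnection(config).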
Now that we have a connection instance, I'll walk you through the steps to set it up.
Send ICE candidates
When WebRTC discovers possible ICE candidates (paths from the internet to the peer), it needs to send them to the other peer, so that they are able to connect.
conn.onicecandidate = ({ candidate }) => {
  if (candidate) {
    // Send it to the other peer.
    socket.emit("Signal", peerId, { candidate });
  }
};
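If you're curious what a candidate actually contains, its candidate property is a single line of text. Here's a small sketch that pulls out the main fields. parseCandidate is a hypothetical helper for illustration, and the sample line uses made-up documentation addresses.

```javascript
// Parse the text of an ICE candidate line:
// candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
function parseCandidate(candidateLine) {
  const parts = candidateLine.replace(/^candidate:/, "").split(" ");
  const [foundation, component, transport, priority, address, port] = parts;
  const typIndex = parts.indexOf("typ");
  return {
    foundation,
    component: Number(component),
    transport: transport.toLowerCase(),
    priority: Number(priority),
    address,
    port: Number(port),
    // host = local address, srflx = public address found via STUN,
    // relay = address on a TURN server.
    type: typIndex === -1 ? null : parts[typIndex + 1],
  };
}

const sample =
  "candidate:842163049 1 udp 1677729535 203.0.113.7 46154 typ srflx raddr 192.168.1.20 rport 46154";
const parsed = parseCandidate(sample);
console.log(parsed.type, parsed.address, parsed.port);
```

Logging these fields inside onicecandidate is a handy way to debug connections that never get established.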
Handle incoming audio/video tracks
When we receive media tracks, we need to create a MediaStream and attach the tracks to it. Once we have the stream, we can display it in the UI using addParticipantStream(stream, peerId).
// Create a multimedia stream.
const remoteStream = new MediaStream();

conn.ontrack = (event) => {
  // Get the stream from the `ontrack` event.
  const [stream] = event.streams;
  // Add the received audio/video track to the stream.
  stream.getTracks().forEach((track) => {
    remoteStream.addTrack(track);
  });
};

// Display the remote stream in the UI.
addParticipantStream(remoteStream, peerId);
Send the local camera stream
In order for the other peer to receive our camera stream, we need to attach its audio and video tracks to the connection.
// Attach the media tracks from the camera stream
// to the connection.
localStream.getTracks().forEach((track) => {
  conn.addTrack(track, localStream);
});
Creating an offer/answer
If this peer is the one initiating the connection, we need to create an offer; otherwise, we'll create an answer. Remember that the function signature includes a type parameter indicating whether it is an offer or an answer.
// If there is an offer from the remote peer, set it in the connection
// and wait for it to be applied before creating the answer.
if (remoteOffer) await conn.setRemoteDescription(remoteOffer);
// Create the offer/answer.
const localDescription = type === "offer"
  ? await conn.createOffer()
  : await conn.createAnswer();
// Set the local offer/answer in the connection.
await conn.setLocalDescription(localDescription);
// Send it to the other peer.
socket.emit("Signal", peerId, { [type]: localDescription });
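The order of these calls matters because the connection keeps track of where it is in the handshake through its signalingState. Here's a simplified model of those transitions, leaving out provisional answers and rollbacks. The state names match the ones the browser reports, while nextSignalingState and the TRANSITIONS table are only an illustration.

```javascript
// Simplified offer/answer state machine (no pranswer, no rollback).
const TRANSITIONS = {
  "stable": {
    "local:offer": "have-local-offer",
    "remote:offer": "have-remote-offer",
  },
  "have-local-offer": { "remote:answer": "stable" },
  "have-remote-offer": { "local:answer": "stable" },
};

// side: "local" for setLocalDescription, "remote" for setRemoteDescription.
function nextSignalingState(state, side, type) {
  const next = TRANSITIONS[state]?.[`${side}:${type}`];
  if (!next) {
    throw new Error(`Cannot apply ${side} ${type} in state "${state}"`);
  }
  return next;
}

// The peer that sends the offer.
let caller = "stable";
caller = nextSignalingState(caller, "local", "offer");
caller = nextSignalingState(caller, "remote", "answer");

// The peer that receives the offer.
let callee = "stable";
callee = nextSignalingState(callee, "remote", "offer");
callee = nextSignalingState(callee, "local", "answer");

console.log(caller, callee);
```

Both peers end up back in the stable state, and an answer can only be applied after an offer has been set on that same connection.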
Now you have completed the implementation of the createPeerConnection function.
Connecting the peers
The final step is to write the code to connect the peers using the signaling server we just implemented. The flow is as follows.
Joining a room
Once the app has loaded, we'll send the JoinRoom message including the roomId. Simply add the following code to the init() function:
socket.emit("JoinRoom", roomId);
The signaling server will then send a UserJoined message to everyone in the same room, including this peer's ID.
When a user joins
If a user joins, we need to send them a connection offer using createPeerConnection(peerId, 'offer').
socket.on("UserJoined", async (peerId) => {
  await createPeerConnection(peerId, "offer");
});
When a Signal is received
We need to reply to every Signal containing a connection offer with another Signal containing an answer.
socket.on("Signal", async (peerId, message) => {
  if (message.offer) {
    await createPeerConnection(peerId, "answer", message.offer);
  }
});
But a signal may also contain an answer or an ICE candidate; in that case, we need to set it in the connection. So, add the following code to the Signal handler:
if (message.answer) {
  await peers[peerId].setRemoteDescription(message.answer);
}
if (message.candidate) {
  await peers[peerId].addIceCandidate(message.candidate);
}
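Since every signal carries exactly one of three payloads, the handler can also be written as a single dispatch. Here's one possible sketch. signalKind and handleSignal are hypothetical helpers, and the peers map and createPeerConnection function are passed in as arguments so the logic can be tried outside the browser.

```javascript
// Work out which payload a signal carries.
function signalKind(message) {
  if (!message) return "unknown";
  if (message.offer) return "offer";
  if (message.answer) return "answer";
  if (message.candidate) return "candidate";
  return "unknown";
}

// Dispatch a signal to the right WebRTC call. Returns what was handled.
async function handleSignal(peers, peerId, message, createPeerConnection) {
  const kind = signalKind(message);
  if (kind === "offer") {
    await createPeerConnection(peerId, "answer", message.offer);
    return kind;
  }
  const conn = peers[peerId];
  // Ignore signals from peers we have no connection for.
  if (!conn || kind === "unknown") return "ignored";
  if (kind === "answer") await conn.setRemoteDescription(message.answer);
  if (kind === "candidate") await conn.addIceCandidate(message.candidate);
  return kind;
}
```

With this in place, the handler body could shrink to one call: socket.on("Signal", (peerId, message) => handleSignal(peers, peerId, message, createPeerConnection)).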
When a user leaves
Finally, we need to handle the case when a user leaves. We simply close the connection, delete the connection from the map and remove the video stream from the UI.
socket.on("UserLeft", (peerId) => {
  // Close the connection (if we ever created one for this peer).
  peers[peerId]?.close();
  delete peers[peerId];
  // Remove the video stream from the UI.
  document.querySelector(`[data-peer-id="${peerId}"]`)?.remove();
});
And that's it. We have implemented our very own Zoom clone using WebRTC.