WebRTC Fundamentals
WebRTC (Web Real-Time Communication) enables peer-to-peer audio, video, and data transfer directly between browsers without a central media server. Three core APIs: MediaStream (camera/microphone), RTCPeerConnection (P2P connection), RTCDataChannel (arbitrary data). The challenge: browsers cannot connect directly due to NAT and firewalls. Solution: ICE (Interactive Connectivity Establishment) with STUN/TURN servers.
Signaling
WebRTC deliberately leaves signaling undefined; you implement it yourself, typically over WebSocket (long polling also works). Signaling exchanges two kinds of messages: SDP offers/answers (codec negotiation, media direction) and ICE candidates (network addresses). Typical flow: the caller creates an SDP offer and sends it via the signaling server, which relays it to the callee; the callee creates an SDP answer and sends it back; both sides exchange ICE candidates as they are discovered (trickle ICE). The signaling server is needed only during call setup and is stateless relative to media: it just relays messages, and once the peers connect, media flows P2P without it.
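The relay role of the signaling server can be sketched in a few lines. This is an illustrative sketch only; the function names, the socket shape, and the message format are assumptions, and a real server would use an actual WebSocket library and handle disconnects.

```javascript
// Minimal signaling-relay sketch (illustrative names, not a real API).
// Each call has a set of connected peers; the server just forwards
// offer/answer/candidate messages to everyone else in the call
// without ever inspecting the media itself.
const calls = new Map(); // callId -> Map(peerId -> socket-like {send})

function join(callId, peerId, socket) {
  if (!calls.has(callId)) calls.set(callId, new Map());
  calls.get(callId).set(peerId, socket);
}

function relaySignal(callId, fromPeerId, message) {
  // message is an SDP offer/answer or an ICE candidate; the server
  // is stateless relative to media -- it only relays.
  const peers = calls.get(callId) || new Map();
  for (const [peerId, socket] of peers) {
    if (peerId !== fromPeerId) {
      socket.send(JSON.stringify({ from: fromPeerId, ...message }));
    }
  }
}
```

The same relay handles offers, answers, and trickled ICE candidates; it never needs to understand SDP, which is why the signaling tier stays simple and cheap compared to the media tier.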
NAT Traversal: STUN and TURN
STUN (Session Traversal Utilities for NAT): the client queries a STUN server to discover its public IP:port; the server replies with the source address it observed. This works for most NAT types (full cone, address-restricted). TURN (Traversal Using Relays around NAT): when a direct path fails (symmetric NAT, corporate firewalls), TURN relays all media through a server, so server bandwidth scales with total media traffic, making it expensive. Only roughly 15-20% of calls need TURN; the rest succeed with STUN or a direct connection. ICE gathers all candidate types (host, server-reflexive from STUN, relayed from TURN) and tries them in priority order: direct > STUN-discovered > TURN-relayed.
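The priority ordering ICE applies can be illustrated with the type-preference term alone. The preference values below (host 126, server-reflexive 100, relay 0) follow the recommendations in RFC 8445, but real ICE computes a fuller priority formula; this sketch only shows the ranking that term produces.

```javascript
// Sketch of ICE candidate ordering by type preference:
// host (direct) > srflx (STUN-discovered) > relay (TURN).
// Values follow RFC 8445's recommended type preferences; real ICE
// combines them with component ID and local preference.
const TYPE_PREFERENCE = { host: 126, srflx: 100, relay: 0 };

function sortCandidates(candidates) {
  // Highest type preference first; ICE then tests pairs in this order
  // and selects the best working path.
  return [...candidates].sort(
    (a, b) => TYPE_PREFERENCE[b.type] - TYPE_PREFERENCE[a.type]
  );
}
```

Sorting this way means TURN relays are only used when every cheaper candidate pair fails connectivity checks.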
Topology: Mesh vs SFU vs MCU
Mesh (P2P): each participant connects to every other, so N participants need N*(N-1)/2 peer connections and each client uploads N-1 copies of its stream. Works for 2-3 people; client upload bandwidth grows linearly with participant count. At 6 people each client uploads 5 streams, so mesh is impractical beyond 4-5 participants. SFU (Selective Forwarding Unit): clients connect to a central server; each client uploads once, and the SFU forwards the right streams to each subscriber. The SFU does not decode or re-encode; it forwards RTP packets, so server CPU stays low. Supports simulcast: clients upload multiple quality levels (360p, 720p, 1080p) and the SFU selects the appropriate quality per subscriber. This is the industry standard: Zoom, Google Meet, and Discord all use SFU-style architectures. MCU (Multipoint Control Unit): the server decodes all streams, composites them into one video grid, re-encodes, and sends a single stream to each client. Lowest client download bandwidth, highest server CPU; used for recording and very low-bandwidth scenarios.
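The bandwidth arithmetic behind the mesh-vs-SFU decision is simple enough to write down directly. A back-of-envelope sketch:

```javascript
// Back-of-envelope comparison of mesh vs SFU for n participants.

function meshConnections(n) {
  // One peer connection per pair of participants.
  return (n * (n - 1)) / 2;
}

function uploadsPerClient(topology, n) {
  // mesh: send a copy of your stream to every other peer;
  // sfu/mcu: send a single stream to the server, which distributes it.
  return topology === "mesh" ? n - 1 : 1;
}
```

At 6 participants, mesh needs 15 peer connections and 5 upload streams per client, while an SFU needs one upload per client regardless of call size; that constant upload cost is why SFU is the default answer for group calls.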
SFU Architecture for Scale
Each SFU server handles ~1000 concurrent participants. For a large call (1000+ participants): use a cascade of SFU servers. Edge SFUs receive streams from participants and forward to a root SFU. Root SFU forwards to other edge SFUs. Sharding: route calls to SFU servers by call_id (consistent hashing). Geographic distribution: place SFU servers in regions close to users to minimize latency (target under 100ms RTT). Auto-scale SFU clusters by concurrent participant count. The signaling server assigns participants to an SFU server and handles SFU failover.
Quality Adaptation
WebRTC uses RTCP feedback for congestion control (REMB, Transport-CC). When bandwidth decreases, the sender reduces resolution (720p → 360p) or frame rate (30fps → 15fps). Simulcast: the sender uploads 3 quality layers simultaneously, and the SFU switches which layer it forwards without the sender changing anything; switching is fast because no sender-side re-encoding is needed (the SFU waits for or requests a keyframe on the target layer). On the receiving side, WebRTC's built-in jitter buffer handles packet reordering; NACK requests retransmission of lost packets; FEC (Forward Error Correction) recovers from loss without retransmission, which suits real-time audio where a retransmission would arrive too late.
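The SFU-side layer-selection step can be sketched as picking the highest simulcast layer whose bitrate fits the subscriber's estimated bandwidth. The layer bitrates below are illustrative ballpark figures, not values from any particular SFU.

```javascript
// Sketch of SFU simulcast layer selection: forward the highest
// quality layer that fits the subscriber's estimated bandwidth
// (from REMB / Transport-CC feedback). Bitrates are illustrative.
const LAYERS = [
  { name: "1080p", kbps: 2500 },
  { name: "720p", kbps: 1200 },
  { name: "360p", kbps: 400 },
];

function selectLayer(availableKbps, layers = LAYERS) {
  // Layers are ordered highest-first; fall back to the lowest layer
  // rather than dropping video entirely.
  return layers.find((l) => l.kbps <= availableKbps) || layers[layers.length - 1];
}
```

Because the sender is already uploading every layer, this decision is purely local to the SFU and can change per subscriber on every bandwidth estimate.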
Interview Tips
- Always mention TURN fallback: most candidates forget that 15-20% of calls need relay.
- SFU is the right answer for group calls; mention simulcast for quality adaptation.
- Signaling is a separate concern from media — keep them architecturally separate.
- Recording: tap the SFU, decode and mux to file, store in object storage.