What Is Collaborative Document Editing?
Real-time collaborative editing allows multiple users to simultaneously edit the same document, with changes from each user appearing on all other users’ screens within milliseconds. Google Docs, Notion, and Figma are examples. The core engineering challenge: how do you reconcile concurrent edits that conflict with each other without losing any user’s work?
The Concurrency Problem
User A types “Hello” at position 0. Simultaneously, User B types “World” at position 0. When A’s edit arrives at B’s client (and vice versa), applying it naively at its original position leaves the replicas out of sync: A ends up with “WorldHello” and B with “HelloWorld”. Two solutions dominate: Operational Transformation (OT) and Conflict-free Replicated Data Types (CRDTs).
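To make the problem concrete, here is a minimal sketch of two replicas that apply their own edit first and then the remote edit exactly as received. The `insert` helper is a hypothetical stand-in for an editor's text buffer, not part of any real library.

```python
def insert(doc: str, pos: int, text: str) -> str:
    """Insert text at a character offset, with no awareness of other edits."""
    return doc[:pos] + text + doc[pos:]


# Each replica applies its local edit, then the remote edit unchanged.
replica_a = insert(insert("", 0, "Hello"), 0, "World")
replica_b = insert(insert("", 0, "World"), 0, "Hello")

print(replica_a)  # WorldHello
print(replica_b)  # HelloWorld
```

Both users' text survives, but the two replicas now disagree about the document, which is the situation OT and CRDTs are designed to prevent.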
Operational Transformation (OT)
Every edit is expressed as an operation: INSERT(pos, char) or DELETE(pos). When two concurrent operations arrive, they must be transformed against each other to account for the offset changes the other operation introduces.
Example:
Doc: "AC"
Op1 (A): INSERT(1, "B") → "ABC"
Op2 (B): INSERT(1, "X") → "AXC"
At server: receive Op1 first. Transform Op2 against Op1:
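Op2 is at position 1; Op1 inserted at position 1 (same or before), so Op2 shifts to position 2. The tie at the same position must be broken by a fixed rule (here, the op the server received first keeps its position); if both ops shifted, the clients would diverge.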
Op2' = INSERT(2, "X")
Apply Op1 then Op2': "AC" → "ABC" → "ABXC" ✓
Both clients converge to "ABXC".
In practice, OT is deployed with a central server that serializes and transforms concurrent operations; fully decentralized OT exists but is much harder to get right. Complexity grows with the number of concurrent users and with richer operations (multi-character inserts, undo). Google Docs uses OT with a central server.
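The transformation step can be sketched for single-character operations as follows. This is an illustrative sketch, not the algorithm of any particular product: the `Insert`/`Delete` classes, the `site` field, and the "lower site id wins ties" rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Insert:
    pos: int
    char: str
    site: str  # hypothetical replica id, used only to break position ties


@dataclass(frozen=True)
class Delete:
    pos: int


Op = Union[Insert, Delete]


def apply(doc: str, op: Optional[Op]) -> str:
    """Apply one operation; None means the op was transformed into a no-op."""
    if op is None:
        return doc
    if isinstance(op, Insert):
        return doc[:op.pos] + op.char + doc[op.pos:]
    return doc[:op.pos] + doc[op.pos + 1:]


def transform(op: Op, other: Op) -> Optional[Op]:
    """Rewrite `op` so it can be applied after the concurrent `other`."""
    if isinstance(other, Insert):
        if isinstance(op, Insert):
            # Ties at the same position: the lower site id keeps its place.
            shift = other.pos < op.pos or (
                other.pos == op.pos and other.site < op.site
            )
            return Insert(op.pos + 1, op.char, op.site) if shift else op
        # An insert at or before the deleted character pushes it right.
        return Delete(op.pos + 1) if other.pos <= op.pos else op
    if isinstance(op, Insert):
        return Insert(op.pos - 1, op.char, op.site) if other.pos < op.pos else op
    if other.pos < op.pos:
        return Delete(op.pos - 1)
    if other.pos == op.pos:
        return None  # both sides deleted the same character
    return op


# The worked example above: both application orders reach the same text.
op1 = Insert(1, "B", site="A")
op2 = Insert(1, "X", site="B")
server_view = apply(apply("AC", op1), transform(op2, op1))
client_b_view = apply(apply("AC", op2), transform(op1, op2))
print(server_view, client_b_view)  # ABXC ABXC
```

The tie-break is what keeps the two sides symmetric: exactly one of the two same-position inserts shifts, whichever order they are applied in.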
CRDTs (Conflict-free Replicated Data Types)
CRDTs encode ordering information within the operations themselves, eliminating the need for transformation. Each character gets a unique identifier (a logical timestamp or fractional index) that determines its position regardless of the order in which operations arrive. Deletions mark characters as “tombstoned” rather than removing them. Applying the same set of operations in any order converges to the same result, so no central server is required for convergence. Used by: peer-to-peer editors and libraries such as Yjs and Automerge. Figma describes its multiplayer system as CRDT-inspired rather than a pure CRDT, and it still coordinates through a server. Trade-off: CRDTs accumulate tombstones and metadata, increasing document size over time (requires periodic compaction).
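A toy sequence CRDT illustrates the idea of unique, ordered identifiers plus tombstones. This is a deliberately simplified sketch: the `SeqCRDT` class and its `(fraction, site, counter)` identifiers are assumptions for illustration, and production designs use more careful identifier schemes to avoid unbounded growth and interleaving problems.

```python
from fractions import Fraction
from itertools import count


class SeqCRDT:
    """Toy replicated text: characters are ordered by their unique ids."""

    def __init__(self, site: str):
        self.site = site
        self._counter = count()   # makes ids from the same site unique
        self.chars = {}           # id -> character
        self.tombstones = set()   # ids that have been deleted

    def _visible_ids(self):
        return sorted(i for i in self.chars if i not in self.tombstones)

    def text(self) -> str:
        return "".join(self.chars[i] for i in self._visible_ids())

    def insert(self, index: int, char: str):
        """Create an id that sorts between the visible neighbours."""
        ids = self._visible_ids()
        lo = ids[index - 1][0] if index > 0 else Fraction(0)
        hi = ids[index][0] if index < len(ids) else Fraction(1)
        op = ("ins", ((lo + hi) / 2, self.site, next(self._counter)), char)
        self.apply(op)
        return op

    def delete(self, index: int):
        op = ("del", self._visible_ids()[index], None)
        self.apply(op)
        return op

    def apply(self, op) -> None:
        """Order-independent and idempotent: safe to apply in any order."""
        kind, char_id, char = op
        if kind == "ins":
            self.chars[char_id] = char
        else:
            self.tombstones.add(char_id)  # kept even if the insert is not here yet


a, b, c = SeqCRDT("a"), SeqCRDT("b"), SeqCRDT("c")
history = [a.insert(0, "A"), a.insert(1, "C")]
for op in history:
    b.apply(op)

# Concurrent edits: A inserts "B", B inserts "X" and deletes "C".
op_a = a.insert(1, "B")
op_b = b.insert(1, "X")
op_d = b.delete(2)
a.apply(op_d)
a.apply(op_b)
b.apply(op_a)

# Replica c receives everything in reverse order, deletes before inserts.
for op in reversed(history + [op_a, op_b, op_d]):
    c.apply(op)

print(a.text(), b.text(), c.text())  # ABX ABX ABX
```

Note that the tombstone set only grows in this sketch, which is the metadata cost the trade-off above refers to.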
System Architecture (OT-based, Google Docs style)
Client → WebSocket → Collaboration Server (OT engine) → PostgreSQL (doc state + op log)
Collaboration Server (OT engine) → Pub/Sub (Redis) → Other Clients
- Client generates operation, sends to server with current revision number
- Server serializes writes per document (for example with a per-document lock or a single-writer queue), then transforms the incoming op against any ops committed since that revision
- Applies transformed op, increments revision, persists to DB
- Broadcasts operation to all other connected clients via pub/sub (Redis)
- Clients apply the broadcast operation to their local doc
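The server-side steps above can be sketched in memory as follows. Everything here is a simplification made for the example: `DocumentServer` is a hypothetical class, the op log is a Python list rather than PostgreSQL, subscriber callbacks stand in for Redis pub/sub, only insert operations are supported, and each client is assumed to wait for an acknowledgement before sending its next op.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int
    char: str
    site: str  # hypothetical client id, used to break position ties


def transform(op: Insert, other: Insert) -> Insert:
    """Shift `op` right if `other` landed before it (lower site id wins ties)."""
    shift = other.pos < op.pos or (other.pos == op.pos and other.site < op.site)
    return Insert(op.pos + 1, op.char, op.site) if shift else op


class DocumentServer:
    def __init__(self, text: str = ""):
        self.text = text
        self.log = []            # committed ops; current revision == len(log)
        self.subscribers = []    # callbacks standing in for pub/sub
        self._lock = threading.Lock()

    def submit(self, op: Insert, base_revision: int):
        """Transform, apply, and record one client op; return (revision, op)."""
        with self._lock:  # one writer per document at a time
            if not 0 <= base_revision <= len(self.log):
                raise ValueError("unknown base revision")
            # Catch the op up with everything committed since the client's view.
            for committed in self.log[base_revision:]:
                op = transform(op, committed)
            self.text = self.text[:op.pos] + op.char + self.text[op.pos:]
            self.log.append(op)
            revision = len(self.log)
        for notify in self.subscribers:
            notify(revision, op)  # broadcast the transformed op
        return revision, op


server = DocumentServer("AC")
broadcasts = []
server.subscribers.append(lambda rev, op: broadcasts.append((rev, op)))

# Both clients edited revision 0; the server receives A's op first.
server.submit(Insert(1, "B", site="A"), base_revision=0)
server.submit(Insert(1, "X", site="B"), base_revision=0)
print(server.text)  # ABXC
```

Broadcasting the transformed op together with its new revision lets every client know which server state the op applies to.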
Presence and Cursor Sharing
Presence shows where each user’s cursor is in real time. Send cursor updates on a separate, low-priority channel from document operations. Cursor positions are transient: store them in Redis with a short TTL (5 seconds), not in the persistent document store. Each client refreshes its cursor with a WebSocket heartbeat sent more often than the TTL. If a client disconnects, its cursor disappears when the Redis key expires.
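The expiry behaviour can be sketched with an in-memory store that mimics a key with a TTL. `PresenceStore` is a hypothetical stand-in for Redis (where the equivalent write would be a `SET` with the `EX` option); the injectable `clock` parameter exists only so the example can simulate time passing.

```python
import time


class PresenceStore:
    """In-memory stand-in for cursor keys that expire after a short TTL."""

    def __init__(self, ttl_seconds: float = 5.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._cursors = {}  # (doc_id, user_id) -> (position, expires_at)

    def heartbeat(self, doc_id: str, user_id: str, position: int) -> None:
        """Record the cursor and push its expiry forward."""
        self._cursors[(doc_id, user_id)] = (position, self.clock() + self.ttl)

    def cursors(self, doc_id: str) -> dict:
        """Return live cursors for a document, dropping expired entries."""
        now = self.clock()
        self._cursors = {k: v for k, v in self._cursors.items() if v[1] > now}
        return {
            user: pos
            for (doc, user), (pos, _) in self._cursors.items()
            if doc == doc_id
        }


# Simulated clock so the example does not need to sleep.
now = [0.0]
store = PresenceStore(ttl_seconds=5.0, clock=lambda: now[0])

store.heartbeat("doc-1", "alice", 3)
now[0] = 4.0
store.heartbeat("doc-1", "bob", 7)
print(store.cursors("doc-1"))  # {'alice': 3, 'bob': 7}

now[0] = 5.5  # alice has not sent a heartbeat for more than 5 seconds
print(store.cursors("doc-1"))  # {'bob': 7}
```

No explicit "user left" message is needed: a client that stops sending heartbeats simply ages out.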
Offline Editing and Sync
Users can keep editing while offline (for example, in airplane mode). On reconnect, the client sends all buffered operations along with the revision they were based on. The server transforms them, in order, against every operation committed since that revision and applies the results. OT resolves this deterministically: offline edits merge without dropping either side’s changes, as long as the operation history is preserved. With a large divergence (many offline edits), the server may need to transform against 100+ concurrent ops; the cost is bounded by offline duration and edit rate.
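Rebasing a batch of buffered edits can be sketched like this. It is a sketch under stated assumptions: operations are single-character inserts only, `rebase` is a hypothetical helper, and ties are broken by comparing `site` ids. The subtle point it shows is that the server-side ops must also be transformed against each client op as the loop advances, so that later client ops are compared against the right state.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Insert:
    pos: int
    char: str
    site: str  # hypothetical replica id, used to break position ties


def apply_all(doc: str, ops: List[Insert]) -> str:
    for op in ops:
        doc = doc[:op.pos] + op.char + doc[op.pos:]
    return doc


def transform(op: Insert, other: Insert) -> Insert:
    """Rewrite `op` so it can be applied after the concurrent `other`."""
    shift = other.pos < op.pos or (other.pos == op.pos and other.site < op.site)
    return Insert(op.pos + 1, op.char, op.site) if shift else op


def rebase(
    client_ops: List[Insert], server_ops: List[Insert]
) -> Tuple[List[Insert], List[Insert]]:
    """Return (client ops to run after server_ops, server ops to run after client_ops)."""
    rebased = []
    server_ops = list(server_ops)
    for c in client_ops:
        next_server = []
        for s in server_ops:
            # Transform both directions from the same pair, then advance.
            s_after_c = transform(s, c)
            c = transform(c, s)
            next_server.append(s_after_c)
        rebased.append(c)
        server_ops = next_server
    return rebased, server_ops


base = "AC"
# Committed on the server while the client was offline.
server_ops = [Insert(0, "X", "bob"), Insert(3, "Y", "bob")]
# Buffered on the offline client, each based on the previous local state.
client_ops = [Insert(1, "B", "alice"), Insert(3, "D", "alice")]

for_server, for_client = rebase(client_ops, server_ops)
server_doc = apply_all(apply_all(base, server_ops), for_server)
client_doc = apply_all(apply_all(base, client_ops), for_client)
print(server_doc, client_doc)  # XABCDY XABCDY
```

The nested loop makes the cost visible: the work is proportional to the number of buffered ops times the number of ops committed while the client was away.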
Persistent Storage
Store the full operation log (not just the current document state). Current state = apply all ops from revision 0. Benefits: undo history (up to any revision), version history (show doc at any point in time), audit log. Snapshot periodically (every 1000 ops) to avoid replaying from revision 0 on every load.
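Loading from the nearest snapshot and replaying the tail of the log can be sketched as follows. `OpLogStore` is a hypothetical in-memory class used for illustration; ops are reduced to `(pos, text)` inserts, and the snapshot interval is a parameter so the example can use a small value instead of 1000.

```python
class OpLogStore:
    """Append-only op log with periodic full-text snapshots."""

    def __init__(self, snapshot_every: int = 1000):
        self.snapshot_every = snapshot_every
        self.log = []               # log[i] takes revision i to revision i + 1
        self.snapshots = {0: ""}    # revision -> full document text

    def append(self, pos: int, text: str) -> int:
        """Record an insert op and snapshot when the interval is reached."""
        self.log.append((pos, text))
        revision = len(self.log)
        if revision % self.snapshot_every == 0:
            self.snapshots[revision] = self.load(revision)
        return revision

    def load(self, revision=None) -> str:
        """Rebuild the document at `revision` from the nearest earlier snapshot."""
        if revision is None:
            revision = len(self.log)
        base = max(r for r in self.snapshots if r <= revision)
        doc = self.snapshots[base]
        for pos, text in self.log[base:revision]:
            doc = doc[:pos] + text + doc[pos:]
        return doc


store = OpLogStore(snapshot_every=3)
for i, ch in enumerate("abcd"):
    store.append(i, ch)

print(store.load())             # abcd  (snapshot at revision 3 + one op)
print(store.load(2))            # ab    (version history: any past revision)
print(sorted(store.snapshots))  # [0, 3]
```

With snapshots in place, a load replays at most `snapshot_every - 1` ops instead of the whole history.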
Interview Tips
- OT vs. CRDT: OT in practice relies on a central server to order operations; CRDTs converge without one but accumulate metadata.
- Operation log is the source of truth; current state is derived. This gives version history almost “for free” and provides the raw material for undo.
- Cursor sharing is presence, not document state — different latency and persistence requirements.
- Snapshots prevent O(N) replay on every document load — periodic snapshot + apply subsequent ops.