Low-Level Design: Collaborative Document Editor — Operational Transform, CRDT, and Conflict Resolution

Core Requirements

A collaborative document editor lets multiple users edit the same document simultaneously, with changes reflected on all clients in near real time. Key challenges: (1) Conflict resolution: two users edit the same part of the document at the same time. (2) Consistency: all clients converge to the same document state. (3) Offline support: users can edit while disconnected, and changes sync on reconnect. (4) Latency: local edits are applied optimistically so they feel instantaneous, while remote edits arrive after a short delay. There are two main approaches: Operational Transformation (OT), used by Google Docs, and Conflict-free Replicated Data Types (CRDTs), implemented by libraries such as Yjs and Automerge and commonly associated with tools such as Figma (which describes its approach as CRDT-inspired) and Notion.

Data Model

Document: doc_id, title, owner_id, created_at, updated_at, version (monotonic counter).

Operation: op_id (UUID), doc_id, user_id, op_type (INSERT, DELETE, RETAIN), position, content (for INSERT), length (for DELETE), parent_version (the document version this op was based on), applied_at.

Snapshot: snapshot_id, doc_id, version, content (full document text at that version), created_at.

Collaborator: user_id, doc_id, permission (VIEW, COMMENT, EDIT), cursor_position, last_seen_at.

Revision history: the sequence of operations, from which any historical version can be reconstructed.
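The Operation and Snapshot records above can be sketched as Python dataclasses. This is a minimal illustration, not a schema taken from any real system: the field names follow the data model, while the `OpType` enum, the validation in `__post_init__`, and the default factories are assumptions added for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional
import uuid


class OpType(Enum):
    INSERT = "INSERT"
    DELETE = "DELETE"
    RETAIN = "RETAIN"


@dataclass
class Operation:
    doc_id: str
    user_id: str
    op_type: OpType
    position: int
    parent_version: int                    # document version this op was based on
    content: Optional[str] = None          # set for INSERT
    length: Optional[int] = None           # set for DELETE
    op_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    applied_at: Optional[datetime] = None  # filled in once the server applies the op

    def __post_init__(self) -> None:
        # Hypothetical validation: enforce the per-type fields from the data model.
        if self.op_type is OpType.INSERT and not self.content:
            raise ValueError("INSERT requires content")
        if self.op_type is OpType.DELETE and (self.length is None or self.length <= 0):
            raise ValueError("DELETE requires a positive length")


@dataclass
class Snapshot:
    doc_id: str
    version: int
    content: str                           # full document text at `version`
    snapshot_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
```

Keeping parent_version on every operation is what lets the server work out which concurrent operations an incoming edit still has to be transformed against.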

Operational Transformation

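OT resolves conflicts by transforming each operation against the operations that ran concurrently with it. Example: the document is “Hello”. User A inserts “ World” at position 5. Concurrently, User B deletes “H” at position 0. Without OT, the replicas disagree: the server applies A’s insert and then B’s delete and gets “ello World”, but B’s client has already shrunk its copy to “ello”, so A’s insert at position 5 now points past the end of the text. With OT, each operation is adjusted for the one applied before it. B’s delete at position 0 is unaffected by A’s insert, because the insert lands after it and shifts nothing at position 0. A’s insert, transformed against B’s delete, moves from position 5 to position 4, because one character before it was removed. Both orders then produce “ello World”.

The server applies all operations sequentially. Every client sends each operation together with the base version it was made against. The server transforms the operation against every operation applied since that version, then applies the transformed result and broadcasts it. Clients apply server-confirmed operations, transforming them against any local operations that have not yet been acknowledged. The server is the arbiter: it defines the canonical order.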

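def transform_insert_against_insert(op1, op2):
    # op1: INSERT at pos1, op2: INSERT at pos2 (concurrent).
    # Returns op1 adjusted for op2 having been applied first.
    # Equal positions are broken by user_id so that both replicas pick the
    # same order; shifting on every tie would make them diverge.
    tie = op2.position == op1.position and op2.user_id < op1.user_id
    if op2.position < op1.position or tie:
        op1.position += len(op2.content)
    return op1

def transform_insert_against_delete(op1, op2):
    # op1: INSERT at pos1, op2: DELETE at pos2 covering op2.length characters.
    # Returns op1 adjusted for op2 having been applied first.
    if op2.position < op1.position:
        # Shift left by the deleted length, but never past the start of the
        # deleted range (the insert point was inside the deleted text).
        op1.position = max(op2.position,
                           op1.position - op2.length)
    return op1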
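
The server-side loop described above (transform against everything applied since the client's base version, then apply) can be sketched end to end. This is a simplified illustration under stated assumptions, not a production OT engine: the `Op` class, `apply_op`, `transform`, and `Server` are hypothetical names, deletes always remove exactly one character so that overlapping ranges never need splitting, and ties between inserts at the same position are broken by user_id.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Op:
    kind: str          # "insert" or "delete"
    position: int
    user_id: str
    content: str = ""  # inserted text; a delete removes exactly one character


def apply_op(text: str, op: Optional[Op]) -> str:
    if op is None:  # cancelled by a transform
        return text
    if op.kind == "insert":
        return text[:op.position] + op.content + text[op.position:]
    return text[:op.position] + text[op.position + 1:]


def transform(op: Op, other: Op) -> Optional[Op]:
    """Return `op` rewritten so it can run after the concurrent `other`."""
    if other.kind == "insert":
        if op.kind == "insert":
            # Tie at the same position: the lower user_id goes first.
            shift = (other.position, other.user_id) < (op.position, op.user_id)
        else:
            shift = other.position <= op.position
        return replace(op, position=op.position + len(other.content)) if shift else op
    # `other` is a single-character delete.
    if op.kind == "delete" and op.position == other.position:
        return None  # both users deleted the same character
    if other.position < op.position:
        return replace(op, position=op.position - 1)
    return op


class Server:
    def __init__(self, text: str):
        self.text = text
        self.history: List[Optional[Op]] = []  # history[i] takes version i to i + 1

    def receive(self, op: Op, parent_version: int) -> Optional[Op]:
        current: Optional[Op] = op
        for applied in self.history[parent_version:]:
            if current is not None and applied is not None:
                current = transform(current, applied)
        self.text = apply_op(self.text, current)
        self.history.append(current)
        return current  # the form that would be broadcast to other clients


server = Server("Hello")
a = Op("insert", 5, "A", " World")
b = Op("delete", 0, "B")
server.receive(a, parent_version=0)
server.receive(b, parent_version=0)

# Client B applied its delete locally, then receives A's insert from the server.
client_b = apply_op(apply_op("Hello", b), transform(a, b))
print(server.text, "|", client_b)
```

In this run the server and client B both end with “ello World”, matching the “Hello” example. Supporting multi-character deletes would require splitting a delete whose range straddles a concurrent insert, which this sketch deliberately leaves out.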

CRDT Alternative

CRDTs (Conflict-free Replicated Data Types) are data structures designed so that replicas can be merged without conflicts. For text editing, Logoot and LSEQ assign each character a unique position identifier drawn from a dense order, which can be pictured as a fraction: between positions 1 and 2, insert at 1.5; between 1.5 and 2, insert at 1.75. Identifiers are globally unique and totally ordered. Each identifier also carries the ID of the replica that created it, so two clients that insert at the same spot and pick the same fraction still get distinct identifiers with a deterministic order, and no conflict arises. Delete marks the character as a tombstone rather than removing it from the position list immediately. Merge takes the union of all character sets and sorts by position.

CRDT pros: no central server is needed for conflict resolution (it works peer-to-peer), offline editing is supported naturally, and convergence is easier to reason about. CRDT cons: the document representation grows as tombstones accumulate, position identifiers can become very long after many inserts at the same spot, and the data structures are harder to implement correctly. Modern libraries such as Yjs and Automerge use compact encodings to keep this overhead manageable.
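The fractional-position idea above can be illustrated with a small sketch. It is a teaching simplification, not Logoot or LSEQ themselves: identifiers here are a single `(fraction, site_id)` pair, where the site ID is an assumption added so that two replicas choosing the same fraction still get distinct, ordered identifiers. Real schemes use variable-length identifiers, so this sketch raises an error when two neighbours leave no room between them. All function names are hypothetical.

```python
from fractions import Fraction
from typing import Dict, Optional, Tuple

CharId = Tuple[Fraction, str]             # (fractional position, site id)
Replica = Dict[CharId, Tuple[str, bool]]  # id -> (character, is_tombstone)

BEGIN, END = Fraction(0), Fraction(1)     # virtual positions before/after all text


def insert_between(replica: Replica, site: str, left: Optional[CharId],
                   right: Optional[CharId], char: str) -> CharId:
    """Insert `char` between two existing ids (None means a document edge)."""
    lo = left[0] if left else BEGIN
    hi = right[0] if right else END
    new_id = ((lo + hi) / 2, site)
    if lo >= hi or new_id in replica:
        # Neighbours that share a fraction need longer identifiers, which
        # Logoot and LSEQ provide; this simplified scheme does not.
        raise ValueError("no room between neighbours in this simplified scheme")
    replica[new_id] = (char, False)
    return new_id


def delete(replica: Replica, char_id: CharId) -> None:
    char, _ = replica[char_id]
    replica[char_id] = (char, True)       # tombstone: keep the id, hide the char


def merge(a: Replica, b: Replica) -> Replica:
    """Union of both character sets; a tombstone on either side wins."""
    merged = dict(a)
    for char_id, (char, dead) in b.items():
        _, already_dead = merged.get(char_id, (char, False))
        merged[char_id] = (char, dead or already_dead)
    return merged


def render(replica: Replica) -> str:
    return "".join(char for _, (char, dead) in sorted(replica.items()) if not dead)


# Shared starting state "ac", then two replicas edit concurrently.
base: Replica = {}
first = insert_between(base, "A", None, None, "a")
last = insert_between(base, "A", first, None, "c")
replica_a, replica_b = dict(base), dict(base)
insert_between(replica_a, "A", first, last, "X")   # A inserts between a and c
insert_between(replica_b, "B", first, last, "Y")   # B inserts at the same spot
delete(replica_b, first)                           # B also deletes "a"
print(render(merge(replica_a, replica_b)), render(merge(replica_b, replica_a)))
```

Both merge orders render the same text because merge is a set union with an order-independent rule for tombstones, which is the property that removes the need for a central arbiter.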

Real-Time Synchronization

Each active collaborator holds one WebSocket connection. The server maintains a presence map: doc_id → {user_id → {cursor_position, last_active}}. On edit, the client sends the operation over its WebSocket, the server applies the OT transform, and the transformed operation is broadcast to all other clients in the document room.

Cursor broadcast: cursor position updates are throttled to one every 100 ms. Collaborator cursors are shown in the editor UI with each user’s name and color.

Reconnection: the client keeps a local queue of unacknowledged operations. On reconnect, it sends all pending operations along with the last acknowledged version, and the server returns the operations the client missed since that version.

Snapshot compaction: after every N operations (e.g., 1000), the server creates a snapshot of the full document text. This bounds the replay cost on reconnect, and it bounds the storage cost of operation history if operations older than the snapshot are discarded.
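Snapshot compaction and reconnect replay can be sketched as a small operation log. This is an illustrative sketch under simplifying assumptions: operations are insert-only `(position, text)` tuples, everything lives in memory, and `DocumentLog`, `text_at`, and `ops_since` are hypothetical names rather than part of any real library.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Insert = Tuple[int, str]  # (position, text): inserts only, to keep the sketch short


def apply_insert(text: str, op: Insert) -> str:
    position, content = op
    return text[:position] + content + text[position:]


@dataclass
class DocumentLog:
    snapshot_every: int = 1000                       # the N from the text above
    text: str = ""                                   # current document text
    ops: List[Insert] = field(default_factory=list)  # ops[i] takes version i to i + 1
    snapshots: Dict[int, str] = field(default_factory=lambda: {0: ""})

    @property
    def version(self) -> int:
        return len(self.ops)

    def append(self, op: Insert) -> int:
        self.text = apply_insert(self.text, op)
        self.ops.append(op)
        if self.version % self.snapshot_every == 0:
            self.snapshots[self.version] = self.text
        return self.version

    def text_at(self, version: int) -> str:
        # Start from the newest snapshot at or before `version`, then replay
        # at most `snapshot_every - 1` operations on top of it.
        base = max(v for v in self.snapshots if v <= version)
        text = self.snapshots[base]
        for op in self.ops[base:version]:
            text = apply_insert(text, op)
        return text

    def ops_since(self, last_acked_version: int) -> List[Insert]:
        # The operations a reconnecting client has not seen yet.
        return self.ops[last_acked_version:]


log = DocumentLog(snapshot_every=3)
for index, char in enumerate("abcdefg"):
    log.append((index, char))
print(log.text_at(5), log.ops_since(5))
```

A reconnecting client that last acknowledged version 5 would receive the two operations from `ops_since(5)`, while `text_at` shows how any historical version is rebuilt from the nearest earlier snapshot instead of from version 0.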
