The Problem: Concurrent Edits
In real-time collaborative editing (Google Docs, Notion, Figma), multiple users edit the same document simultaneously. The core challenge: User A inserts “X” at position 5 while User B simultaneously deletes the character at position 7. By the time B’s operation reaches A’s machine, A’s insert has shifted all positions by 1 — B’s operation now targets the wrong position. Naive last-write-wins produces corrupted documents.
Approach 1: Operational Transformation (OT)
OT transforms operations relative to each other before applying. When operation B arrives after A was already applied, B is transformed to account for A’s effect on the document positions.
class Op:
INSERT = 'insert'
DELETE = 'delete'
def transform(op_a, op_b):
"""
Transform op_b assuming op_a was already applied.
Returns transformed op_b that is valid after op_a.
"""
if op_a.type == Op.INSERT and op_b.type == Op.INSERT:
if op_a.pos <= op_b.pos:
return Op(Op.INSERT, op_b.pos + 1, op_b.char) # shift right
return op_b
if op_a.type == Op.INSERT and op_b.type == Op.DELETE:
if op_a.pos <= op_b.pos:
return Op(Op.DELETE, op_b.pos + 1) # shift right
return op_b
if op_a.type == Op.DELETE and op_b.type == Op.INSERT:
if op_a.pos < op_b.pos:
return Op(Op.INSERT, op_b.pos - 1, op_b.char) # shift left
return op_b
if op_a.type == Op.DELETE and op_b.type == Op.DELETE:
if op_a.pos == op_b.pos:
return None # already deleted, no-op
if op_a.pos < op_b.pos:
return Op(Op.DELETE, op_b.pos - 1) # shift left
return op_b
OT requires a central server to serialize operations and track the full history of transformations. Used by Google Docs (C++ implementation called “wave”) and Apache Wave.
Approach 2: CRDTs (Conflict-free Replicated Data Types)
CRDTs are data structures that can be merged automatically without conflicts. For text editing, the key insight: instead of tracking positions (which shift), give every character a unique, stable identity and a fractional position between its neighbors.
Logoot / LSEQ: Fractional Indexing
import uuid
class Character:
def __init__(self, value: str, position: tuple, site_id: str, clock: int):
self.id = (position, site_id, clock) # globally unique identifier
self.value = value
self.position = position # fractional position tuple
class CRDTDocument:
def __init__(self):
# Sorted by position — begin and end sentinels
self.chars = []
def insert_between(self, left_char, right_char, value: str, site_id: str, clock: int):
new_pos = self._generate_position(
left_char.position if left_char else (),
right_char.position if right_char else (float('inf'),)
)
char = Character(value, new_pos, site_id, clock)
self._insert_sorted(char)
return char
def delete(self, char_id):
# Mark as tombstone (soft delete) — never remove from the sequence
# Tombstones preserve position ordering without shifting
char = self._find(char_id)
if char:
char.deleted = True
def get_text(self) -> str:
return ''.join(c.value for c in self.chars if not getattr(c, 'deleted', False))
Architecture
Client-Server with WebSockets
Client A Server Client B
| | |
|--[op: insert X @5]-->| |
| |--[broadcast op_a]------>|
| | (transform for B) |
|<-[ack: version 42]---| |
| |<--[op: delete @7]-------|
|<-[broadcast op_b]----| |
| (transform for A) | |
Document Session Service
class DocumentSession:
def __init__(self, doc_id: str):
self.doc_id = doc_id
self.history = [] # ordered operation log
self.version = 0
self.connected_users = {} # user_id -> websocket
def apply_operation(self, op, client_version: int, user_id: str):
# Transform op against all operations since client_version
concurrent_ops = self.history[client_version:]
transformed_op = op
for server_op in concurrent_ops:
transformed_op = transform(transformed_op, server_op)
if transformed_op:
self.history.append(transformed_op)
self.version += 1
self._broadcast(transformed_op, exclude=user_id)
return self.version
def _broadcast(self, op, exclude: str):
for uid, ws in self.connected_users.items():
if uid != exclude:
ws.send(op.serialize())
Presence and Cursor Sharing
Show where each collaborator’s cursor is in real time. Each client broadcasts cursor position on every keystroke. The server relays cursor positions to all other connected clients. Cursor positions are ephemeral — no persistence needed. Store in Redis: HSET doc:{id}:cursors {user_id} {position}, expire with user’s WebSocket TTL. Clients display colored cursors with user names at the broadcast positions.
Persistence and Snapshots
- Operation log: all operations persisted to a database (immutable append-only). Enables replay from any version.
- Snapshots: periodically checkpoint the document state (every N operations or every 5 minutes). On load, apply snapshot + subsequent operations instead of replaying all history. Reduces load time from O(history) to O(recent_ops).
- Compression: consecutive single-character inserts can be merged into a single insert operation for a string. Run a compaction job offline on the operation log.
OT vs CRDT Trade-offs
| Property | OT | CRDT (Logoot/Yjs) |
|---|---|---|
| Central server | Required (serializes ops) | Optional (P2P possible) |
| Memory | Lower (no per-char metadata) | Higher (tombstones + IDs) |
| Complexity | High (transform functions for each op pair) | Moderate (merge is automatic) |
| Offline support | Limited | Excellent (merge on reconnect) |
| Used by | Google Docs, Etherpad | Notion, Figma, VS Code Live Share (Yjs) |
Interview Questions
Q: Why can’t you use last-write-wins for collaborative editing?
Last-write-wins discards concurrent edits based on timestamp. If User A and User B both edit the same paragraph simultaneously, one edit is simply overwritten. Users lose work without warning. The correct approach (OT or CRDT) preserves all edits by transforming them to be compatible, so both edits are reflected in the final document.
Q: How does Google Docs handle offline editing?
Google Docs stores operations locally when offline. On reconnect, the local operations are sent to the server with the version number at which the client went offline. The server transforms each local operation against all concurrent server operations (using OT), applies the transformed operations, and returns the transformed server ops to the client. The client then fast-forwards its document to the server state. This is the same OT algorithm used for online concurrent editing.
Asked at: Atlassian Interview Guide
Asked at: LinkedIn Interview Guide
Asked at: Meta Interview Guide
Asked at: Databricks Interview Guide
Asked at: Twitter/X Interview Guide