System Design Interview: Design a Real-Time Collaborative Whiteboard (Miro/Figma)

What Is a Real-Time Collaborative Whiteboard?

A collaborative whiteboard (like Miro, FigJam, or Excalidraw) lets multiple users simultaneously draw, add shapes, text, and sticky notes on an infinite canvas. All participants see changes in real time. The key challenges: synchronizing concurrent edits without conflicts, supporting undo/redo across users, handling millions of canvas objects efficiently, and scaling to thousands of concurrent boards.

    System Requirements

    Functional

    • Users can draw freehand, add shapes, text, images, and sticky notes
    • All changes propagate to other connected users in <100ms
    • Infinite canvas with zoom and pan
    • Undo/redo per user (not global)
    • Persistent: board state survives disconnections and server restarts
    • Cursor presence: see other users’ cursors in real time

    Non-Functional

    • Latency: <100ms for operation propagation
    • Scale: 1000 concurrent users per board (large enterprise sessions)
    • Boards can have millions of objects (large diagrams)

    Core Data Model

    boards: id, owner_id, name, created_at
    board_elements: id, board_id, type (shape/text/image/path),
                    x, y, width, height, style, content,
                    version, created_by, updated_at
    board_operations: id, board_id, user_id, op_type, element_id,
                      payload (JSON), timestamp, vector_clock
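
    As a rough illustration, the element and operation records above could be mirrored in application code like this. The field types and defaults are assumptions; the schema above does not specify them.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class BoardElement:
    id: str
    board_id: str
    type: str                 # "shape", "text", "image", or "path"
    x: float
    y: float
    width: float
    height: float
    style: Dict[str, Any] = field(default_factory=dict)  # assumed to be a key/value map
    content: str = ""
    version: int = 0          # incremented on every applied edit
    created_by: str = ""
    updated_at: float = 0.0   # assumed: Unix timestamp in seconds


@dataclass
class BoardOperation:
    id: str
    board_id: str
    user_id: str
    op_type: str              # e.g. "create", "update", "delete" (assumed names)
    element_id: str
    payload: Dict[str, Any]   # JSON-serializable change description
    timestamp: float
    vector_clock: Dict[str, int] = field(default_factory=dict)  # replica id -> counter
```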
    

    Real-Time Sync: WebSocket Architecture

    Each client connects to a whiteboard server via WebSocket. When a user draws a shape:

    1. Client sends operation to the WebSocket server
    2. Server broadcasts to all other clients connected to the same board
    3. Server persists the operation asynchronously (Kafka → storage worker)

    Scaling across multiple WebSocket servers: use Redis pub/sub. Each server that holds at least one connection for a board subscribes to the channel board:{board_id}. When a server receives an operation, it publishes it to that channel; every subscribed server receives it and fans it out to its own connected clients. For example, with 1000 users on one board spread evenly across 20 servers, each server holds about 50 connections, Redis delivers each message to all 20 servers, and each server pushes it to its 50 clients.
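
    The fan-out path can be sketched without a real Redis instance. In the simulation below, InMemoryBroker is a hypothetical stand-in for Redis pub/sub, WhiteboardServer is an illustrative class, and plain lists stand in for WebSocket connections; none of these names come from a real library.

```python
class InMemoryBroker:
    """Stand-in for Redis pub/sub: every subscriber of a channel gets each message."""

    def __init__(self):
        self.subscribers = {}  # channel -> list of callbacks

    def subscribe(self, channel, callback):
        self.subscribers.setdefault(channel, []).append(callback)

    def publish(self, channel, message):
        for callback in self.subscribers.get(channel, []):
            callback(message)


class WhiteboardServer:
    def __init__(self, name, broker):
        self.name = name
        self.broker = broker
        self.clients = {}  # board_id -> {client_id: inbox list standing in for a socket}

    def connect(self, board_id, client_id):
        if board_id not in self.clients:
            # First local client for this board: subscribe once per server.
            self.clients[board_id] = {}
            self.broker.subscribe(f"board:{board_id}", self._fan_out)
        inbox = []
        self.clients[board_id][client_id] = inbox
        return inbox

    def receive_from_client(self, board_id, sender_id, op):
        # Publish instead of broadcasting locally so all servers see the op.
        message = {"board_id": board_id, "sender": sender_id, "op": op}
        self.broker.publish(f"board:{board_id}", message)

    def _fan_out(self, message):
        for client_id, inbox in self.clients.get(message["board_id"], {}).items():
            if client_id != message["sender"]:  # do not echo to the author
                inbox.append(message["op"])


broker = InMemoryBroker()
servers = [WhiteboardServer(f"ws-{i}", broker) for i in range(20)]
inboxes = {user: servers[user % 20].connect("b1", user) for user in range(1000)}
servers[0].receive_from_client("b1", 0, {"op_type": "create", "element_id": "e1"})
delivered = sum(len(inbox) for inbox in inboxes.values())
```

    With 1000 users spread over 20 servers, one published operation reaches the 999 other clients, and the broker only has to deliver it 20 times.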

    Conflict Resolution: OT vs. CRDTs

    Operational Transformation (OT)

    OT transforms concurrent operations against each other so that all clients converge on the same state. If User A moves a shape to (100, 200) while User B simultaneously deletes it, the transformation rules determine the outcome (for example, the delete wins and the move becomes a no-op). Google Docs uses OT for text, but it is complex to implement correctly for arbitrary data types, and in practice it relies on a central server to sequence and transform operations.
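
    A toy version of the move-versus-delete case might look like the following. It assumes a "delete wins" policy and element-level operations only; the operation shapes and function names are illustrative, and production OT for text is far more involved.

```python
def transform(op, applied):
    """Rewrite `op` so it can be applied after the concurrent op `applied`."""
    if op["element_id"] != applied["element_id"]:
        return op  # operations on different elements never conflict
    if applied["type"] == "delete":
        # Assumed policy: the target no longer exists, so the op becomes a no-op.
        return {"type": "noop", "element_id": op["element_id"]}
    return op  # otherwise the later-sequenced op simply overwrites


def apply_op(elements, op):
    if op["type"] == "move" and op["element_id"] in elements:
        elements[op["element_id"]] = (op["x"], op["y"])
    elif op["type"] == "delete":
        elements.pop(op["element_id"], None)


move = {"type": "move", "element_id": "s1", "x": 100, "y": 200}
delete = {"type": "delete", "element_id": "s1"}

# Server sequences the delete first, then transforms the concurrent move.
order_a = {"s1": (0, 0)}
apply_op(order_a, delete)
apply_op(order_a, transform(move, delete))

# Server sequences the move first, then transforms the concurrent delete.
order_b = {"s1": (0, 0)}
apply_op(order_b, move)
apply_op(order_b, transform(delete, move))
```

    Both orderings end with the shape deleted, which is the convergence property the transformation is meant to guarantee.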

    CRDTs (Conflict-Free Replicated Data Types)

    CRDTs are data structures where all concurrent operations converge to the same result without coordination. For a whiteboard:

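    • Add element: add-wins set (an observed-remove set), where an add wins over a concurrent delete of the same element
    • Move element: Last-Write-Wins (LWW) register on position, ordered by timestamp or vector clock, with a deterministic tiebreaker (such as client ID) for writes that are concurrent or carry equal timestamps
    • Freehand drawing: grow-only set (G-Set) of path points; a stroke is normally written by a single user, so appends never conflict, but each point needs a sequence index because a set is unordered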
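
    The first two structures can be sketched in a few lines. This is a minimal illustration rather than a production CRDT library: the class names, the (timestamp, replica_id) ordering, and the caller-supplied add tags are all assumptions.

```python
class LWWRegister:
    """Last-write-wins register; ties are broken by comparing replica ids."""

    def __init__(self):
        self.value = None
        self.stamp = (0, "")  # (logical timestamp, replica id)

    def set(self, value, timestamp, replica_id):
        stamp = (timestamp, replica_id)
        if stamp > self.stamp:
            self.value, self.stamp = value, stamp

    def merge(self, other):
        if other.stamp > self.stamp:
            self.value, self.stamp = other.value, other.stamp


class AddWinsSet:
    """Observed-remove set: a remove cancels only the add tags it has seen."""

    def __init__(self):
        self.adds = {}        # element_id -> set of unique add tags
        self.removed = set()  # add tags that have been tombstoned

    def add(self, element_id, tag):
        self.adds.setdefault(element_id, set()).add(tag)

    def remove(self, element_id):
        self.removed |= self.adds.get(element_id, set())

    def contains(self, element_id):
        return bool(self.adds.get(element_id, set()) - self.removed)

    def merge(self, other):
        for element_id, tags in other.adds.items():
            self.adds.setdefault(element_id, set()).update(tags)
        self.removed |= other.removed


# Replicas A and B both know "rect1"; A deletes it while B concurrently re-adds it.
a, b = AddWinsSet(), AddWinsSet()
a.add("rect1", "a-1")
b.add("rect1", "a-1")
a.remove("rect1")
b.add("rect1", "b-1")  # a fresh tag that A's remove never observed
a.merge(b)
b.merge(a)

# Concurrent moves with the same timestamp: the higher replica id wins on both sides.
pos_a, pos_b = LWWRegister(), LWWRegister()
pos_a.set((100, 200), timestamp=5, replica_id="a")
pos_b.set((300, 400), timestamp=5, replica_id="b")
pos_a.merge(pos_b)
pos_b.merge(pos_a)
```

    After merging in both directions, both replicas still contain rect1 and agree on the position (300, 400), without any coordination between them.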

    CRDTs are simpler than OT for non-text collaboration and allow peer-to-peer sync (no central coordinator needed). Excalidraw uses a CRDT-inspired approach. Miro uses server-centric OT-like sequencing.

    Element Versioning and Undo

    Each board_element has a version counter. Operations include the element’s version they were based on. If two users edit the same element concurrently:

    • The first operation to reach the server is applied and increments the version
    • The second operation arrives with a stale base version; the server applies it on top of the current state using LWW, or transforms it against the operations it missed
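
    One way to picture the server-side check is the sketch below, which assumes operations carry only the fields they change and that stale operations are resolved with field-level LWW. The function name and dictionary layout are hypothetical.

```python
PROTECTED_FIELDS = {"id", "version"}  # clients may not overwrite these directly


def apply_update(element, base_version, changes):
    """Apply `changes` to `element`; report whether the op was based on stale state."""
    stale = base_version < element["version"]
    for key, value in changes.items():
        if key not in PROTECTED_FIELDS:
            element[key] = value  # field-level LWW: the latest arrival wins per field
    element["version"] += 1
    return {"version": element["version"], "stale": stale}


shape = {"id": "s1", "x": 0, "y": 0, "color": "blue", "version": 3}

# Both users edited version 3 concurrently; A's operation reaches the server first.
result_a = apply_update(shape, base_version=3, changes={"x": 100, "y": 200})
result_b = apply_update(shape, base_version=3, changes={"color": "red"})
```

    Because the two operations touch different fields, both edits survive: the shape ends at (100, 200) in red, and the server can tell the second client that its operation was applied on top of newer state.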

    Per-user undo: maintain an operation stack per user. “Undo” applies the inverse of that user’s last operation as a new operation: if the user moved a shape from A to B, undo moves it from B back to A. Undo reverts only that one change; edits other users made afterwards are not rolled back, which is why this requires operation inversion rather than full state rollback.
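
    A minimal sketch of per-user undo by operation inversion, limited to move operations. The UndoableBoard class and its method names are hypothetical, and redo is omitted for brevity.

```python
class UndoableBoard:
    def __init__(self, positions):
        self.positions = dict(positions)  # element_id -> (x, y)
        self.undo_stacks = {}             # user_id -> list of inverse operations

    def move(self, user_id, element_id, new_pos):
        old_pos = self.positions[element_id]
        self.positions[element_id] = new_pos
        # Record the inverse ("move back to old_pos") on this user's own stack.
        self.undo_stacks.setdefault(user_id, []).append((element_id, old_pos))

    def undo(self, user_id):
        stack = self.undo_stacks.get(user_id, [])
        if not stack:
            return False  # nothing to undo for this user
        element_id, old_pos = stack.pop()
        if element_id in self.positions:  # the element may have been deleted since
            self.positions[element_id] = old_pos
        return True


board = UndoableBoard({"s1": (0, 0), "s2": (5, 5)})
board.move("alice", "s1", (10, 10))
board.move("bob", "s2", (50, 50))
undone = board.undo("alice")
```

    Alice's undo puts s1 back at (0, 0) while Bob's later move of s2 stays in place, and Bob's own undo stack is untouched.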

    Canvas State Loading

    When a user opens a board, they need the current state of all elements. For boards with millions of objects:

    • Store a periodic snapshot of the board state (serialized JSON of all elements)
    • On load: fetch the latest snapshot + all operations since the snapshot timestamp
    • Apply operations to the snapshot to reconstruct current state
    • This bounds load time regardless of how many total operations exist in history
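
    The load path above can be sketched as a small replay function. The snapshot layout and the "create" / "update" / "delete" operation names are assumptions made for the example.

```python
import copy


def load_board(snapshot, operations):
    """Rebuild current elements from a snapshot plus the operations logged after it."""
    elements = copy.deepcopy(snapshot["elements"])  # never mutate the stored snapshot
    newer = [op for op in operations if op["timestamp"] > snapshot["timestamp"]]
    for op in sorted(newer, key=lambda op: op["timestamp"]):
        element_id = op["element_id"]
        if op["op_type"] == "create":
            elements[element_id] = dict(op["payload"])
        elif op["op_type"] == "update" and element_id in elements:
            elements[element_id].update(op["payload"])
        elif op["op_type"] == "delete":
            elements.pop(element_id, None)
    return elements


snapshot = {"timestamp": 100, "elements": {"a": {"x": 0}}}
operations = [
    {"timestamp": 130, "op_type": "delete", "element_id": "a", "payload": {}},
    {"timestamp": 90, "op_type": "update", "element_id": "a", "payload": {"x": 999}},
    {"timestamp": 110, "op_type": "update", "element_id": "a", "payload": {"x": 5}},
    {"timestamp": 120, "op_type": "create", "element_id": "b", "payload": {"x": 7}},
]
state = load_board(snapshot, operations)
```

    The operation at timestamp 90 is already reflected in the snapshot and is skipped, so only the three newer operations are replayed.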

    Cursor Presence

    Show other users’ cursors moving in real time. Cursor positions are ephemeral and are not persisted. Send them over the same WebSocket connection as operations, but more frequently, throttled to about 30 updates per second per user. On the server, cursor updates are fire-and-forget (no persistence, no acknowledgment). Use a separate Redis pub/sub channel, board:{id}:cursors, so that cursor traffic does not mix with durable operation traffic.
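
    Client-side throttling could look roughly like this. CursorThrottle is a hypothetical helper; the clock is injectable so the behaviour can be checked without sleeping.

```python
import time


class CursorThrottle:
    """Lets through at most `max_rate_hz` cursor updates per second."""

    def __init__(self, max_rate_hz=30.0, clock=time.monotonic):
        self.min_interval = 1.0 / max_rate_hz
        self.clock = clock
        self.last_sent = None  # time of the last update that was let through
        self.pending = None    # newest position that was held back

    def update(self, position):
        """Return the position if it should be sent now, otherwise None."""
        now = self.clock()
        if self.last_sent is None or now - self.last_sent >= self.min_interval:
            self.last_sent = now
            self.pending = None
            return position
        self.pending = position  # keep only the latest; older ones are obsolete
        return None

    def flush(self):
        """Call from a timer so the final resting position is not lost."""
        now = self.clock()
        if self.pending is not None and now - self.last_sent >= self.min_interval:
            position, self.pending = self.pending, None
            self.last_sent = now
            return position
        return None


# Simulate one second of mouse events arriving every millisecond.
fake_now = [0.0]
throttle = CursorThrottle(max_rate_hz=30.0, clock=lambda: fake_now[0])
sent = 0
for ms in range(1000):
    fake_now[0] = ms / 1000.0
    if throttle.update((ms, ms)) is not None:
        sent += 1
```

    Dropping intermediate positions is safe here precisely because cursor state is ephemeral: only the most recent position matters to other users.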

    Viewport and Large Boards

    Boards can be enormous. Only load elements in or near the user’s current viewport. The server accepts a viewport bounding box and returns only elements within it. As the user pans/zooms, load new elements lazily. Use a spatial index (R-tree or quadtree) to efficiently query elements within a bounding box.
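
    A compact quadtree illustrates the viewport query. This sketch makes two simplifying assumptions: the tree covers a fixed region rather than a truly infinite canvas, and each element is indexed by a single anchor point instead of its full bounding box.

```python
class QuadTree:
    """Point quadtree over a fixed region; each element is indexed by one anchor point."""

    def __init__(self, x, y, w, h, capacity=4, depth=0, max_depth=12):
        self.bounds = (x, y, w, h)
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth
        self.points = []      # (px, py, element_id) stored at this node
        self.children = None  # four sub-quadrants once the node splits

    def _contains(self, px, py):
        x, y, w, h = self.bounds
        return x <= px < x + w and y <= py < y + h

    def insert(self, px, py, element_id):
        if not self._contains(px, py):
            return False
        if self.children is None:
            if len(self.points) < self.capacity or self.depth >= self.max_depth:
                self.points.append((px, py, element_id))
                return True
            self._split()
        for child in self.children:
            if child.insert(px, py, element_id):
                return True
        self.points.append((px, py, element_id))  # float rounding at an edge: keep here
        return True

    def _split(self):
        x, y, w, h = self.bounds
        hw, hh = w / 2, h / 2
        self.children = [
            QuadTree(cx, cy, hw, hh, self.capacity, self.depth + 1, self.max_depth)
            for cx, cy in ((x, y), (x + hw, y), (x, y + hh), (x + hw, y + hh))
        ]
        old_points, self.points = self.points, []
        for point in old_points:
            self.insert(*point)  # push existing points down into the new children

    def query(self, qx, qy, qw, qh):
        """Return ids of elements whose anchor lies inside the viewport rectangle."""
        x, y, w, h = self.bounds
        if qx >= x + w or qx + qw <= x or qy >= y + h or qy + qh <= y:
            return []  # viewport does not overlap this node, so skip the subtree
        found = [eid for px, py, eid in self.points
                 if qx <= px < qx + qw and qy <= py < qy + qh]
        for child in self.children or []:
            found.extend(child.query(qx, qy, qw, qh))
        return found


tree = QuadTree(0, 0, 100, 100)
grid = [(i * 10, j * 10, f"e{i}-{j}") for i in range(10) for j in range(10)]
for point in grid:
    tree.insert(*point)
visible = sorted(tree.query(0, 0, 25, 25))
```

    The query skips every quadrant that does not overlap the viewport, so its cost tracks the number of visible elements rather than the size of the board. A production index would also need to handle elements whose bounding boxes straddle quadrant borders, which is where an R-tree is often the more natural fit.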

    Interview Tips

    • Distinguish OT (requires central coordinator, complex transformation logic) from CRDTs (decentralized, simpler for non-text) — knowing this shows depth.
    • Cursor presence is intentionally ephemeral — it should NOT be persisted or included in board state. Separating concerns between durable operations and ephemeral presence is a key insight.
    • Snapshot + operation log for board loading is the same event sourcing pattern used throughout distributed systems.
    • Redis pub/sub for cross-server fan-out is the standard pattern for multi-server WebSocket applications.