System Design Interview: Design a Distributed Message Queue (Kafka)

A distributed message queue decouples producers (services that generate events) from consumers (services that process them), enabling async communication, backpressure handling, and replay. Apache Kafka is the de facto standard — it handles trillions of messages per day at LinkedIn, Netflix, and Uber. This guide covers the design of a Kafka-like system from the ground up.

Requirements

Functional: Producers publish messages to named topics. Consumers subscribe to topics and receive messages. Messages are retained for configurable duration (7 days default). Messages can be replayed from any offset. At-least-once delivery guarantee.

Non-functional: 1 million messages/sec write throughput, sub-10ms publish latency, horizontal scalability, fault tolerance (survive broker failures), ordered messages within a partition.

Core Concepts

  • Topic: Named stream of messages (e.g., “user.clicks”, “order.created”)
  • Partition: A topic is divided into N ordered, immutable logs. Messages within a partition are totally ordered. Partitions enable parallelism.
  • Offset: Sequential integer assigned to each message within a partition. Consumers track their position via offset.
  • Producer: Writes messages to a partition (by key hash or round-robin). Receives ACK after message is durably written.
  • Consumer Group: A group of consumers sharing a topic subscription. Each partition is assigned to exactly one consumer in the group. This gives parallel consumption without duplicate processing.
  • Broker: A server that stores partitions and serves producers/consumers. Kafka cluster = multiple brokers.
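The consumer-group invariant above (each partition owned by exactly one consumer in the group) can be sketched with a toy round-robin assignor. The function name and signature are illustrative only; Kafka's real assignors (range, sticky, cooperative-sticky) differ in detail but preserve the same invariant:

```python
def assign_partitions(num_partitions: int, consumers: list[str]) -> dict[str, list[int]]:
    """Toy round-robin assignment: each partition goes to exactly one
    consumer in the group, so no two consumers process the same partition."""
    assignment: dict[str, list[int]] = {c: [] for c in consumers}
    for p in range(num_partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment
```

With 12 partitions and 4 consumers, each consumer owns 3 partitions; a 13th consumer would sit idle, which is why partition count caps group parallelism.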

Data Model

Topic:     topic_name, num_partitions, retention_bytes, retention_ms
Partition: topic_name, partition_id, leader_broker_id, replica_broker_ids
Message:   offset, key, value (bytes), timestamp, headers
Offset:    consumer_group, topic, partition_id, committed_offset

Write Path

Producer → Partition Selection → Leader Broker → Write to Segment File
                                                       ↓
                                             Replicate to Follower Brokers
                                                       ↓
                                             ACK to Producer (after N replicas confirm)

Partition selection: If message has a key: partition = hash(key) % num_partitions — same key always goes to same partition (ordering guarantee for related messages, e.g., all events for user_id=123). If no key: round-robin across partitions (maximizes throughput).

Durability: Producer acks setting (Kafka’s acks config): acks=0 — fire and forget (fastest, can lose messages); acks=1 — leader acknowledges (can lose if leader fails before replicating); acks=all — all in-sync replicas acknowledge (no data loss, highest latency).

Storage: Each partition is a sequence of segment files (each ~1 GB). New messages are appended to the active segment. Sequential disk writes are far faster than random I/O (hundreds of MB/s sequential vs low single-digit MB/s for random writes on spinning disks), which is what lets an append-only log keep pace with the network. Retention: old segments are deleted when size > retention_bytes or age > retention_ms.
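The segment structure can be modeled in a few lines. This is a toy in-memory sketch (real segments are ~1 GB files rolled by the broker); the class and thresholds are illustrative:

```python
class PartitionLog:
    """Toy model of a partition: an ordered list of segments. New messages
    append to the active (last) segment; retention drops whole old segments."""

    def __init__(self, segment_bytes: int = 64):
        self.segments: list[list[tuple[int, bytes]]] = [[]]
        self.segment_sizes = [0]
        self.next_offset = 0
        self.segment_bytes = segment_bytes

    def append(self, value: bytes) -> int:
        if self.segment_sizes[-1] + len(value) > self.segment_bytes:
            self.segments.append([])          # roll a new active segment
            self.segment_sizes.append(0)
        offset = self.next_offset
        self.segments[-1].append((offset, value))
        self.segment_sizes[-1] += len(value)
        self.next_offset += 1
        return offset

    def expire(self, retention_bytes: int) -> None:
        """Delete oldest whole segments while total size exceeds retention,
        always keeping the active segment."""
        while len(self.segments) > 1 and sum(self.segment_sizes) > retention_bytes:
            self.segments.pop(0)
            self.segment_sizes.pop(0)
```

Note that retention deletes whole segments, never individual messages, which is why it is cheap: it is a file unlink, not a compaction pass.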

Read Path

Consumer → Poll(topic, partition, offset) → Broker → Return messages from offset
                ↓
        Process messages
                ↓
        Commit offset (mark progress)
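The loop above, with its commit-after-process ordering, might look like this sketch against an in-memory partition. The `poll` helper and the list-backed log are stand-ins, not a client API:

```python
def poll(log: list[bytes], offset: int, max_messages: int = 100) -> list[bytes]:
    """Fetch up to max_messages starting at the given offset."""
    return log[offset:offset + max_messages]

partition = [b"m0", b"m1", b"m2", b"m3", b"m4"]
processed: list[bytes] = []
committed = 0                       # next offset this consumer will read

while True:
    batch = poll(partition, committed, max_messages=2)
    if not batch:
        break
    for msg in batch:
        processed.append(msg)       # stand-in for real processing work
    committed += len(batch)         # commit only after the batch succeeded
```

Committing after processing is what yields at-least-once semantics: a crash between processing and commit causes a re-read, never a skip.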

Zero-copy read: Kafka uses Linux sendfile() to transfer data from disk directly to the network socket, bypassing application memory. This is 4–6x faster than read-then-write.
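The same zero-copy primitive is available from user space as `os.sendfile`. A minimal, Linux-oriented sketch (the function name and sizes are illustrative):

```python
import os
import socket

def send_segment(sock: socket.socket, path: str, offset: int, count: int) -> int:
    """Push `count` bytes of a segment file to a client socket via sendfile(2).
    The kernel moves pages straight from the page cache to the socket buffer;
    the bytes never pass through this process's user-space memory."""
    with open(path, "rb") as f:
        return os.sendfile(sock.fileno(), f.fileno(), offset, count)
```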

Consumer group rebalancing: When a consumer joins or leaves, the group coordinator (a special broker) triggers a rebalance — reassigns partitions among the active consumers. During rebalancing, consumption pauses briefly. New consumers get partitions at the last committed offset.

Delivery Guarantees

Guarantee      How                                            Tradeoff
At-most-once   Commit offset before processing                May miss messages on crash
At-least-once  Commit offset after successful processing      May reprocess on crash (duplicates possible)
Exactly-once   Idempotent producer + transactional consumer   Higher latency, more complex

At-least-once is the industry default. Handle idempotency in the consumer: track processed message IDs in Redis or DB, skip duplicates on reprocessing.
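A minimal sketch of that consumer-side idempotency check, using an in-memory set where production code would use Redis or a DB unique constraint (the function name and callback are illustrative):

```python
processed_ids: set[str] = set()     # in production: Redis SET or a DB unique constraint

def handle(message_id: str, payload: bytes, apply) -> bool:
    """Apply a message at most once. Returns False when the message is a
    duplicate redelivered under at-least-once semantics."""
    if message_id in processed_ids:
        return False
    apply(payload)                  # the actual side effect (DB write, API call, ...)
    processed_ids.add(message_id)
    return True
```

Note the ordering caveat: if the process crashes between `apply` and recording the ID, the message is still reapplied on restart, so the side effect itself should be idempotent or transactional with the ID write.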

Replication and Fault Tolerance

Each partition has a leader and N-1 followers (replicas). Producers and consumers talk only to the leader. Followers pull from the leader continuously. In-Sync Replicas (ISR) = followers that are caught up within a threshold. If the leader fails, ZooKeeper (or KRaft, Kafka's built-in Raft-based controller in newer versions) elects a new leader from the ISR. With replication factor 3 and min-ISR=2, the cluster survives 1 broker failure without data loss.
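The interaction between the acks setting and min-ISR can be captured as a small decision function. This is a simplified model of the logic, not Kafka's actual code path:

```python
def ack_ready(acks: str, leader_written: bool, isr_confirmed: int,
              isr_size: int, min_isr: int = 2) -> bool:
    """When may the leader acknowledge a produce request?
    Simplified model combining acks with min.insync.replicas."""
    if acks == "0":
        return True                          # fire and forget
    if acks == "1":
        return leader_written                # leader's local write suffices
    # acks=all: ISR must meet the minimum AND every ISR member has the write
    return isr_size >= min_isr and isr_confirmed >= isr_size
```

With RF=3 and min-ISR=2, losing one broker shrinks the ISR to 2 and writes continue; losing a second shrinks it below the minimum and the leader rejects acks=all produces rather than risk data loss.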

Scale Numbers

  • LinkedIn runs Kafka at ~7 trillion messages/day (roughly 80M messages/sec on average)
  • Single broker handles 1–5 GB/s throughput with SSDs
  • 1M messages/sec with 1KB messages = 1 GB/s. Requires ~10 brokers with SSD.
  • Consumer throughput limited by processing speed, not Kafka (Kafka can replay as fast as disk allows)
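The broker estimate above follows from simple arithmetic once replication is counted. The per-broker sustained throughput figure here is an assumption for illustration; real capacity depends on hardware, message size, and compression:

```python
import math

msgs_per_sec = 1_000_000
msg_bytes = 1_000                  # ~1 KB average message
replication_factor = 3             # every byte is written 3x cluster-wide
per_broker_gb_s = 0.3              # assumed sustained write GB/s per SSD broker

ingest_gb_s = msgs_per_sec * msg_bytes / 1e9           # 1.0 GB/s of producer traffic
cluster_write_gb_s = ingest_gb_s * replication_factor  # 3.0 GB/s including replicas
brokers = math.ceil(cluster_write_gb_s / per_broker_gb_s)
```

Under these assumptions the cluster needs about 10 brokers, matching the estimate above; consumer read traffic would add further headroom on top.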

Interview Tips

  • Explain partitions as the unit of parallelism — more partitions = more consumer group parallelism.
  • Mention that partition ordering is guaranteed; topic-level ordering is not (messages across partitions are interleaved).
  • At-least-once vs exactly-once: at-least-once with consumer-side idempotency is the pragmatic choice.
  • Contrast with a traditional queue (RabbitMQ): Kafka retains messages and allows replay; RabbitMQ deletes after consumption. Use Kafka when you need event sourcing, audit trails, or stream processing.
  • Use cases: event streaming, log aggregation, change data capture (CDC), decoupling microservices.

