System Design: Database Sharding — Horizontal Partitioning, Shard Keys, Hotspots, and Resharding

Why Sharding?

A single database server has limits: storage capacity (~tens of terabytes), write throughput (~tens of thousands of writes/second), and connection limits. When a single server cannot handle your data volume or write load, sharding (horizontal partitioning) splits the data across multiple database servers. Each server (shard) holds a subset of the rows. Combined, all shards hold the full dataset.

Contrast with vertical scaling (bigger server) and read replicas (help with reads, not writes). Sharding is the solution for write-heavy workloads that exceed a single node’s capacity.

Shard Key Selection

The shard key determines which shard a row belongs to. This is the most critical design decision. A good shard key has three properties:

  • High cardinality: enough distinct values to distribute data across shards evenly. A boolean column is a terrible shard key.
  • Even distribution: data should land roughly equally on each shard. If 90% of users are in the US and you shard by country, the US shard becomes a hotspot.
  • Query locality: queries that go to a single shard are fast; cross-shard (scatter-gather) queries are expensive. Choose a shard key that co-locates related data.

Common choices: user_id (user-scoped queries stay on one shard), tenant_id (multi-tenant SaaS), and geographic region.

Sharding Strategies

  • Hash-based sharding: shard = hash(shard_key) % num_shards. Pros: near-uniform distribution for high-cardinality keys. Cons: range queries must scan all shards, and changing the shard count naively rehashes (and moves) almost all data.
  • Range-based sharding: shard by ranges of the key (shard 1: user_id 1–1,000,000; shard 2: 1,000,001–2,000,000). Pros: range queries stay on one shard. Cons: monotonically increasing keys (auto-increment IDs) send all writes to the last shard, creating a hotspot.
  • Directory-based sharding: a lookup table maps each shard key to a shard. Pros: flexible; rebalance by updating the lookup table. Cons: the directory is itself a potential single point of failure and bottleneck, so it must be replicated and cached aggressively.
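The hash-based and range-based strategies can be sketched in a few lines of Python. This is a minimal illustration: the shard count and range boundaries are made-up values, not recommendations.

```python
import hashlib

NUM_SHARDS = 4

def hash_shard(key: str) -> int:
    """Hash-based: stable hash of the key, modulo the shard count."""
    # Use a stable hash (not Python's built-in hash(), which is salted
    # per process and would route the same key differently on restart).
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

def range_shard(user_id: int,
                boundaries=(1_000_000, 2_000_000, 3_000_000)) -> int:
    """Range-based: first shard whose upper bound covers the id."""
    for shard, upper in enumerate(boundaries):
        if user_id <= upper:
            return shard
    return len(boundaries)  # last, open-ended shard
```

Note the rehashing con in action: changing NUM_SHARDS changes `hash(key) % num_shards` for almost every key, which is exactly why consistent hashing (covered under Resharding) exists.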

Cross-Shard Queries

Queries that need data from multiple shards require scatter-gather: send the query to all relevant shards and merge the results. Problems:

  • Latency: you query N shards in parallel and wait for the slowest.
  • JOINs across shards are not possible at the database level; they must be done in application code.
  • Aggregations (COUNT, SUM) must be computed per shard and combined in the application.

Design to minimize cross-shard queries: choose the shard key that makes common queries shard-local. For a social app, shard by user_id so a user's profile, posts, and followers live on one shard. Global queries (trending posts) will always require cross-shard scatter-gather; serve them from a separate aggregation layer (pre-computed summaries in Redis).
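A scatter-gather aggregation looks like this sketch. The shard "connections" here are stand-in dictionaries; a real version would issue SELECT COUNT(*) against each shard's database.

```python
from concurrent.futures import ThreadPoolExecutor

def count_on_shard(shard_conn) -> int:
    # Stand-in for running SELECT COUNT(*) on one shard.
    return shard_conn["row_count"]

def global_count(shards) -> int:
    """Scatter the count to every shard in parallel, then sum in the app."""
    with ThreadPoolExecutor(max_workers=max(len(shards), 1)) as pool:
        partials = list(pool.map(count_on_shard, shards))
    return sum(partials)

shards = [{"row_count": 120}, {"row_count": 340}, {"row_count": 95}]
# global_count(shards) -> 555
```

The parallel fan-out keeps total latency close to the slowest single shard rather than the sum of all shards, which is the best you can do for a query that genuinely touches every shard.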

Hotspots

A hotspot is a shard receiving disproportionate traffic. Causes:

  • A popular entity (a celebrity’s account receives 1000x more requests than average).
  • Sequential inserts (all new rows go to the last range-based shard).
  • Time-based queries concentrated on the current period.

Mitigation: add a random suffix to the shard key for popular entities (user_123_0, user_123_1, … user_123_9, spread across 10 shards). Writes pick a suffix at random; reads scatter across all suffixed keys and merge. This adds complexity but eliminates the hotspot.
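The suffix trick is small enough to show directly. A minimal sketch, assuming 10 suffixes and string keys; the key format is illustrative, not a standard.

```python
import random

SUFFIXES = 10  # spread one hot key across 10 logical keys

def write_key(entity_id: str) -> str:
    """Writes pick a random suffix, so write load spreads across shards."""
    return f"{entity_id}_{random.randrange(SUFFIXES)}"

def read_keys(entity_id: str) -> list:
    """Reads must scatter to every suffixed key and merge the results."""
    return [f"{entity_id}_{i}" for i in range(SUFFIXES)]
```

The trade-off is explicit in the code: writes stay a single operation, but every read becomes a 10-way scatter-gather, so only apply the suffix to keys you have measured to be hot.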

Resharding

When a shard fills up or receives too much traffic, split it into two shards. Strategies:

  • Consistent hashing: nodes are placed on a hash ring, and each key belongs to the next node clockwise. Adding a node only requires moving a fraction of keys (~1/N of the total), so data movement is minimal.
  • Pre-sharding: create more shards than you need initially (e.g. 256 logical shards on 4 physical servers, 64 logical per server). When scaling, move whole logical shards to new servers without remapping any keys.
  • Online resharding: copy data to the new shard, verify, switch traffic, then delete from the old shard. Use dual writes (write to both old and new shards during the migration) to avoid data loss.
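Consistent hashing can be sketched as below. This is a deliberately minimal ring: no virtual nodes (a production ring adds many hash points per node for smoother balance), and MD5 is an arbitrary stable-hash choice.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Minimal consistent-hash ring (no virtual nodes, for brevity)."""

    def __init__(self, nodes):
        # Each node gets one point on the ring, sorted by hash value.
        self._ring = sorted((self._hash(n), n) for n in nodes)

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key: str) -> str:
        # First node clockwise from the key's position; wrap to the start.
        points = [h for h, _ in self._ring]
        i = bisect.bisect(points, self._hash(key)) % len(self._ring)
        return self._ring[i][1]
```

The key property: when a node is added, the only keys that change owner are those reassigned to the new node; every other key stays put, unlike `hash(key) % num_shards`, which moves almost everything.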

Interview Tips

  • Do not shard prematurely. Start with a single well-tuned database + read replicas. Sharding adds enormous complexity (no cross-shard transactions, no cross-shard JOINs, complex resharding).
  • Consistent hashing is the standard answer for “how do you handle resharding?” It minimizes data movement when nodes are added or removed.
  • Global unique IDs across shards: auto-increment breaks with multiple shards. Use Snowflake IDs (timestamp + machine ID + sequence) or UUID instead.
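A Snowflake-style generator is compact enough to sketch. The bit layout below (41-bit millisecond timestamp, 10-bit machine ID, 12-bit sequence) follows Twitter's original scheme; the epoch constant and the clock handling are simplified assumptions, and a production version must also handle clocks moving backwards.

```python
import threading
import time

class SnowflakeGenerator:
    """64-bit IDs: 41-bit ms timestamp | 10-bit machine id | 12-bit sequence."""

    EPOCH = 1_288_834_974_657  # custom epoch in ms (Twitter's original value)

    def __init__(self, machine_id: int):
        assert 0 <= machine_id < 1024  # must fit in 10 bits
        self.machine_id = machine_id
        self.sequence = 0
        self.last_ms = -1
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000)
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF  # 4096 ids/ms
                if self.sequence == 0:
                    # Sequence exhausted this millisecond: spin to the next.
                    while now <= self.last_ms:
                        now = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now
            return ((now - self.EPOCH) << 22) \
                | (self.machine_id << 12) \
                | self.sequence
```

Because the timestamp occupies the high bits, IDs from one generator are roughly time-sortable, which is often a useful side effect for pagination and debugging.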

Asked at: Databricks, Uber, Netflix, Cloudflare, Shopify
