Why Sharding?
A single database server has limits: storage capacity (~tens of terabytes), write throughput (~tens of thousands of writes/second), and connection limits. When a single server cannot handle your data volume or write load, sharding (horizontal partitioning) splits the data across multiple database servers. Each server (shard) holds a subset of the rows. Combined, all shards hold the full dataset.
Contrast this with vertical scaling (a bigger server) and read replicas (which help with reads, not writes). Sharding is the answer for write-heavy workloads that exceed a single node's capacity.
Shard Key Selection
The shard key determines which shard a row belongs to. This is the most critical design decision. Good shard key properties:
- High cardinality: enough distinct values to distribute data across shards evenly. A boolean column is a terrible shard key.
- Even distribution: data should be spread roughly equally across shards. If 90% of users are in the US and you shard by country, the US shard is a hotspot.
- Query locality: queries that go to a single shard are fast. Cross-shard queries (scatter-gather) are expensive. Choose a shard key that co-locates related data.
Common choices: user_id (user-scoped queries stay on one shard), tenant_id (multi-tenant SaaS), geographic region.
Sharding Strategies
Hash-based sharding: shard = hash(shard_key) % num_shards. Pros: approximately uniform distribution for high-cardinality keys. Cons: range queries must scan all shards, and adding shards requires rehashing nearly all data.
Range-based sharding: shard by ranges of the key (shard 1: user_id 1-1,000,000; shard 2: 1,000,001-2,000,000). Pros: range queries stay on one shard. Cons: monotonically increasing keys (auto-increment IDs) send all writes to the last shard, creating a hotspot.
Directory-based sharding: a lookup table maps each shard key to a shard. Pros: flexible; rebalancing is just an update to the lookup table. Cons: the directory itself is a single point of failure and must be cached aggressively.
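The first two strategies can be sketched in a few lines. This is illustrative only: the shard count, range size, and use of md5 as the routing hash are assumptions, not any particular system's design. A stable hash like md5 is used because Python's built-in hash() is randomized per process and unsuitable for routing.

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count for illustration


def shard_for(shard_key: str) -> int:
    """Hash-based routing: a stable hash of the key, modulo the shard count."""
    digest = hashlib.md5(shard_key.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS


def shard_for_range(user_id: int, range_size: int = 1_000_000) -> int:
    """Range-based routing: contiguous blocks of user_id per shard
    (shard index 0 holds user_id 1..1,000,000, index 1 the next block)."""
    return (user_id - 1) // range_size


print(shard_for("user_42"))        # some shard in 0..3, stable across runs
print(shard_for_range(1_500_000))  # 1
```

Note how shard_for bakes NUM_SHARDS into every routing decision: changing the shard count remaps nearly every key, which is exactly the weakness consistent hashing addresses below.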
Cross-Shard Queries
Queries that need data from multiple shards require scatter-gather: send the query to all relevant shards and merge the results. Problems:
- Latency: queries fan out to N shards in parallel, and you wait for the slowest.
- JOINs across shards: not possible at the database level; they must be done in application code.
- Aggregations (COUNT, SUM): compute per-shard, then aggregate in the application.
Design to minimize cross-shard queries: choose the shard key that makes common queries shard-local. For a social app, shard by user_id so that user profile, posts, and followers stay on one shard. Global queries (trending posts) will always require cross-shard scatter-gather; serve them from a separate aggregation layer (pre-computed summaries in Redis).
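A minimal sketch of scatter-gather aggregation, with in-memory lists standing in for the per-shard databases (the data and shard layout are invented for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-shard stores: in reality each entry is a separate database.
SHARDS = [
    [{"user_id": 1, "likes": 10}, {"user_id": 5, "likes": 3}],
    [{"user_id": 2, "likes": 7}],
    [{"user_id": 3, "likes": 12}, {"user_id": 7, "likes": 1}],
]


def count_on_shard(rows):
    # Per-shard partial aggregate; in SQL this would be SELECT COUNT(*).
    return len(rows)


def global_count():
    # Scatter: query every shard in parallel. Gather: sum the partials.
    # Total latency is bounded by the slowest shard, not the sum.
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partials = list(pool.map(count_on_shard, SHARDS))
    return sum(partials)


print(global_count())  # 5
```

The same scatter/merge shape applies to SUM, MAX, or top-K queries; only the per-shard function and the merge step change.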
Hotspots
A hotspot is a shard receiving disproportionate traffic. Causes: (1) Popular user (a celebrity’s account receives 1000x more requests than average). (2) Sequential inserts (all new rows go to the last range-based shard). (3) Time-based queries concentrated on the current period. Mitigation: add a random suffix to the shard key for popular entities (user_123_0, user_123_1, … user_123_9 — spread across 10 shards). Scatter reads across all suffixed shards and merge. This adds complexity but eliminates the hotspot.
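The suffix trick above amounts to a pair of helper functions: writes pick a random suffix, and reads enumerate every suffix. The key format and suffix count here are assumptions for illustration.

```python
import random

NUM_SUFFIXES = 10  # spread one hot entity across 10 logical keys


def write_key(user_id: int) -> str:
    # Each write picks a random suffix, so writes for a hot user are
    # spread across NUM_SUFFIXES shard keys instead of one.
    return f"user_{user_id}_{random.randrange(NUM_SUFFIXES)}"


def read_keys(user_id: int) -> list[str]:
    # Reads must scatter across every suffixed key and merge the results.
    return [f"user_{user_id}_{i}" for i in range(NUM_SUFFIXES)]


print(write_key(123))   # e.g. "user_123_7"
print(read_keys(123))   # ["user_123_0", ..., "user_123_9"]
```

The trade-off is explicit in the code: one cheap write path in exchange for a 10-way scatter on every read, so apply it only to entities that are actually hot.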
Resharding
When a shard fills up or receives too much traffic, split it into two shards. Strategies:
- Consistent hashing: nodes are placed on a ring. Adding a node moves only a fraction of keys (roughly 1/N of the total), so data movement is minimal.
- Pre-sharding: create more shards than you need initially (e.g., 256 logical shards on 4 physical servers, 64 logical shards per server). To scale, move whole logical shards to new servers; no keys are remapped.
- Online resharding: copy data to the new shard, verify, switch traffic, then delete from the old shard. Use dual writes (write to both old and new shards during the migration) to avoid data loss.
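A minimal consistent-hash ring with virtual nodes, to make the "only 1/N of keys move" claim concrete. This is a sketch, not production code: node names, the virtual-node count, and md5 as the ring hash are all illustrative choices.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class ConsistentHashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._keys = []    # sorted virtual-node positions on the ring
        self._nodes = {}   # virtual-node position -> physical node
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str):
        # Each physical node occupies `vnodes` positions on the ring,
        # which evens out the share of keyspace each node owns.
        for i in range(self.vnodes):
            h = _hash(f"{node}#{i}")
            bisect.insort(self._keys, h)
            self._nodes[h] = node

    def node_for(self, key: str) -> str:
        # First virtual node clockwise from the key's hash (wrapping around).
        idx = bisect.bisect(self._keys, _hash(key)) % len(self._keys)
        return self._nodes[self._keys[idx]]


ring = ConsistentHashRing(["db1", "db2", "db3"])
before = {f"user_{i}": ring.node_for(f"user_{i}") for i in range(1000)}
ring.add_node("db4")
moved = sum(ring.node_for(k) != before[k] for k in before)
print(f"{moved} of 1000 keys moved")  # roughly a quarter, not almost all
```

Contrast with hash % N routing, where going from 3 to 4 shards would remap about three quarters of all keys.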
Interview Tips
- Do not shard prematurely. Start with a single well-tuned database + read replicas. Sharding adds enormous complexity (no cross-shard transactions, no cross-shard JOINs, complex resharding).
- Consistent hashing is the standard answer for “how do you handle resharding?” It minimizes data movement when nodes are added or removed.
- Global unique IDs across shards: auto-increment breaks with multiple shards. Use Snowflake IDs (timestamp + machine ID + sequence) or UUID instead.
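A sketch of a Snowflake-style generator following the layout above (41-bit timestamp, 10-bit machine ID, 12-bit sequence). The epoch constant and class shape are illustrative; a real implementation also needs to handle clock skew, which is ignored here.

```python
import threading
import time

EPOCH_MS = 1_288_834_974_657  # a fixed custom epoch; any recent one works


class SnowflakeGenerator:
    """Unique, roughly time-ordered 64-bit IDs:
    (timestamp << 22) | (machine_id << 12) | sequence."""

    def __init__(self, machine_id: int):
        assert 0 <= machine_id < 1024  # must fit in 10 bits
        self.machine_id = machine_id
        self.sequence = 0
        self.last_ms = -1
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000)
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF  # 12 bits
                if self.sequence == 0:
                    # 4096 IDs issued this millisecond: wait for the next one.
                    while now <= self.last_ms:
                        now = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now
            return ((now - EPOCH_MS) << 22) | (self.machine_id << 12) | self.sequence


gen = SnowflakeGenerator(machine_id=7)
a, b = gen.next_id(), gen.next_id()
print(a < b)  # True: IDs from one generator are strictly increasing
```

Because the timestamp occupies the high bits, these IDs also sort by creation time, which auto-increment gives you but random UUIDs do not.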
Frequently Asked Questions

How do you choose the right shard key for a social media application?

For a social media app, most queries are user-scoped: get my posts, my followers, my feed. The natural shard key is user_id. This co-locates all data for a user on one shard: profile, posts, follows, likes. Queries for a user's own data never cross shards. Trade-off: queries across users (trending posts, mutual friend suggestions) require scatter-gather. Hot user problem: a celebrity with 100M followers receives far more traffic than average. Mitigation: split hot users across shards by appending a random suffix (celebrity_123_shard_0 through _9), spreading their data across 10 shards; reads scatter across all 10 and merge. Follow relationships between different users do go cross-shard, but this is acceptable if kept in a separate graph store (Neo4j, or a dedicated follows table sharded differently).

How does consistent hashing minimize data movement when adding shards?

Naive hash sharding (hash % N): adding one shard changes N, remapping almost all keys to different shards. Consistent hashing: place shard servers on a ring of hash values (0 to 2^32 - 1). Each key is hashed to a position on the ring and assigned to the next server clockwise. Adding a server remaps only the keys between the new server and its predecessor: roughly 1/N of all keys, not all of them. Virtual nodes: each physical server occupies multiple positions on the ring (100-200 virtual nodes per server), which distributes load more evenly and reduces variance in key distribution. Used by Cassandra and DynamoDB. The key insight: consistent hashing limits data migration to O(K/N) keys when adding one of N nodes, where K is the total number of keys.

How do you handle cross-shard transactions?

Cross-shard transactions (e.g., transferring money between two users on different shards) cannot use a single ACID database transaction. Options: (1) Two-Phase Commit (2PC): a coordinator asks all shards to prepare, then to commit. It provides ACID guarantees but is slow (two round trips), and a coordinator crash can leave shards locked indefinitely. (2) Saga pattern: decompose the transaction into a sequence of local transactions, each with a compensating action. If step 2 fails, run the compensating action for step 1 to undo it. This is eventually consistent: there is a window where the system is in a partially applied state. (3) Avoid cross-shard transactions by design: place related data on the same shard. For money transfers between users on different shards, use an intermediate ledger service that handles the cross-shard coordination.

What is the difference between sharding and partitioning?

Partitioning is a broad term: dividing data into subsets based on some criterion. Sharding is a specific type of horizontal partitioning where each partition (shard) lives on a separate database server (or cluster). Vertical partitioning splits columns across tables (a user_profile table plus a user_activity table for the same user, separating frequently accessed columns from infrequently accessed ones). Horizontal partitioning (sharding) splits rows across servers based on a shard key. Within a single database, table partitioning (PostgreSQL, MySQL) splits a table into physical partitions on the same server for query performance (partition pruning), but does not scale writes across servers. Sharding refers specifically to the multi-server case.

How do you rebalance data when a shard becomes overloaded?

When one shard is a hotspot (too much data or traffic), split it. Steps: (1) Choose a split point within the shard's key range. (2) Create a new empty shard. (3) Copy the data for the migrating key range from the hot shard to the new shard. (4) Use dual writes: write new data to both the old and new shards during migration. (5) Verify data consistency (checksums or row counts). (6) Cut over: update the routing table to point the migrated key range at the new shard. (7) Stop dual writes. (8) Clean up the migrated data from the old shard. During migration, reads are served from both shards (read new, fall back to old). The entire process can be done online with zero downtime. Vitess (for MySQL) and Citus (for PostgreSQL) automate this process.
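The saga pattern described in the cross-shard-transactions answer above reduces to a small control loop: run local transactions in order, and on failure run the compensations for the completed steps in reverse. This sketch uses an in-memory log in place of real per-shard transactions; all names are invented for illustration.

```python
def run_saga(steps):
    """steps: list of (action, compensation) pairs. Runs actions in order;
    on failure, runs compensations for completed steps in reverse order."""
    done = []
    for action, compensate in steps:
        try:
            action()
            done.append(compensate)
        except Exception:
            for undo in reversed(done):
                undo()  # roll back the local transactions that committed
            return False
    return True


log = []


def debit_a():
    log.append("debit A")            # local transaction on A's shard


def undo_debit_a():
    log.append("undo debit A")       # compensating action: refund A


def credit_b():
    raise RuntimeError("shard for user B is down")  # simulated failure


def undo_credit_b():
    log.append("undo credit B")


ok = run_saga([(debit_a, undo_debit_a), (credit_b, undo_credit_b)])
print(ok, log)  # False ['debit A', 'undo debit A']
```

Between the debit committing and the compensation running, the system is visibly in a partially applied state; that window is the price of avoiding 2PC's locks.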