System Design: Top-K / Leaderboard (Heavy Hitters)
Top-K problems appear in many real-world systems: game leaderboards, trending hashtags, most-viewed videos, top search queries, most-active users. The challenge at scale: millions of score updates per second, millions of users querying the leaderboard, and requirements for both real-time and accurate rankings.
Requirements
Functional: Update a user’s score. Get global rank for a user. Get top-K users by score. Get users ranked around a specific user (neighbor ranks). Support millions of users.
Non-functional: Score updates in <10ms, leaderboard query in <50ms, 100K score updates/sec, 50K leaderboard reads/sec, real-time or near-real-time rankings.
Solution 1: Redis Sorted Sets (Recommended)
Redis Sorted Sets are the industry-standard solution for leaderboards. Each member has a floating-point score, and members are kept ordered by score. Rank and range operations are O(log N); score lookup by member is O(1).
import redis

r = redis.Redis(host='localhost', port=6379, decode_responses=True)

LEADERBOARD_KEY = "game:leaderboard"

def update_score(user_id: str, score: float) -> None:
    """Set user's score (ZADD replaces any existing score)."""
    r.zadd(LEADERBOARD_KEY, {user_id: score})

def increment_score(user_id: str, delta: float) -> float:
    """Increment user's score by delta. Returns the new score."""
    return r.zincrby(LEADERBOARD_KEY, delta, user_id)

def get_rank(user_id: str) -> int | None:
    """Get 1-indexed rank (1 = highest score). Returns None if not in leaderboard."""
    rank = r.zrevrank(LEADERBOARD_KEY, user_id)  # ZREVRANK: higher score = lower rank index
    return rank + 1 if rank is not None else None  # 1-indexed for display

def get_top_k(k: int) -> list[tuple[str, float]]:
    """Get top K users with scores. Returns [(user_id, score), ...]."""
    return r.zrevrange(LEADERBOARD_KEY, 0, k - 1, withscores=True)

def get_neighbors(user_id: str, n: int = 5) -> list[tuple[str, float]]:
    """Get up to n users above and below user_id in the ranking."""
    rank = r.zrevrank(LEADERBOARD_KEY, user_id)
    if rank is None:
        return []
    start = max(0, rank - n)
    end = rank + n
    return r.zrevrange(LEADERBOARD_KEY, start, end, withscores=True)
Redis Sorted Set internals: Implemented as a skip list + hash table. Skip list provides O(log N) rank/range queries. Hash table provides O(1) score lookup by member. ZADD, ZRANK, ZINCRBY, ZRANGE are all O(log N). Stores ~50 bytes per member.
Capacity: 100M users × 50 bytes = 5 GB — fits in a single Redis instance. For 1B users: shard by user_id prefix across multiple Redis instances. Each instance handles a subset of users; global rank requires aggregating across shards.
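Shard-by-prefix routing can be sketched as a stable hash over a fixed shard count (the shard count and key format here are illustrative; a production system might prefer consistent hashing so that adding shards does not remap most users):

```python
import hashlib

NUM_SHARDS = 10  # illustrative; fixed once chosen

def shard_for(user_id: str) -> int:
    """Stable shard index for a user. All of a user's score updates
    and rank lookups go to the same shard."""
    digest = hashlib.sha1(user_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS
```

Because the hash is deterministic, any node can route a request without coordination; the cost is that a user's global rank is no longer a single ZREVRANK call, which motivates the fan-out merge shown later.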
Solution 2: Database + Rank Computation
For moderate scale (<10M users), a database approach works:
-- Get user's rank (PostgreSQL)
SELECT COUNT(*) + 1 AS rank
FROM leaderboard
WHERE score > (SELECT score FROM leaderboard WHERE user_id = $1);
-- Top K
SELECT user_id, score, RANK() OVER (ORDER BY score DESC) AS rank
FROM leaderboard
ORDER BY score DESC
LIMIT $1;
Pros: exact ranking and full SQL expressiveness (filter by country, time range, etc.). Cons: the rank query is O(N) without an index. Optimize with: CREATE INDEX idx_score ON leaderboard (score DESC). For “rank of user X”, a good query planner can then scan the index only until score drops below X’s score, making the count O(rank) rather than O(N).
Solution 3: Count-Min Sketch (Approximate Heavy Hitters)
When you don’t need exact rankings and the key space is huge (all URLs ever visited, all search queries), use a Count-Min Sketch — a probabilistic data structure that estimates frequency with bounded error.
import hashlib

class CountMinSketch:
    def __init__(self, width: int = 1000, depth: int = 5):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _hash(self, item: str, seed: int) -> int:
        h = hashlib.md5(f"{seed}:{item}".encode()).hexdigest()
        return int(h, 16) % self.width

    def increment(self, item: str) -> None:
        for i in range(self.depth):
            self.table[i][self._hash(item, i)] += 1

    def estimate(self, item: str) -> int:
        """Minimum count across all rows — the least-overcounted cell."""
        return min(self.table[i][self._hash(item, i)] for i in range(self.depth))
Space: O(width × depth) regardless of the number of distinct items. Error: estimates never undercount; with probability at least 1 − e^(−depth), an estimate overcounts by at most (e / width) × N, where N is the total number of increments. With width=1000 and depth=5 that is an additive error of roughly 0.27% of the stream length with ≈99.3% confidence. Count-Min Sketches are used in trending-topic detection, stream-processing frameworks, and network traffic analysis.
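A sketch alone answers only point queries ("how often did I see X?"); to answer top-K you pair it with a small heap of candidates, updated on every increment. A minimal sketch of that pairing, with the estimator passed in as a callable (in production it would be CountMinSketch.estimate; the demo below uses an exact dict counter as a stand-in so it is self-contained):

```python
import heapq
from collections import Counter

class TopKTracker:
    """Maintains the K items with the highest estimated counts."""

    def __init__(self, k: int, estimate):
        self.k = k
        self.estimate = estimate          # e.g. CountMinSketch.estimate
        self.heap: list[tuple[int, str]] = []  # min-heap of (count, item)
        self.members: set[str] = set()

    def offer(self, item: str) -> None:
        """Call after every increment of `item` in the sketch."""
        count = self.estimate(item)
        if item in self.members:
            # Item already tracked: refresh its count and restore heap order.
            self.heap = [(c, it) if it != item else (count, it)
                         for c, it in self.heap]
            heapq.heapify(self.heap)
        elif len(self.heap) < self.k:
            heapq.heappush(self.heap, (count, item))
            self.members.add(item)
        elif count > self.heap[0][0]:
            # New item beats the current minimum: evict it.
            _, evicted = heapq.heapreplace(self.heap, (count, item))
            self.members.discard(evicted)
            self.members.add(item)

    def top(self) -> list[tuple[str, int]]:
        """Tracked items, highest count first."""
        return [(it, c) for c, it in sorted(self.heap, reverse=True)]

# Demo with an exact counter standing in for the sketch:
counts = Counter()
tracker = TopKTracker(2, lambda item: counts[item])
for item in ["a", "b", "a", "c", "a", "b", "b", "b"]:
    counts[item] += 1
    tracker.offer(item)
```

Because the sketch only overcounts, an item can enter the heap spuriously, but true heavy hitters are never excluded by undercounting, which is why this pairing is the standard approximate top-K construction.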
Sharded Leaderboard for Global Scale
import concurrent.futures
import heapq

# r_shard: list of redis.Redis clients, one per shard (setup not shown)

def get_global_top_k(k: int, num_shards: int = 10) -> list[tuple]:
    """Aggregate top-K across all shards."""
    # Fan out to all shards in parallel
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(r_shard[i].zrevrange, LEADERBOARD_KEY, 0, k - 1, withscores=True)
            for i in range(num_shards)
        ]
        all_results = [f.result() for f in futures]
    # Merge: pool all candidates, keep the global top-K by score
    all_entries = [item for shard_result in all_results for item in shard_result]
    return heapq.nlargest(k, all_entries, key=lambda x: x[1])
Each shard returns its local top-K; the coordinator merges the num_shards × K candidates and keeps the global top-K. This is correct because every member of the global top-K is necessarily in its own shard's local top-K.
Interview Tips
- Redis Sorted Sets are the canonical answer — mention ZADD, ZREVRANK, ZINCRBY, ZREVRANGE.
- Separate score ingestion (high write throughput) from rank queries (read-heavy) — they have different scaling properties.
- Time-windowed leaderboard: use separate sorted sets per time window (daily, weekly). On score update, update all relevant windows. Expire old windows with Redis TTL.
- Ties: Redis ZREVRANK returns a 0-indexed rank; two users with equal scores get adjacent ranks (not equal), ordered lexicographically by member. For explicit tie-handling, pack a composite score such as score * 10^10 + (MAX_INT - join_timestamp) so earlier joiners rank higher on ties; keep the result below 2^53, since Redis scores are IEEE-754 doubles and lose integer precision beyond that.
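The composite-score trick for tie-handling can be sketched in pure Python (the constants are illustrative; this variant uses powers of two so it is easy to check that the packed value stays below Redis's 2^53 exact-integer limit for doubles):

```python
MAX_TS = 2**31 - 1   # illustrative cap on Unix join timestamps
SCALE = 2**32        # leaves room for the tiebreaker below the score

def composite_score(score: int, join_ts: int) -> int:
    """Pack score and an inverted join timestamp into one sortable number.
    Higher score always wins; on equal scores, the earlier joiner
    (smaller timestamp) gets the larger composite value."""
    return score * SCALE + (MAX_TS - join_ts)

# Equal scores: the earlier joiner outranks the later one.
early = composite_score(100, join_ts=1_000)
late = composite_score(100, join_ts=2_000)
# A strictly higher score always dominates the tiebreaker.
higher = composite_score(101, join_ts=9_000)
```

With SCALE = 2^32, scores up to about 2^21 stay exactly representable as doubles; pick the scale to match your score range.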