The Flash Sale Problem
A flash sale offers a limited quantity of items at a steep discount for a short time. The challenge: millions of users simultaneously attempt to purchase a few thousand units. This creates extreme write contention on inventory records, because every request wants to decrement the same counter. The naive approach (decrement in the database for each request) funnels every request through a lock on the same row, which leads to database overload, lock timeouts or deadlocks, and, if the read and the write are not a single atomic step, incorrect inventory counts. Well-known examples: Amazon Lightning Deals, Alibaba’s Double 11 (Singles’ Day), Nike SNKRS launches.
Architecture Overview
The key insight: separate the traffic spike from the actual purchase processing. The design has five layers:
Layer 1: Pre-sale validation (CDN/edge). Serve the flash sale page from the CDN. Only allow purchase attempts after the sale start time (enforced at the edge).
Layer 2: Token gate (Redis). Limit access to the purchase flow using a virtual waiting room or token bucket.
Layer 3: Inventory gate (Redis). Check and reserve inventory atomically in Redis before writing to the database.
Layer 4: Order processing (async queue). Confirmed inventory holders enter a queue; a worker processes orders at a sustainable rate.
Layer 5: Database (source of truth). The final order and inventory update are persisted asynchronously.
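As an illustration of the token bucket option mentioned for Layer 2, here is a minimal in-memory sketch. The class name, field names, and the numbers in the usage lines are assumptions made for this example; a production gate would keep the bucket state in Redis so that all edge servers share it.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """In-memory token bucket (hypothetical helper, not a Redis API)."""
    capacity: float        # maximum burst size
    refill_per_sec: float  # sustained admission rate
    tokens: float = 0.0    # tokens currently available
    last_refill: float = 0.0

    def allow(self, now: float) -> bool:
        """Admit one request if a token is available at time `now` (seconds)."""
        # Credit the tokens earned since the last call, capped at capacity.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Usage: a burst of 5 requests at t=0 against a bucket holding 3 tokens,
# then one more request two seconds later.
bucket = TokenBucket(capacity=3, refill_per_sec=1.0, tokens=3.0)
burst = [bucket.allow(now=0.0) for _ in range(5)]
later = bucket.allow(now=2.0)
```

The bucket lets a small burst through and then caps admissions at the refill rate, which is the property the purchase flow behind it relies on.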
Inventory in Redis (Fast Gate)
Before the sale: pre-load the available quantity into Redis. SET flash:{sale_id}:inventory 10000. On each purchase attempt: use a Lua script for atomic check-and-decrement. Redis runs each Lua script to completion without interleaving other commands, so the check and the decrement cannot race with another request:
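-- Lua script: atomic check-and-decrement
-- KEYS[1] = inventory key, ARGV[1] = quantity requested
local key = KEYS[1]
local qty = tonumber(ARGV[1])
-- GET yields false for a missing key, and tonumber(false) is nil
local current = tonumber(redis.call('GET', key))
if current == nil or current < qty then
  return -1 -- insufficient inventory
end
return redis.call('DECRBY', key, qty) -- new remaining count (0 or greater)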
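If the script returns -1: reject the request immediately (inventory exhausted). If it returns the new count (0 or greater): proceed to order creation. Because each attempt is a single in-memory operation, a single Redis instance can typically absorb on the order of 100,000 such requests per second at sub-millisecond server-side latency, depending on hardware and network. Redis executes commands and scripts on a single thread, which guarantees atomicity without external locking.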
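To make the return-value contract concrete, the following sketch models the gate in plain Python rather than calling Redis. The `InventoryGate` class and `simulate` function are hypothetical names for this example, and a `threading.Lock` stands in for Redis's single-threaded script execution.

```python
import threading


class InventoryGate:
    """In-memory stand-in for the Redis counter plus the Lua script."""

    def __init__(self, quantity: int) -> None:
        self._remaining = quantity
        # The lock plays the role of Redis running one script at a time.
        self._lock = threading.Lock()

    def reserve(self, qty: int = 1) -> int:
        """Return the new remaining count, or -1 if fewer than qty units remain."""
        with self._lock:
            if self._remaining < qty:
                return -1
            self._remaining -= qty
            return self._remaining


def simulate(units: int, buyers: int) -> tuple[int, int]:
    """Fire `buyers` concurrent single-unit attempts; return (accepted, rejected)."""
    gate = InventoryGate(units)
    results: list[int] = []
    results_lock = threading.Lock()

    def attempt() -> None:
        outcome = gate.reserve(1)
        with results_lock:
            results.append(outcome)

    threads = [threading.Thread(target=attempt) for _ in range(buyers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    accepted = sum(1 for r in results if r >= 0)
    rejected = sum(1 for r in results if r == -1)
    return accepted, rejected


accepted, rejected = simulate(units=50, buyers=200)
```

However the threads interleave, exactly as many attempts succeed as there are units, and every other attempt sees -1.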
Virtual Waiting Room
To prevent the purchase flow from being overwhelmed: at sale start, instead of letting all users hit the checkout simultaneously, redirect them to a waiting room. Each user is assigned a random position in the queue (shuffling on arrival prevents bots from gaining an advantage by arriving first). Every T seconds, admit the next batch of N users to the checkout flow. Implementation: Redis sorted set with score = random float (not timestamp). ZADD waitroom:{sale_id} RANDOM_SCORE user_id. Admit users: ZPOPMIN waitroom:{sale_id} N every T seconds (ZPOPMIN takes the count as a plain argument, with no COUNT keyword). The admitted users receive a time-limited session token (30 minutes to complete checkout). The token gates access to the checkout page: without a valid token, checkout redirects back to the waiting room.
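The waiting room mechanics can be sketched without Redis as follows. This is an in-memory model under stated assumptions: a heap ordered by random score stands in for the sorted set, `WaitingRoom` and its methods are hypothetical names, and ignoring a repeat join is a choice made here (roughly what ZADD with the NX option would give) so that a page refresh does not re-roll a user's position.

```python
import heapq
import random
import secrets


class WaitingRoom:
    """In-memory model of the random-score waiting room (not a Redis client)."""

    def __init__(self, token_ttl_sec: float = 1800.0, seed=None) -> None:
        self._heap: list[tuple[float, str]] = []  # (random score, user_id)
        self._members: set[str] = set()           # users who have already joined
        self._tokens: dict[str, tuple[str, float]] = {}  # token -> (user, expiry)
        self._rng = random.Random(seed)
        self._ttl = token_ttl_sec

    def join(self, user_id: str) -> None:
        """Assign a random position; repeat joins are ignored (assumption)."""
        if user_id in self._members:
            return
        self._members.add(user_id)
        heapq.heappush(self._heap, (self._rng.random(), user_id))

    def admit(self, batch_size: int, now: float) -> dict[str, str]:
        """Pop up to batch_size lowest scores; return user_id -> session token."""
        admitted: dict[str, str] = {}
        for _ in range(min(batch_size, len(self._heap))):
            _, user_id = heapq.heappop(self._heap)
            token = secrets.token_urlsafe(16)
            self._tokens[token] = (user_id, now + self._ttl)
            admitted[user_id] = token
        return admitted

    def is_valid(self, token: str, now: float) -> bool:
        """Checkout accepts a token only if it exists and has not expired."""
        entry = self._tokens.get(token)
        return entry is not None and now < entry[1]

    def waiting(self) -> int:
        return len(self._heap)


room = WaitingRoom(seed=7)
for i in range(10):
    room.join(f"user-{i}")
room.join("user-0")  # duplicate join, ignored
first_batch = room.admit(batch_size=4, now=0.0)
sample_token = next(iter(first_batch.values()))
```

Calling `admit` on a timer reproduces the "every T seconds, admit N users" loop, and `is_valid` is the check the checkout page would run before rendering.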
Async Order Processing
Users who pass the Redis inventory gate receive an order reference number. Their purchase request is published to a Kafka topic. Workers consume from the topic and finalize each order: validate payment, decrement database inventory, create order records. This decouples the burst of incoming purchase requests from the database write throughput. Workers process at a steady rate (e.g., 1,000 orders/second). Back-pressure: if the backlog on the topic (consumer lag) grows beyond a threshold, admit fewer users from the waiting room. Database inventory update: UPDATE inventory SET quantity = quantity - 1 WHERE sale_id = :id AND quantity > 0. If 0 rows are updated (database inventory reached 0 while Redis was still handing out units, for example because the Redis counter had drifted above the database count): mark the order as failed, refund the customer, send an apology notification.
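The conditional update and the "0 rows updated" branch can be tried out with SQLite from the Python standard library. This is a minimal sketch: the table layout, the `finalize_order` name, and the status strings are assumptions for the example, and payment validation and refunds are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (sale_id TEXT PRIMARY KEY, quantity INTEGER NOT NULL)"
)
conn.execute(
    "CREATE TABLE orders (order_ref TEXT PRIMARY KEY, sale_id TEXT NOT NULL, "
    "status TEXT NOT NULL)"
)
conn.execute("INSERT INTO inventory VALUES ('sale-1', 2)")
conn.commit()


def finalize_order(conn: sqlite3.Connection, order_ref: str, sale_id: str) -> str:
    """Decrement inventory if any is left and record the order outcome."""
    with conn:  # one transaction: commits on success, rolls back on exception
        cur = conn.execute(
            "UPDATE inventory SET quantity = quantity - 1 "
            "WHERE sale_id = ? AND quantity > 0",
            (sale_id,),
        )
        # rowcount is 0 when the WHERE clause matched nothing (sold out).
        status = "confirmed" if cur.rowcount == 1 else "failed"
        conn.execute(
            "INSERT INTO orders VALUES (?, ?, ?)", (order_ref, sale_id, status)
        )
    return status


# Three orders reach the worker, but the database only has two units.
statuses = [finalize_order(conn, f"ref-{i}", "sale-1") for i in range(3)]
remaining = conn.execute(
    "SELECT quantity FROM inventory WHERE sale_id = 'sale-1'"
).fetchone()[0]
```

The third order takes the failure branch, and the quantity stops at zero instead of going negative, because the guard lives in the WHERE clause rather than in application code.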
Preventing Oversell
Three layers prevent oversell: (1) Redis Lua atomic decrement (fast gate — rejects when Redis counter reaches 0). (2) Database conditional update (quantity > 0 check) — final gate even if Redis and DB drift slightly. (3) Pre-load Redis with slightly fewer units than available (e.g., 9,800 of 10,000) to create a buffer for any Redis-DB sync discrepancy. Monitoring: alert if database inventory goes negative (bug indicator). Track the Redis-DB delta after each sale to tune the pre-load factor.
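The pre-load buffer and the post-sale delta are simple arithmetic. A small sketch, using hypothetical helper names and an integer percentage to avoid floating-point rounding:

```python
def preload_quantity(available: int, buffer_percent: int) -> int:
    """Units to load into Redis, holding back buffer_percent of the stock."""
    if not 0 <= buffer_percent < 100:
        raise ValueError("buffer_percent must be in [0, 100)")
    return available * (100 - buffer_percent) // 100


def redis_db_delta(
    preloaded: int, redis_remaining: int, db_initial: int, db_remaining: int
) -> int:
    """Units Redis handed out minus units the database actually sold."""
    redis_reserved = preloaded - redis_remaining
    db_sold = db_initial - db_remaining
    return redis_reserved - db_sold


# 10,000 units in stock with a 2% buffer gives the 9,800 from the text.
preload = preload_quantity(available=10_000, buffer_percent=2)

# Illustrative post-sale numbers (made up for the example): Redis sold out,
# but 5 reservations never became database sales.
delta = redis_db_delta(
    preloaded=preload, redis_remaining=0, db_initial=10_000, db_remaining=205
)
```

A delta that stays small across sales suggests the buffer can shrink; a negative delta would mean the database sold more than Redis reserved and deserves the same scrutiny as negative inventory.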