Why Distributed Locks?
A mutex works within a single process. A database row lock works within a single database. When multiple application servers must coordinate access to a shared resource (run a cron job on exactly one server, prevent two checkouts from reserving the same inventory), you need a distributed lock — one that works across servers and data centers.
Redis-based Locking (Simple)
Acquire the lock with a single command: SET lock:{resource} {unique_id} NX PX {ttl_ms}. NX sets the key only if it does not already exist, which makes acquisition atomic. PX {ttl_ms} expires the key automatically, so a holder that crashes cannot keep the lock forever (prevents deadlock). The unique_id (a UUID) identifies the lock owner and is critical for safe release.
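A minimal Python sketch of the acquisition step. It assumes a redis-py style client whose set() accepts nx and px keyword arguments; the function name try_acquire is hypothetical and chosen only for illustration.

```python
import uuid
from typing import Optional


def try_acquire(client, resource: str, ttl_ms: int) -> Optional[str]:
    """Try once to take the lock; return the owner ID on success, else None."""
    # A fresh ID per acquisition lets the holder prove ownership at release time.
    owner_id = str(uuid.uuid4())
    # With nx=True, redis-py's set() returns True if the key was written
    # and None if the key already existed.
    acquired = client.set(f"lock:{resource}", owner_id, nx=True, px=ttl_ms)
    return owner_id if acquired else None
```

The caller keeps the returned owner ID for the lifetime of the lock; a None result means another holder currently owns the key and the caller should back off and retry.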
Safe release requires an atomic check-and-delete: delete the key only if its current value matches the owner's UUID. Without this check, a slow process can release a lock that belongs to someone else: its own lock expires, another process acquires the lock, and then the slow process deletes the new holder's key. Running the check and the delete as a Lua script makes them atomic, because Redis executes a script without interleaving other commands.
```lua
-- Lua script for safe release
-- KEYS[1] = lock key, ARGV[1] = owner UUID
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
```
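The release script can be invoked from application code along these lines. This is a sketch that assumes a redis-py style client exposing eval(script, numkeys, *keys_and_args); the names release and RELEASE_SCRIPT are illustrative, not part of any library.

```python
RELEASE_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
"""


def release(client, resource: str, owner_id: str) -> bool:
    """Delete the lock only if it is still owned by owner_id."""
    # One key (the lock) and one argument (the owner's ID) are passed to the script.
    deleted = client.eval(RELEASE_SCRIPT, 1, f"lock:{resource}", owner_id)
    return deleted == 1
```

A False result tells the caller that its lock had already expired or been taken over, which is worth logging because the protected work may have overlapped with another holder.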
Redlock Algorithm
The simple Redis lock fails if the Redis master crashes before the lock key is replicated: the promoted replica has no record of the lock, so a second client can acquire it. Redlock uses N independent Redis instances (N=5 is typical). To acquire: try to SET the lock on all N instances with the same key and unique value. If a majority (floor(N/2)+1 = 3 for N=5) succeed and the time spent acquiring is less than the lock TTL, the lock is acquired. The remaining validity time is the TTL minus the acquisition time. If acquisition fails, release on all N instances; a normal release also goes to all N instances. Properties: tolerates up to (N-1)/2 = 2 instance failures. The main criticism concerns its timing assumptions: Martin Kleppmann argued that clock drift and process pauses make Redlock unsafe in theory; in practice it works for most use cases with careful TTL settings.
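The acquisition rule can be sketched in Python as follows. The instances are assumed to be redis-py style clients (set() with nx/px, eval() for the release script). The drift allowance of 1% of the TTL plus 2 ms is an assumption borrowed from common Redlock implementations, and redlock_acquire, quorum, and now_ms are hypothetical names.

```python
import time
import uuid
from typing import Callable, Optional, Sequence, Tuple

RELEASE_SCRIPT = (
    'if redis.call("get", KEYS[1]) == ARGV[1] then '
    'return redis.call("del", KEYS[1]) else return 0 end'
)


def quorum(n: int) -> int:
    """Smallest strict majority of n instances."""
    return n // 2 + 1


def redlock_acquire(
    instances: Sequence,
    resource: str,
    ttl_ms: int,
    now_ms: Callable[[], float] = lambda: time.monotonic() * 1000,
) -> Optional[Tuple[str, float]]:
    """Return (owner_id, remaining_validity_ms) on success, else None."""
    owner_id = str(uuid.uuid4())
    key = f"lock:{resource}"
    start = now_ms()
    successes = 0
    for inst in instances:
        try:
            if inst.set(key, owner_id, nx=True, px=ttl_ms):
                successes += 1
        except Exception:
            pass  # an unreachable instance counts as a failed vote
    elapsed = now_ms() - start
    drift = ttl_ms * 0.01 + 2  # assumed clock-drift allowance, in ms
    validity = ttl_ms - elapsed - drift
    if successes >= quorum(len(instances)) and validity > 0:
        return owner_id, validity
    # No majority (or no time left): undo any partial acquisition everywhere.
    for inst in instances:
        try:
            inst.eval(RELEASE_SCRIPT, 1, key, owner_id)
        except Exception:
            pass
    return None
```

A production client would also put a short per-instance timeout on each SET so that one slow instance cannot consume the whole TTL; that detail is omitted here to keep the sketch short.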
ZooKeeper / etcd-based Locking
ZooKeeper uses ephemeral sequential nodes. To acquire: create an ephemeral sequential node with the path prefix /locks/mylock/guid-; ZooKeeper appends an increasing sequence number to the name. List all children of /locks/mylock; if your node has the lowest sequence number, you have the lock. If not, watch the node with the next-lower sequence number and recheck when it is deleted. Watching only the predecessor means a release wakes one waiter rather than all of them. ZooKeeper guarantees linearizable writes and manages sessions: ephemeral nodes are deleted when the session expires, so the lock is released automatically if the holder crashes. etcd uses leases (TTL-based) and compare-and-swap operations for similar semantics. Both are more heavyweight than Redis but offer stronger consistency guarantees.
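The "lowest sequence number wins, otherwise watch your predecessor" decision can be sketched as a pure function. The sketch assumes child names end in a hyphen followed by the numeric sequence suffix; check_lock and sequence_of are hypothetical helpers, and the actual create, list, and watch calls to ZooKeeper are left out.

```python
from typing import List, Optional, Tuple


def sequence_of(node_name: str) -> int:
    """Extract the numeric suffix appended to a sequential node name."""
    return int(node_name.rsplit("-", 1)[1])


def check_lock(children: List[str], my_node: str) -> Tuple[bool, Optional[str]]:
    """Return (True, None) if my_node holds the lock, else (False, node_to_watch)."""
    ordered = sorted(children, key=sequence_of)
    position = ordered.index(my_node)
    if position == 0:
        return True, None
    # Not first in line: watch only the node immediately ahead of us.
    return False, ordered[position - 1]
```

After the watched node is deleted, the client lists the children again and calls the function once more, since the predecessor may have disappeared because of a crash rather than a release.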
Database-based Locking
Acquire by inserting a row: INSERT INTO distributed_locks (resource, owner, expires_at) VALUES (X, Y, NOW() + INTERVAL '30 seconds'). A unique constraint on resource prevents duplicate locks, so a constraint violation means another holder has the lock. Heartbeat: the holder updates expires_at every 10 seconds to prevent expiry while active. To release: DELETE FROM distributed_locks WHERE resource = X AND owner = Y. Expired locks must be cleaned up, because an expired row still blocks the insert: DELETE FROM distributed_locks WHERE expires_at < NOW(), run periodically or at the start of the next acquisition attempt. Compared with Redis, this approach is simpler and needs no extra infrastructure, at the cost of higher latency (a database round trip versus an in-memory Redis operation). Good for applications already running on a database and for infrequent lock acquisitions.
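A runnable sketch of the same table-based scheme, using Python's built-in sqlite3 module purely so the example runs without a server. Timestamps are passed in as plain numbers (the now parameter) instead of calling NOW(), and the function names are hypothetical.

```python
import sqlite3


def create_table(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS distributed_locks ("
        "resource TEXT PRIMARY KEY, owner TEXT NOT NULL, expires_at REAL NOT NULL)"
    )


def acquire(conn, resource: str, owner: str, now: float, ttl_s: float = 30.0) -> bool:
    try:
        with conn:  # commits on success, rolls back on error
            # Clear an expired lock for this resource before trying to insert.
            conn.execute(
                "DELETE FROM distributed_locks WHERE resource = ? AND expires_at < ?",
                (resource, now),
            )
            conn.execute(
                "INSERT INTO distributed_locks (resource, owner, expires_at) "
                "VALUES (?, ?, ?)",
                (resource, owner, now + ttl_s),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the key already exists: another owner holds the lock


def heartbeat(conn, resource: str, owner: str, now: float, ttl_s: float = 30.0) -> bool:
    with conn:
        cur = conn.execute(
            "UPDATE distributed_locks SET expires_at = ? "
            "WHERE resource = ? AND owner = ?",
            (now + ttl_s, resource, owner),
        )
    return cur.rowcount == 1  # False means the lock was lost


def release(conn, resource: str, owner: str) -> bool:
    with conn:
        cur = conn.execute(
            "DELETE FROM distributed_locks WHERE resource = ? AND owner = ?",
            (resource, owner),
        )
    return cur.rowcount == 1
```

Filtering on owner in both heartbeat and release plays the same role as the UUID check in the Redis release script: a holder whose lock was taken over cannot extend or delete the new holder's row.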
Fencing Tokens
Even with a perfect distributed lock, two processes can believe they hold the lock at the same time: a process acquires the lock, pauses (GC, network partition), and its lock expires; another process then acquires the lock while the first is still paused. Solution: fencing tokens. On each lock acquisition, the lock service returns a monotonically increasing token. The holder includes the token with every write to the protected resource, and the resource rejects writes with tokens older than the last accepted token. Once a newer token has been accepted, only the process with the highest token can successfully write, regardless of clock drift or pauses.
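A small in-memory sketch of the token check. LockService and FencedStore are hypothetical classes: the first stands in for whatever lock service issues tokens (mutual exclusion itself is not modeled), the second for the protected resource.

```python
import itertools


class LockService:
    """Hands out a strictly increasing fencing token with every acquisition."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)

    def acquire(self) -> int:
        return next(self._counter)


class FencedStore:
    """Protected resource that rejects writes carrying a stale token."""

    def __init__(self) -> None:
        self.last_token = 0
        self.value = None

    def write(self, token: int, value) -> bool:
        if token < self.last_token:
            return False  # an older holder woke up after its lock expired
        self.last_token = token
        self.value = value
        return True
```

The important design point is that the check lives in the resource, not in the client: a paused client cannot know its lock has expired, but the store can always compare tokens.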
When to Use Each
| Mechanism | Latency | Reliability | Complexity | Use When |
|---|---|---|---|---|
| Simple Redis SET NX | Very Low | Medium | Low | Single Redis, tolerate rare failures |
| Redlock | Low | High | Medium | Multi-datacenter, high reliability needed |
| ZooKeeper / etcd | Medium | Very High | High | Critical coordination, strong consistency |
| Database lock | Medium | High | Low | Simple use case, DB already available |