Designing a hotel or home-sharing booking system like Airbnb is a comprehensive system design problem that combines availability calendars, concurrent booking prevention, search with geospatial queries, pricing engines, and payment workflows.
Core Requirements
Functional: Hosts list properties with pricing and availability. Guests search by location/dates/guests. Book a property (with payment). Cancel bookings (with refund policy). Messaging between hosts and guests. Reviews after stay.
Non-functional: 100M properties, 500M users, 1M bookings/day. Search < 500ms p99. No double-booking. Pricing updates in real-time. 99.99% uptime for booking flow.
Availability and Double-Booking Prevention
Core problem: two guests book the same property on overlapping dates
Naive approach (WRONG — race condition):
1. Guest A: check if Oct 1-5 available → YES
2. Guest B: check if Oct 3-7 available → YES
3. Guest A: create booking Oct 1-5 → SUCCESS
4. Guest B: create booking Oct 3-7 → SUCCESS (double-booked!)
Solution: optimistic locking with version counter
properties table:
id UUID PRIMARY KEY
host_id UUID
name TEXT
...
property_availability table:
property_id UUID
date DATE
status ENUM(available, booked, blocked)
booking_id UUID NULL
version INT DEFAULT 0
PRIMARY KEY (property_id, date)
Booking flow (atomic):
BEGIN;
-- Lock all dates in range (sorted to prevent deadlock)
SELECT * FROM property_availability
WHERE property_id = ? AND date BETWEEN ? AND ?
FOR UPDATE; -- row-level locks on each date
-- Verify all dates are available
IF any row has status != 'available': ROLLBACK; return 409
-- Create booking record
INSERT INTO bookings (id, property_id, guest_id, check_in, check_out, status)
VALUES (?, ?, ?, ?, ?, 'pending');
-- Mark dates as booked
UPDATE property_availability
SET status='booked', booking_id=?
WHERE property_id=? AND date BETWEEN ? AND ?;
COMMIT; -- releases all locks
Availability Calendar Design
Date-per-row model (recommended for flexibility):
One row per (property, date)
5 years × 365 days × 100M properties = 182B rows (use Cassandra or sharding)
Range model (compact, harder to query):
blocked_dates: (property_id, start_date, end_date, status)
Overlap check: start_date requested_start
Problem: updating a sub-range requires splitting existing ranges
Recommendation: Date-per-row model with Cassandra
Partition key: property_id
Clustering key: date
Supports: fast range scans, atomic date updates, O(1) availability check per date
Redis cache for availability (hot properties):
Key: avail:{property_id}:{year_month}
Value: bitmap where bit N = availability of day N
BITCOUNT to count available days in a month
GETBIT to check a specific date
TTL: 5 minutes (refresh on booking/cancellation)
Search Architecture
Search requirements: location + dates + guests + filters
"Find properties in San Francisco for 2 guests, Oct 1-5, with pool"
Two-phase search:
Phase 1: Geospatial candidate retrieval (fast)
Elasticsearch geo_distance query:
center: {lat: 37.7749, lon: -122.4194}
radius: 50km
filter: guests_capacity >= 2, amenities contains pool
Returns: ~1000 candidate property_ids
Phase 2: Availability filter (targeted)
For each candidate property_id, check availability table:
SELECT COUNT(*) FROM property_availability
WHERE property_id IN (candidates)
AND date BETWEEN Oct 1 AND Oct 5
AND status = 'available'
Keep only properties where COUNT = 5 (all nights available)
Phase 3: Ranking (ML model)
Score by: relevance, host rating, price, personalization
Return top 50 results
Geospatial indexing:
Elasticsearch: built-in geo_point field, geo_distance aggregations
PostgreSQL: PostGIS extension, ST_DWithin for radius queries
Alternative: H3 hexagonal cells for region-based partitioning
Pricing Engine
Dynamic pricing factors:
Base price: set by host
Demand boost: events nearby (concert +20%), holidays (+30%)
Supply factor: few properties available in area (+15%)
Lead time: last-minute discounts (-10%)
Length of stay: weekly discount (-5%), monthly discount (-20%)
Seasonal: summer peak in beach cities
Price calculation:
nightly_price = base_price
× demand_multiplier(location, date)
× length_of_stay_discount(nights)
× last_minute_factor(days_until_checkin)
total_price = sum(nightly_prices) + cleaning_fee + service_fee
Service fee: Airbnb takes ~3% from host + ~13% from guest
Real-time price display:
Price API: GET /properties/{id}/price?checkin=...&checkout=...
→ Computes per-night breakdown
→ Cached per (property, date_range) for 60 seconds
→ Invalidated when host changes base price
Booking State Machine
States:
PENDING → CONFIRMED → ACTIVE → COMPLETED
↓
CANCELLED (at any pre-active stage)
PENDING: payment authorized (not captured), awaiting host confirmation
CONFIRMED: host accepted, payment captured
ACTIVE: guest has checked in
COMPLETED: stay complete, reviews unlocked
CANCELLED: cancelled by guest or host; refund per policy
Instant Book (no host confirmation needed):
PENDING → CONFIRMED immediately
Payment captured synchronously
Refund policy variants:
Flexible: full refund if cancelled 24h before check-in
Moderate: full refund if cancelled 5 days before
Strict: 50% refund if cancelled 14 days before
Non-refundable: no refund
Cancellation flow:
1. Calculate refund amount per policy
2. Stripe refund via refunds.create() (async, 5-10 days)
3. Mark dates available again
4. Notify host via email/push
Database Design
bookings table:
id UUID PRIMARY KEY
property_id UUID NOT NULL
guest_id UUID NOT NULL
host_id UUID NOT NULL -- denormalized for fast host queries
check_in DATE NOT NULL
check_out DATE NOT NULL -- exclusive end date
guests_count INT
status TEXT DEFAULT 'pending'
total_cents BIGINT
currency CHAR(3) DEFAULT 'USD'
payment_intent_id TEXT -- Stripe PaymentIntent
created_at TIMESTAMPTZ DEFAULT NOW()
INDEX (guest_id, created_at DESC)
INDEX (host_id, check_in DESC)
INDEX (property_id, status, check_in)
properties table:
id UUID PRIMARY KEY
host_id UUID
name TEXT
description TEXT
lat DOUBLE PRECISION
lng DOUBLE PRECISION
guests_max INT
bedrooms INT
bathrooms NUMERIC(2,1)
amenities JSONB -- {pool: true, wifi: true, ...}
base_price_cents INT
cleaning_fee_cents INT
instant_book BOOLEAN DEFAULT FALSE
status TEXT DEFAULT 'active' -- active, suspended, deactivated
Interview Discussion Points
- Why SELECT FOR UPDATE instead of optimistic locking? Date-level availability requires locking all dates in the requested range atomically. Optimistic locking (check-and-update with version) works well for single-row updates but becomes complex for multi-row ranges (need to check N versions simultaneously). SELECT FOR UPDATE with sorted date locks is simpler and prevents deadlocks when two bookings compete for overlapping date ranges.
- How to handle Airbnb’s split payment model? Guests pay immediately (via Stripe Payment Intent authorize-only), hosts receive payment 24 hours after check-in. The platform holds the funds in escrow (Stripe Connect platform account). This protects both parties: if the host cancels last-minute, the guest gets a full refund; if the guest no-shows, the host gets paid per the cancellation policy.
- Search scalability: Elasticsearch handles the geospatial + filter query. The availability check runs against Cassandra (fast range reads by property_id + date). At 100M properties with 365 days each, a single Cassandra cluster with 20 nodes handles the availability reads at < 5ms p99.
{
“@context”: “https://schema.org”,
“@type”: “FAQPage”,
“mainEntity”: [
{
“@type”: “Question”,
“name”: “How do you prevent double-booking in a hotel reservation system?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “Double-booking is prevented by using database row-level locking on the availability records for each requested date. When a guest attempts to book, the system begins a database transaction and issues SELECT FOR UPDATE on all availability rows for the property and date range (sorted by date to prevent deadlock). This blocks any concurrent booking attempting to lock the same rows. After verifying all dates show status=’available’, the system inserts the booking record and updates the availability rows to status=’booked’ atomically within the transaction. If any date is already booked, the transaction is rolled back and the guest receives a 409 Conflict response. This guarantees exactly-once booking even under high concurrent load.”
}
},
{
“@type”: “Question”,
“name”: “How does the availability calendar work in an Airbnb-style system at scale?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “The availability calendar uses a date-per-row model: one database row per (property_id, date) storing the availability status. This enables fast range scans (check availability for a 7-night stay = 7 row reads), atomic date-level updates (lock only the dates being booked), and simple status transitions. For 100M properties u00d7 365 days u00d7 5 years = 182B rows, a wide-column database like Cassandra or DynamoDB is used with property_id as the partition key and date as the clustering key. A Redis bitmap cache (1 bit per day per property, TTL 5 minutes) provides instant availability checks without database reads for popular properties u2014 invalidated on every booking or cancellation.”
}
},
{
“@type”: “Question”,
“name”: “How does geospatial search work in a property booking platform?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “Property search uses a two-phase approach. Phase 1 (candidate retrieval): an Elasticsearch cluster with geo_point indexed fields handles “find all properties within 50km of San Francisco with capacity >= 2 and pool amenity” u2014 this geospatial + filter query returns ~1,000 candidate property IDs in < 100ms. Phase 2 (availability filter): for each candidate property, the system checks the availability database to verify all requested dates are open. This targeted lookup (1,000 properties u00d7 7 nights = 7,000 row reads) runs in parallel and takes ~50ms. Results are then scored by a ranking model (price, rating, host acceptance rate, personalization) and the top 50 are returned to the client."
}
}
]
}