Designing a hotel or home-sharing booking system like Airbnb is a comprehensive system design problem that combines availability calendars, concurrent booking prevention, search with geospatial queries, pricing engines, and payment workflows.
Core Requirements
Functional: Hosts list properties with pricing and availability. Guests search by location/dates/guests. Book a property (with payment). Cancel bookings (with refund policy). Messaging between hosts and guests. Reviews after stay.
Non-functional: 100M properties, 500M users, 1M bookings/day. Search < 500ms p99. No double-booking. Pricing updates in real-time. 99.99% uptime for booking flow.
Availability and Double-Booking Prevention
Core problem: two guests book the same property on overlapping dates
Naive approach (WRONG — race condition):
1. Guest A: check if Oct 1-5 available → YES
2. Guest B: check if Oct 3-7 available → YES
3. Guest A: create booking Oct 1-5 → SUCCESS
4. Guest B: create booking Oct 3-7 → SUCCESS (double-booked!)
Solution: optimistic locking with version counter
properties table:
id UUID PRIMARY KEY
host_id UUID
name TEXT
...
property_availability table:
property_id UUID
date DATE
status ENUM(available, booked, blocked)
booking_id UUID NULL
version INT DEFAULT 0
PRIMARY KEY (property_id, date)
Booking flow (atomic):
BEGIN;
-- Lock all dates in range (sorted to prevent deadlock)
SELECT * FROM property_availability
WHERE property_id = ? AND date BETWEEN ? AND ?
FOR UPDATE; -- row-level locks on each date
-- Verify all dates are available
IF any row has status != 'available': ROLLBACK; return 409
-- Create booking record
INSERT INTO bookings (id, property_id, guest_id, check_in, check_out, status)
VALUES (?, ?, ?, ?, ?, 'pending');
-- Mark dates as booked
UPDATE property_availability
SET status='booked', booking_id=?
WHERE property_id=? AND date BETWEEN ? AND ?;
COMMIT; -- releases all locks
Availability Calendar Design
Date-per-row model (recommended for flexibility):
One row per (property, date)
5 years × 365 days × 100M properties = 182B rows (use Cassandra or sharding)
Range model (compact, harder to query):
blocked_dates: (property_id, start_date, end_date, status)
Overlap check: start_date requested_start
Problem: updating a sub-range requires splitting existing ranges
Recommendation: Date-per-row model with Cassandra
Partition key: property_id
Clustering key: date
Supports: fast range scans, atomic date updates, O(1) availability check per date
Redis cache for availability (hot properties):
Key: avail:{property_id}:{year_month}
Value: bitmap where bit N = availability of day N
BITCOUNT to count available days in a month
GETBIT to check a specific date
TTL: 5 minutes (refresh on booking/cancellation)
Search Architecture
Search requirements: location + dates + guests + filters
"Find properties in San Francisco for 2 guests, Oct 1-5, with pool"
Two-phase search:
Phase 1: Geospatial candidate retrieval (fast)
Elasticsearch geo_distance query:
center: {lat: 37.7749, lon: -122.4194}
radius: 50km
filter: guests_capacity >= 2, amenities contains pool
Returns: ~1000 candidate property_ids
Phase 2: Availability filter (targeted)
For each candidate property_id, check availability table:
SELECT COUNT(*) FROM property_availability
WHERE property_id IN (candidates)
AND date BETWEEN Oct 1 AND Oct 5
AND status = 'available'
Keep only properties where COUNT = 5 (all nights available)
Phase 3: Ranking (ML model)
Score by: relevance, host rating, price, personalization
Return top 50 results
Geospatial indexing:
Elasticsearch: built-in geo_point field, geo_distance aggregations
PostgreSQL: PostGIS extension, ST_DWithin for radius queries
Alternative: H3 hexagonal cells for region-based partitioning
Pricing Engine
Dynamic pricing factors:
Base price: set by host
Demand boost: events nearby (concert +20%), holidays (+30%)
Supply factor: few properties available in area (+15%)
Lead time: last-minute discounts (-10%)
Length of stay: weekly discount (-5%), monthly discount (-20%)
Seasonal: summer peak in beach cities
Price calculation:
nightly_price = base_price
× demand_multiplier(location, date)
× length_of_stay_discount(nights)
× last_minute_factor(days_until_checkin)
total_price = sum(nightly_prices) + cleaning_fee + service_fee
Service fee: Airbnb takes ~3% from host + ~13% from guest
Real-time price display:
Price API: GET /properties/{id}/price?checkin=...&checkout=...
→ Computes per-night breakdown
→ Cached per (property, date_range) for 60 seconds
→ Invalidated when host changes base price
Booking State Machine
States:
PENDING → CONFIRMED → ACTIVE → COMPLETED
↓
CANCELLED (at any pre-active stage)
PENDING: payment authorized (not captured), awaiting host confirmation
CONFIRMED: host accepted, payment captured
ACTIVE: guest has checked in
COMPLETED: stay complete, reviews unlocked
CANCELLED: cancelled by guest or host; refund per policy
Instant Book (no host confirmation needed):
PENDING → CONFIRMED immediately
Payment captured synchronously
Refund policy variants:
Flexible: full refund if cancelled 24h before check-in
Moderate: full refund if cancelled 5 days before
Strict: 50% refund if cancelled 14 days before
Non-refundable: no refund
Cancellation flow:
1. Calculate refund amount per policy
2. Stripe refund via refunds.create() (async, 5-10 days)
3. Mark dates available again
4. Notify host via email/push
Database Design
bookings table:
id UUID PRIMARY KEY
property_id UUID NOT NULL
guest_id UUID NOT NULL
host_id UUID NOT NULL -- denormalized for fast host queries
check_in DATE NOT NULL
check_out DATE NOT NULL -- exclusive end date
guests_count INT
status TEXT DEFAULT 'pending'
total_cents BIGINT
currency CHAR(3) DEFAULT 'USD'
payment_intent_id TEXT -- Stripe PaymentIntent
created_at TIMESTAMPTZ DEFAULT NOW()
INDEX (guest_id, created_at DESC)
INDEX (host_id, check_in DESC)
INDEX (property_id, status, check_in)
properties table:
id UUID PRIMARY KEY
host_id UUID
name TEXT
description TEXT
lat DOUBLE PRECISION
lng DOUBLE PRECISION
guests_max INT
bedrooms INT
bathrooms NUMERIC(2,1)
amenities JSONB -- {pool: true, wifi: true, ...}
base_price_cents INT
cleaning_fee_cents INT
instant_book BOOLEAN DEFAULT FALSE
status TEXT DEFAULT 'active' -- active, suspended, deactivated
Interview Discussion Points
- Why SELECT FOR UPDATE instead of optimistic locking? Date-level availability requires locking all dates in the requested range atomically. Optimistic locking (check-and-update with version) works well for single-row updates but becomes complex for multi-row ranges (need to check N versions simultaneously). SELECT FOR UPDATE with sorted date locks is simpler and prevents deadlocks when two bookings compete for overlapping date ranges.
- How to handle Airbnb’s split payment model? Guests pay immediately (via Stripe Payment Intent authorize-only), hosts receive payment 24 hours after check-in. The platform holds the funds in escrow (Stripe Connect platform account). This protects both parties: if the host cancels last-minute, the guest gets a full refund; if the guest no-shows, the host gets paid per the cancellation policy.
- Search scalability: Elasticsearch handles the geospatial + filter query. The availability check runs against Cassandra (fast range reads by property_id + date). At 100M properties with 365 days each, a single Cassandra cluster with 20 nodes handles the availability reads at < 5ms p99.