System Design: Content Delivery Network (CDN)
A CDN is a geographically distributed network of edge servers that caches and serves content close to users. It cuts latency from hundreds of milliseconds (cross-continental origin requests) to single-digit milliseconds (a cache hit at a nearby edge) and typically offloads 80-95% of traffic from origin servers.
Core Components
- Edge PoPs (Points of Presence): servers in ~200+ global locations that cache and serve content
- Origin: source of truth — your web servers, S3 buckets, or API servers
- DNS + Anycast: the CDN's authoritative DNS returns an anycast IP, and BGP routing delivers the user's traffic to the nearest PoP
- Cache Tier: L1 (edge memory), L2 (edge SSD), L3 (regional shield) before hitting origin
Request Routing: How a User Reaches the Nearest Edge
User types example.com
→ ISP resolver queries the CDN's authoritative DNS
→ DNS returns Anycast IP (routes to nearest PoP via BGP)
→ User TLS handshake with edge server (30ms vs 200ms to origin)
→ Edge: cache HIT → serve response (5ms)
→ Edge: cache MISS → fetch from origin, cache, serve (150ms first time)
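The hit and miss figures in the flow above combine into an average latency that depends on the cache hit rate. The sketch below works through that arithmetic; the function name is hypothetical, and the 5 ms and 150 ms defaults are simply the illustrative numbers from the flow, not measurements.

```python
def expected_latency_ms(hit_rate: float, hit_ms: float = 5.0, miss_ms: float = 150.0) -> float:
    """Average response time for a given cache hit rate (0.0 to 1.0).

    Defaults reuse the illustrative 5 ms (HIT) and 150 ms (MISS) figures.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Weighted average: hits are served from the edge, misses go to origin.
    return hit_rate * hit_ms + (1.0 - hit_rate) * miss_ms


if __name__ == "__main__":
    for rate in (0.5, 0.9, 0.99):
        print(f"hit rate {rate:.0%}: {expected_latency_ms(rate):.1f} ms average")
```

Under these assumptions, even a small drop in hit rate is expensive because each miss costs about 30 times as much as a hit.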
Cache Key Design
The cache key determines whether two requests share a cache entry. Default: URL (scheme + host + path + query). Vary by:
- Accept-Encoding: serve Brotli to clients that advertise br, gzip to the rest
- Accept-Language: if content is localized
- Never vary on Cookie for public content; it destroys the cache hit rate
- Strip tracking parameters (utm_source, fbclid) from cache keys to consolidate variants
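The rules above can be sketched as a small key-building function. This is a minimal illustration, not any CDN's real algorithm: the name cache_key, the TRACKING_PARAMS set, and the "|enc=" suffix are all assumptions made for the example, and Accept-Encoding q-values are ignored for brevity.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical list of parameters that never change the response body.
# Only the two named in the text are included; extend as needed.
TRACKING_PARAMS = {"utm_source", "fbclid"}


def cache_key(url: str, accept_encoding: str = "") -> str:
    """Build a normalized cache key from a URL and an Accept-Encoding header."""
    parts = urlsplit(url)

    # Drop tracking parameters and sort the rest so parameter order
    # does not create duplicate cache entries.
    params = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in TRACKING_PARAMS
    ]
    params.sort()
    query = urlencode(params)

    # Collapse the header into three buckets instead of keying on the raw
    # string, which differs between browsers. (q-values are ignored here.)
    tokens = {t.split(";")[0].strip().lower() for t in accept_encoding.split(",")}
    if "br" in tokens:
        encoding = "br"
    elif "gzip" in tokens:
        encoding = "gzip"
    else:
        encoding = "identity"

    # Scheme and host are case-insensitive; the path is left untouched
    # because paths can be case-sensitive on the origin.
    base = f"{parts.scheme.lower()}://{parts.netloc.lower()}{parts.path}"
    if query:
        base += f"?{query}"
    return f"{base}|enc={encoding}"
```

Two requests that differ only in tracking parameters, parameter order, or host casing map to the same key, so they share one cache entry.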
Cache Invalidation Strategies
# Strategy 1: TTL-based (simplest)
# Origin sets Cache-Control: public, max-age=3600
# Edge may serve stale content for up to 1 hour; there is no active invalidation
# Strategy 2: Versioned URLs (best for static assets)
# /static/app.js?v=a1b2c3d4 — hash of file contents
# New deploy → new hash → new cache key → instant propagation
# Old URLs stay cached until their TTL expires; safe because the content behind a given URL never changes
# Strategy 3: CDN Purge API (for mutable content)
import requests

def purge_cdn_url(url: str, zone_id: str, cdn_api_key: str) -> bool:
    """Cloudflare Cache Purge API example."""
    response = requests.post(
        # f-string so zone_id is actually substituted into the path
        f"https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache",
        headers={"Authorization": f"Bearer {cdn_api_key}"},
        json={"files": [url]},
        timeout=5,
    )
    # The API reports the outcome in the "success" field of the JSON body
    return response.ok and response.json().get("success", False)
# Strategy 4: Surrogate keys / Cache tags (advanced)
# Origin sets: Surrogate-Key: product-123 category-electronics
# Purge all URLs tagged product-123 in one API call when product changes
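To make Strategy 4 concrete, here is a minimal in-memory sketch of how a cache might index entries by tag so that one purge call removes every URL carrying that tag. The TaggedCache class and its method names are hypothetical; real CDNs implement this inside their own storage layer and expose it through their purge APIs.

```python
from collections import defaultdict
from typing import Dict, Optional, Set


class TaggedCache:
    """Toy cache that indexes entries by the tags in a Surrogate-Key header."""

    def __init__(self) -> None:
        self._entries: Dict[str, bytes] = {}
        self._urls_by_tag: Dict[str, Set[str]] = defaultdict(set)

    def put(self, url: str, body: bytes, surrogate_key_header: str = "") -> None:
        """Store a response and record each space-separated tag for it."""
        self._entries[url] = body
        for tag in surrogate_key_header.split():
            self._urls_by_tag[tag].add(url)

    def get(self, url: str) -> Optional[bytes]:
        return self._entries.get(url)

    def purge_tag(self, tag: str) -> int:
        """Evict every cached URL carrying the tag; return how many were evicted."""
        purged = 0
        for url in self._urls_by_tag.pop(tag, set()):
            # A URL may already be gone if it was purged via another tag.
            if self._entries.pop(url, None) is not None:
                purged += 1
        return purged


if __name__ == "__main__":
    cache = TaggedCache()
    cache.put("/products/123", b"<html>...</html>", "product-123 category-electronics")
    cache.put("/categories/electronics", b"<html>...</html>", "category-electronics")
    print(cache.purge_tag("product-123"), "entry purged")
```

The benefit over URL purging is that the origin does not need to know every URL that embeds a given product; it only needs to tag responses consistently.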
Cache Hit Rate Optimization
| Technique | Impact |
|---|---|
| Strip irrelevant query params from cache key | +10-20% HIT rate |
| Normalize URL (lowercase, sort params) | +5% HIT rate |
| Shield origin (regional PoP consolidates misses) | Reduces origin load 10x |
| Prefetch popular content at PoP startup | Reduces cold-start misses |
| Serve stale-while-revalidate | Removes latency spikes at TTL expiry |
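The stale-while-revalidate row is the least obvious technique in the table, so here is a simplified sketch of its behavior. SwrCache and its parameters are hypothetical names, and the refresh is triggered by an explicit method call rather than a background task, which keeps the example deterministic; a real edge would refresh asynchronously.

```python
import time
from typing import Callable, Dict, Set, Tuple


class SwrCache:
    """Toy cache: fresh for max_age seconds, then servable stale for swr_window."""

    def __init__(
        self,
        fetch: Callable[[str], bytes],
        max_age: float,
        swr_window: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._max_age = max_age
        self._swr_window = swr_window
        self._clock = clock
        self._store: Dict[str, Tuple[bytes, float]] = {}
        self.pending_revalidation: Set[str] = set()

    def get(self, url: str) -> Tuple[bytes, str]:
        """Return (body, status) where status is HIT, STALE, or MISS."""
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None:
            body, stored_at = entry
            age = now - stored_at
            if age <= self._max_age:
                return body, "HIT"
            if age <= self._max_age + self._swr_window:
                # Answer immediately with the stale copy; refresh later.
                self.pending_revalidation.add(url)
                return body, "STALE"
        # No entry, or too old even for stale serving: block on the origin.
        body = self._fetch(url)
        self._store[url] = (body, now)
        return body, "MISS"

    def revalidate_pending(self) -> None:
        """Refresh every URL that was served stale (stand-in for a background job)."""
        for url in list(self.pending_revalidation):
            self._store[url] = (self._fetch(url), self._clock())
        self.pending_revalidation.clear()
```

The user who arrives just after expiry gets the stale copy at HIT speed instead of paying the origin round trip, which is where the latency spike would otherwise appear.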
Origin Shield (Tiered Caching)
Without shielding, 200 edge PoPs each fetch from origin independently on a cache miss, so a single newly requested URL can generate up to 200 origin requests. With shielding, a regional "shield" PoP consolidates those misses: Edge → Shield (the L3 regional tier) → Origin. A typical setup has one shield per region (US-East, EU-West, AP-South), which cuts origin requests for a global miss from 200 to 3.
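The 200-versus-3 arithmetic can be checked with a tiny simulation. This is a sketch under simplifying assumptions: every edge PoP misses on the same URL exactly once, edges are assigned to shields round-robin, and the function name origin_fetches is made up for the example.

```python
def origin_fetches(num_edges: int, num_shields: int = 0) -> int:
    """Count origin requests when every edge PoP misses once on the same URL.

    num_shields == 0 models a CDN without tiered caching.
    """
    origin_hits = 0
    warmed_shields = set()  # shields that already hold the object

    for edge in range(num_edges):
        if num_shields == 0:
            # No shield tier: each edge miss goes straight to origin.
            origin_hits += 1
            continue
        shield = edge % num_shields  # assumed round-robin edge-to-shield mapping
        if shield not in warmed_shields:
            # First miss through this shield: it fetches from origin once.
            warmed_shields.add(shield)
            origin_hits += 1
        # Later misses through the same shield are served from its cache.

    return origin_hits


if __name__ == "__main__":
    print("no shield:", origin_fetches(200))
    print("3 shields:", origin_fetches(200, 3))
```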
Dynamic Content and Edge Computing
Modern CDNs run code at the edge (Cloudflare Workers, Lambda@Edge) for:
- A/B testing: split traffic at edge without origin round-trip
- Auth token validation: reject unauthorized requests at edge, saving origin compute
- Response transformation: image resizing, HTML minification
- Geo-blocking: block by country before traffic reaches origin
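As an illustration of the A/B testing item, the sketch below shows one common way to split traffic without a round trip: hash a stable identifier into a bucket. It is written in Python for consistency with the rest of this article, although edge runtimes such as Cloudflare Workers typically run JavaScript. The function name, the experiment label, and the use of a user ID as the hash input are all assumptions.

```python
import hashlib


def ab_bucket(user_id: str, experiment: str, treatment_percent: int = 50) -> str:
    """Deterministically assign a user to 'treatment' or 'control'.

    The same (experiment, user_id) pair always lands in the same bucket,
    so any edge server can decide locally without shared state.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a slot in the range 0..99.
    slot = int.from_bytes(digest[:8], "big") % 100
    return "treatment" if slot < treatment_percent else "control"


if __name__ == "__main__":
    for uid in ("user-1", "user-2", "user-3"):
        print(uid, ab_bucket(uid, "new-checkout"))
```

Including the experiment name in the hash input keeps assignments independent across experiments, so a user in the treatment group of one test is not automatically in the treatment group of the next.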
Security Features
- DDoS absorption: CDN edge absorbs volumetric attacks (Tbps-scale); origin only sees legitimate traffic
- TLS termination: CDN handles TLS handshakes; origin can use HTTP internally on private network
- WAF (Web Application Firewall): filters SQLi and XSS attempts and applies rate limiting at the edge
- Bot management: fingerprinting + challenge pages at edge
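Rate limiting at the edge is often described in terms of a token bucket. The sketch below is one minimal version, assuming a single bucket per client held in memory; the class name and parameters are hypothetical, and a real CDN would need to share or approximate this state across servers.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `burst` requests, refilling at `rate_per_sec`."""

    def __init__(
        self,
        rate_per_sec: float,
        burst: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._rate = rate_per_sec
        self._capacity = float(burst)
        self._tokens = float(burst)  # start with a full bucket
        self._clock = clock
        self._last = clock()

    def allow(self) -> bool:
        """Return True and spend one token if the request is within the limit."""
        now = self._clock()
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        self._tokens = min(self._capacity, self._tokens + (now - self._last) * self._rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

An edge server would keep one bucket per client key (for example, per IP address) and reject or challenge requests when allow() returns False.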
Interview Framework
- Identify cacheable vs non-cacheable content (static assets vs user-specific API responses)
- Choose cache key strategy (URL only, or vary on headers)
- Define TTL and invalidation approach per content type
- Discuss origin shield for high-traffic origins
- Address security (DDoS, WAF) and edge compute if dynamic needs arise
Asked at: Cloudflare, Netflix, Uber, Twitter/X, Shopify