System Design Interview: Design a Maps and Navigation System (Google Maps)
Designing a maps and navigation system is asked at Uber, Lyft, Google, Apple, and Grab. It covers geospatial data storage, routing algorithms, real-time traffic, map tile serving, and ETA prediction.
Requirements Clarification
Functional Requirements
- Display map tiles at different zoom levels
- Search for locations (addresses, POIs)
- Route from point A to point B (driving, walking, transit)
- Turn-by-turn navigation with live traffic updates
- ETA estimation
- Report incidents (accidents, road closures)
Non-Functional Requirements
- Scale: 1B users, 1M active navigators at peak
- Map tile serving: <100ms
- Routing: <500ms for cross-city routes
- Traffic update freshness: <1 minute
Map Data Storage
Road Network: Graph Database
Roads and intersections modeled as a directed weighted graph:
- Nodes: intersections, points of interest
- Edges: road segments with properties (distance, speed limit, road type, one-way)
- Edge weights: current travel time (updated with real-time traffic)
Storage: custom binary graph format for routing engine (not a general-purpose graph DB). Graph partitioned by geographic tile for distributed routing. OpenStreetMap (OSM) is the source data for many mapping systems.
Geospatial Indexing
Find nearest nodes, POIs, or road segments to a GPS coordinate:
- Geohash: encode lat/lng as a base-32 string. Nearby geohashes are lexicographically similar. Use as index key for proximity queries. Resolution depends on string length (6 chars = ~1km).
- Quadtree: recursively divide map into 4 quadrants. Navigate tree to find entities in a bounding box. Used for spatial indexing in PostGIS, S2.
- S2 (Google): Hilbert curve-based spatial indexing. Maps 2D sphere to 1D curve preserving locality. Google Maps uses S2 cells internally.
Map Tile Serving
Maps are rendered as tiles (256×256 or 512×512 pixel PNG/WebP images) at different zoom levels (0-22). Tile URL: /tiles/{z}/{x}/{y}.png where z=zoom, x/y=tile coordinates.
Tile generation pipeline:
Raw OSM data -> Tile rendering (Mapnik/vector tiles) -> Tile storage (S3)
|
CDN edge cache
|
User browser
Tiles are pre-rendered and cached aggressively (TTL: days for base tiles, minutes for traffic overlay). Cache hit rate: 99%+ for popular areas. Vector tiles (MVT format) are smaller and rendered client-side with styling flexibility.
Routing Algorithm
Dijkstra vs A*
- Dijkstra: finds shortest path in weighted graph. Explores all nodes with cost <= current best. O((V+E) log V) with priority queue. Too slow for continental routing (billions of nodes).
- A* (A-star): Dijkstra + heuristic (straight-line distance to destination). Guides search toward destination, exploring fewer nodes. Optimal when heuristic is admissible (never overestimates).
Contraction Hierarchies (CH)
Used in production routing engines (OSRM, Valhalla, Google Maps). Preprocessing step contracts unimportant nodes, adding shortcut edges. Query: bidirectional Dijkstra on contracted graph. Route from New York to LA in milliseconds. 1000x faster than plain Dijkstra on real road networks.
Real-Time Traffic
Two sources of traffic data:
- Probe data: GPS coordinates from phones of active navigators. Aggregate speed on road segments. Privacy: aggregate anonymized data, not individual tracks.
- Incident reports: user reports (accidents, closures) + automated detection from probe speed drops.
Phone GPS -> Traffic Ingest API -> Kafka
|
Stream Processor (Flink)
- Map-match GPS to road segment
- Compute average speed per segment
- Detect speed anomalies (incidents)
|
Traffic DB (edge weight updates)
|
Routing engine reads updated weights
ETA Prediction
Simple ETA: sum of (segment_distance / current_speed) along route. Production ETA uses ML: gradient boosted model trained on historical travel times. Features: time of day, day of week, weather, current traffic, route characteristics. ETA updated every 30s during navigation.
Search and Geocoding
- Forward geocoding: address string to lat/lng. Elasticsearch index on POI and address data.
- Reverse geocoding: lat/lng to address. Spatial index lookup (quadtree/S2) to find nearest road segment and building.
- Autocomplete: prefix search on indexed place names. Redis sorted set or Elasticsearch edge n-grams.
Interview Tips
- Lead with geospatial indexing (geohash, S2) – it is the foundational concept
- Explain why plain Dijkstra is too slow and introduce Contraction Hierarchies
- Discuss map tiles: pre-rendering, CDN caching, vector vs raster
- Explain how probe GPS data becomes traffic edge weights (map-matching)
- Know ETA = route time sum, updated with real-time traffic
Companies that ask this: Snap Interview Guide
Companies that ask this: LinkedIn Interview Guide 2026: Social Graph Engineering, Feed Ranking, and Professional Network Scale
Companies that ask this: Apple Interview Guide 2026: iOS Systems, Hardware-Software Integration, and iCloud Architecture
Companies that ask this: DoorDash Interview Guide
Companies that ask this: Lyft Interview Guide 2026: Rideshare Engineering, Real-Time Dispatch, and Safety Systems
Companies that ask this: Uber Interview Guide 2026: Dispatch Systems, Geospatial Algorithms, and Marketplace Engineering