Question 1

How does Redis store and query geospatial data for ride-sharing applications?

Accepted Answer

Redis Geo commands store lat/lng coordinates in a sorted set using Geohash encoding. Each member's position is encoded as a 52-bit integer Geohash and stored as the sorted set score. Key commands: GEOADD key longitude latitude member u2014 adds a location (O(log N)). GEODIST key member1 member2 [unit] u2014 distance between two members (O(log N)). GEOSEARCH key FROMMEMBER member BYRADIUS radius unit ASC COUNT 10 u2014 finds members within radius, sorted by distance (O(N + log M) where N = results, M = total members). For Uber's driver tracking: each active driver's position is stored via GEOADD active_drivers {lng} {lat} {driver_id}. When a rider requests a ride: GEOSEARCH active_drivers FROMLONLAT {rider_lng} {rider_lat} BYRADIUS 5 km ASC COUNT 20 returns up to 20 nearest drivers. The returned driver IDs are then used to fetch driver details (status, vehicle type, rating) from another Redis hash or database. Driver positions are updated every 4 seconds via GEOADD (which updates existing members). Offline/inactive drivers are removed with ZREM. The entire set of 5M active drivers fits comfortably in Redis memory (~200MB at ~40 bytes per driver entry).

Question 2

What is geohashing and how does it enable efficient location queries?

Accepted Answer

A geohash encodes a lat/lng pair into a short alphanumeric string by recursively dividing the Earth into 2 halves (first by longitude: left/right, then latitude: top/bottom) and encoding each division as a bit. After 30 bits (15 divisions of each axis), the string is base32 encoded. Key property: longer prefix = smaller geographic area; points with the same geohash prefix are geographically near each other. Precisions: 4 chars u2248 40km u00d7 20km; 6 chars u2248 1.2km u00d7 0.6km; 8 chars u2248 40m u00d7 20m. For ride-sharing: store each driver in a hash table keyed by their 6-character geohash prefix. Finding nearby drivers = look up the rider's geohash prefix + 8 adjacent cells. Eight neighbors must be included to avoid missing drivers just across a cell boundary u2014 geohash cells don't have uniform neighbor relationships (especially near poles). Algorithm: compute rider's geohash prefix, enumerate the 8 surrounding cells, union all drivers from all 9 cells, filter by actual distance using Haversine formula. Limitations: geohash cells are rectangular (not equidistant in all directions); near the 180u00b0 meridian and poles, the encoding has distortions. Alternatives: S2 geometry library (spherical cells, better neighbor relationships) and H3 hexagonal grid (6 equidistant neighbors per cell).

Question 3

How does Uber compute ETA for matching drivers to riders at scale?

Accepted Answer

ETA computation is on the critical path of ride matching u2014 it must complete in < 100ms for all candidate drivers. Approaches in order of complexity: (1) Straight-line distance + average speed u2014 fastest but inaccurate (ignores road network, one-way streets). (2) Routing engine (OSRM/Valhalla) u2014 queries a pre-computed road graph with current traffic weights. Accurate but ~20-50ms per query, and matching queries thousands of (driver, rider) pairs per second at Uber scale. (3) Learned ETA model u2014 a gradient boosted model (LightGBM) trained on millions of historical trips takes features: pickup lat/lng, driver lat/lng, time of day, day of week, weather, historical traffic patterns for that corridor. Outputs ETA in milliseconds. Accuracy within 10% of actual trip time for 80% of queries. This approach handles Uber's full query volume (millions of estimates/second) since ML inference is much cheaper than graph traversal. (4) Hybrid u2014 use the ML model for initial matching, then confirm with OSRM for the selected driver. Uber also pre-computes approximate ETAs on a coarse grid: for popular pickup zones (airports, downtown areas), precompute ETAs from nearby H3 cells containing active drivers. Cache in Redis for instant lookup during surge periods.

System Design Interview: Live Location Tracking (Uber / Lyft Driver GPS)

Overview

Location Update Ingestion

Location Storage: Redis for Current State

Geohashing for Partitioning

Matching Service: Finding Nearby Drivers

ETA Computation

Location History for Analytics

Surge Pricing via Location Data

Companies That Ask This