What Is a Location-Based Services System?
A location-based services (LBS) system enables geospatial queries: find nearby restaurants, show points of interest on a map, compute distances, and provide routing. Examples: Yelp (nearby search), Google Maps Places API, Foursquare. Core challenges: geospatial indexing for efficient proximity queries, handling moving objects (vehicles vs static POIs), and global scale (billions of GPS coordinates).
System Requirements
Functional
- Add/update/delete locations (businesses, POIs)
- Search nearby: find N closest locations within radius R of a point
- Filter: by category, rating, open hours
- Geofencing: detect when a device enters/exits a defined area
Non-Functional
- 500M locations globally, 100K searches/second
- Search latency <100ms
- 1M location updates/second (for moving objects)
Geospatial Indexing: Geohash
Geohash divides the world into a grid of cells using a base-32 encoded string. Longer hash = smaller cell. Precision: 6 chars ≈ 1.2km x 0.6km; 7 chars ≈ 153m x 153m. Properties: locations with the same geohash prefix are geographically close (mostly — boundary edge cases exist). Nearby cells share long common prefixes.
import geohash  # python-geohash, which provides both encode() and neighbors()

# Encode the query point into a geohash string
lat, lon = 37.7749, -122.4194  # example query point (San Francisco)
target_hash = geohash.encode(lat, lon, precision=7)  # "9q8yyk8"

# Search nearby: the target cell plus its 8 surrounding cells
search_cells = [target_hash] + geohash.neighbors(target_hash)
# DB query: run per cell, or use IN over all 9 cells
SELECT * FROM locations
WHERE geohash7 LIKE '9q8yyk8%'   -- all locations in one cell
  AND category = 'restaurant'
LIMIT 100;
-- SQL has no built-in distance(); compute exact Haversine distances in
-- the application (or a UDF), sort, and return the top 20
DB index on geohash prefix enables O(log N) range queries per cell. The 9-cell search (target + 8 neighbors) handles boundary cases where the nearest location is in an adjacent cell.
Alternative: Quadtree
A quadtree recursively subdivides a 2D area into four quadrants until each leaf contains fewer than K points (e.g., K=100). Pros: adaptive density — dense cities get finer subdivision, sparse areas get coarser. Cons: complex updates (rebalancing on insert/delete), harder to distribute. Geohash is preferred for DB-backed systems; quadtree for in-memory spatial indexes (game engines, GIS tools).
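The split-on-capacity behavior can be illustrated with a minimal in-memory point quadtree; the `Node` class and `CAPACITY` threshold below are illustrative sketches, not a production index:

```python
# Minimal point quadtree sketch: a leaf splits into four children once it
# holds more than CAPACITY points, so dense regions subdivide more deeply.
CAPACITY = 4

class Node:
    def __init__(self, x0, y0, x1, y1):
        self.bounds = (x0, y0, x1, y1)
        self.points = []       # (x, y) pairs while this node is a leaf
        self.children = None   # list of four child Nodes after a split

    def insert(self, x, y):
        if self.children is not None:
            self._child_for(x, y).insert(x, y)
            return
        self.points.append((x, y))
        if len(self.points) > CAPACITY:
            x0, y0, x1, y1 = self.bounds
            xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
            self.children = [Node(x0, y0, xm, ym), Node(xm, y0, x1, ym),
                             Node(x0, ym, xm, y1), Node(xm, ym, x1, y1)]
            for px, py in self.points:   # redistribute points to children
                self._child_for(px, py).insert(px, py)
            self.points = []

    def _child_for(self, x, y):
        x0, y0, x1, y1 = self.bounds
        xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
        return self.children[(1 if x >= xm else 0) + (2 if y >= ym else 0)]
```

Note how the tree structure is mutated on every overflow: this is the rebalancing cost mentioned above, and why a distributed, DB-backed setup favors the static geohash grid instead.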
Data Model
locations: id, name, lat, lon, geohash7, category, rating,
hours_json, address, created_at
location_tags: location_id, tag
-- Indexes:
CREATE INDEX ON locations (geohash7, category);
CREATE INDEX ON locations (geohash7, rating DESC);
Search Flow
- Compute target geohash at precision 7 (153m cell)
- Get 8 neighbors
- Query DB: WHERE geohash7 IN (9 cells) AND category = ? LIMIT 100
- Compute exact distances (Haversine formula) for each result
- Sort by distance, apply rating/filter, return top 20
- If fewer than 20 results: expand to precision-6 cells (1.2km) and repeat
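The exact-distance step above uses the standard Haversine formula; a minimal sketch, where the candidate rows and coordinates are made up for illustration:

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Rank candidates returned by the 9-cell query (hypothetical rows):
candidates = [("A", 37.7749, -122.4194), ("B", 37.7793, -122.4193)]
user = (37.7760, -122.4170)
ranked = sorted(candidates,
                key=lambda c: haversine_km(user[0], user[1], c[1], c[2]))
```

Because the 9-cell query returns at most ~100 candidates, this O(1)-per-pair distance computation and the final sort are cheap relative to the indexed DB lookup.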
Caching
Search results are highly cacheable: the same query (cell, category, radius) returns the same results for minutes. Cache key: "{geohash6}:{category}:{radius}". TTL: 60 seconds for static POIs (restaurants), 5 seconds for dynamic objects (drivers, riders). Redis stores the result set as a JSON array. Cache invalidation: when a location is added or deleted, invalidate the cache entries for the cells it falls in.
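The key scheme and TTL behavior can be sketched with an in-process stand-in for Redis (a plain dict plus expiry timestamps; `cache_key`, `cache_set`, and `cache_get` are illustrative names, not a Redis client API):

```python
import time

# In-process stand-in for the Redis cache described above; the key scheme
# ("{geohash6}:{category}:{radius}") and per-entry TTLs mirror the text.
_cache = {}  # key -> (expires_at, value)

def cache_key(geohash6, category, radius_m):
    return f"{geohash6}:{category}:{radius_m}"

def cache_set(key, value, ttl_s):
    """Store a result set with a TTL (60s for static POIs, 5s for dynamic)."""
    _cache[key] = (time.monotonic() + ttl_s, value)

def cache_get(key):
    """Return the cached value, or None if missing or expired."""
    entry = _cache.get(key)
    if entry is None or entry[0] < time.monotonic():
        _cache.pop(key, None)  # lazy eviction of expired entries
        return None
    return entry[1]
```

In production the same scheme maps onto Redis SET with the EX option; keying on the precision-6 cell rather than raw coordinates is what makes nearby users share cache hits.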
Geofencing
Detect when a device enters or exits a polygon. Storage: geofences as GeoJSON polygons in PostGIS. Point-in-polygon check: ST_Contains(geofence_polygon, ST_Point(lon, lat)) (note the argument order: PostGIS points are (x, y), so longitude comes first). For 1M device updates/second, use geohash to pre-filter: only check geofences whose bounding box overlaps the device's geohash cell. This reduces polygon checks from all geofences to the few dozen in the local area.
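The bounding-box pre-filter and the enter/exit detection can be sketched as follows; the function names and the idea of diffing membership sets between consecutive position updates are illustrative (the precise polygon check itself stays in PostGIS):

```python
def bbox_contains(bbox, lat, lon):
    """Coarse pre-filter: bbox = (min_lat, min_lon, max_lat, max_lon).
    Only geofences passing this check go to the exact ST_Contains query."""
    min_lat, min_lon, max_lat, max_lon = bbox
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon

def geofence_events(prev_ids, curr_ids):
    """Diff the sets of geofence IDs containing the device on two
    consecutive updates: new IDs are enters, dropped IDs are exits."""
    return {"enter": curr_ids - prev_ids, "exit": prev_ids - curr_ids}
```

Keeping the previous membership set per device (e.g., in Redis) turns geofencing into a cheap set difference per update, with the expensive polygon math run only on the pre-filtered candidates.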
Interview Tips
- Geohash + 9-cell neighbor search is the standard nearby-POI solution.
- Haversine formula for great-circle distance — O(1), no index needed for final sort.
- Adaptive radius expansion (precision 7 → 6 → 5) handles sparse areas.
- PostGIS for complex geofencing; Redis GeoSet for simple radius queries.
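The adaptive radius expansion tip reduces to geohash prefix truncation, since chopping characters off a geohash widens the cell; a minimal sketch, where `search_fn` is a hypothetical stand-in for the 9-cell DB query:

```python
def expand_search(geohash7, min_results, search_fn):
    """Widen the search cell (precision 7 -> 6 -> 5) by truncating the
    geohash prefix until search_fn returns enough results."""
    results = []
    for precision in (7, 6, 5):
        results = search_fn(geohash7[:precision])
        if len(results) >= min_results:
            break
    return results
```

Each truncation multiplies the covered area by roughly 32x (one base-32 character), so sparse rural queries converge in at most two extra round trips.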