Geospatial Data Fundamentals
Geospatial queries are common in production systems: “find the 10 nearest restaurants,” “is this user inside a geofence?”, “show all events within 50km.” Without a spatial index, a standard database answers proximity queries with a full table scan, which is O(n) at minimum. Geospatial indexes reduce these queries to O(log n) or O(k), where k is the result count.

Key concepts:
- Haversine distance: great-circle distance between two lat/lng points on Earth’s surface. For short distances, the equirectangular approximation is usually sufficient: distance ≈ sqrt((dlat * 111km)^2 + (dlng * cos(lat) * 111km)^2).
- Bounding box: a rectangle of lat/lng coordinates used to pre-filter candidates before exact distance calculation.
- Geohash: encodes a lat/lng into a string of characters where a common prefix implies geographic proximity.
- S2 geometry: Google’s library that divides Earth into hierarchical cells for efficient spatial operations.
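The haversine formula and the short-range approximation above can be sketched as follows (a minimal illustration; the coordinates are two points in midtown Manhattan, chosen for the example):

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two lat/lng points, in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def approx_km(lat1, lng1, lat2, lng2):
    """Equirectangular approximation: accurate enough for short distances."""
    dlat = (lat2 - lat1) * 111.0  # ~111 km per degree of latitude
    dlng = (lng2 - lng1) * 111.0 * math.cos(math.radians(lat1))
    return math.sqrt(dlat ** 2 + dlng ** 2)

# Two nearby points in Manhattan: both methods agree to within a few meters
print(haversine_km(40.7484, -73.9857, 40.7580, -73.9855))  # ≈ 1.07 km
print(approx_km(40.7484, -73.9857, 40.7580, -73.9855))
```

The approximation avoids the trig-heavy haversine computation, which matters when scoring thousands of candidates per query; switch to full haversine once distances exceed a few hundred kilometers.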
Proximity Search: PostGIS and Redis
-- PostGIS: find restaurants within 5km of a point
CREATE TABLE restaurants (
    id BIGSERIAL PRIMARY KEY,
    name TEXT,
    location GEOGRAPHY(POINT, 4326) -- WGS84 geography point (note: lng, lat order)
);
CREATE INDEX idx_restaurants_location ON restaurants USING GIST(location);

-- Query: find all restaurants within 5km, sorted by distance
SELECT id, name,
       ST_Distance(location,
                   ST_MakePoint(-73.9857, 40.7484)::GEOGRAPHY) AS dist_meters
FROM restaurants
WHERE ST_DWithin(
          location,
          ST_MakePoint(-73.9857, 40.7484)::GEOGRAPHY,
          5000 -- meters; ST_DWithin uses the GIST index
      )
ORDER BY dist_meters
LIMIT 20;
# Redis geospatial: fast proximity search (redis-py 4.x)
# Add restaurant locations -- GEOADD takes (longitude, latitude, member)
redis.geoadd("restaurants", (longitude, latitude, restaurant_id))

# Find restaurants within 5km, sorted by distance
# (GEORADIUS is deprecated since Redis 6.2; GEOSEARCH is its replacement)
results = redis.georadius(
    "restaurants",
    longitude=-73.9857,
    latitude=40.7484,
    radius=5,
    unit="km",
    sort="ASC",
    count=20,
    withcoord=True,
    withdist=True,
)
# Returns: [[id, dist_km, (lng, lat)], ...]
Geohash-Based Sharding
Geohash encodes a lat/lng into a string: each additional character subdivides the cell, alternately halving longitude and latitude precision. A shared geohash prefix implies a shared geographic region.
- Precision: length 4 ≈ 40km x 20km, length 6 ≈ 1.2km x 0.6km, length 8 ≈ 38m x 19m.
- Sharding by geohash: store entities in shards based on their geohash prefix.
- Proximity query: compute the geohash of the query point, find all neighboring geohash cells (up to 8 neighbors), and query those shards.
- Limitation: geohash cells at high precision are irregular rectangles (not circles), and neighboring cells do not always share a prefix.
- Edge case: locations near a geohash boundary can have completely different prefixes despite being physically close; always check the 8 neighbors, not just the query point’s own cell.
S2 (Google) mitigates these boundary artifacts with a hierarchical decomposition of the sphere into quadrilateral cells projected from a cube, giving more uniform cell geometry than geohash; Uber’s H3 takes a similar approach with hexagonal cells. Both still require checking neighboring cells near boundaries.
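The standard interleaved-bit geohash encoding, and the boundary edge case above, can be demonstrated with a minimal encoder (a production system would use a library such as python-geohash; this sketch is for illustration only):

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)

def geohash_encode(lat, lng, precision=6):
    """Encode lat/lng by alternately bisecting longitude and latitude."""
    lat_lo, lat_hi = -90.0, 90.0
    lng_lo, lng_hi = -180.0, 180.0
    chars, bits, ch, even = [], 0, 0, True  # even-numbered bits encode longitude
    while len(chars) < precision:
        if even:
            mid = (lng_lo + lng_hi) / 2
            if lng >= mid:
                ch, lng_lo = ch * 2 + 1, mid
            else:
                ch, lng_hi = ch * 2, mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                ch, lat_lo = ch * 2 + 1, mid
            else:
                ch, lat_hi = ch * 2, mid
        even = not even
        bits += 1
        if bits == 5:  # 5 bits per base32 character
            chars.append(BASE32[ch])
            bits, ch = 0, 0
    return "".join(chars)

# Nearby points in midtown Manhattan share a long prefix...
print(geohash_encode(40.7484, -73.9857))
print(geohash_encode(40.7580, -73.9855))
# ...but two points ~22m apart straddling the equator share no prefix at all,
# which is exactly why proximity queries must also check neighboring cells:
print(geohash_encode(0.0001, 0.0001))
print(geohash_encode(-0.0001, 0.0001))
```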
Geofencing at Scale
Geofencing: detect when a moving entity (user, vehicle) enters or exits a defined polygon.
- Naive approach: for each location update, run a point-in-polygon test (ray casting) against every polygon. O(k * n), where k = location updates/sec and n = total geofences.
- Optimized: bounding box pre-filter. Each geofence stores its bounding box (min_lat, max_lat, min_lng, max_lng), indexed in a 2D spatial index (R-tree in PostGIS, or geohash-based). On each location update: find all geofences whose bounding box contains the point (fast index lookup), then run the exact point-in-polygon test only on those candidates (typically < 10 vs. thousands).
- State tracking: maintain the last known inside/outside status per (entity, geofence) pair to fire ENTER/EXIT events correctly. Store this state in Redis (expiring after inactivity).
- Kafka-based pipeline: location updates → Kafka → geofence consumer → state comparison → ENTER/EXIT events → downstream automation.
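The pipeline above can be sketched end to end: bounding-box pre-filter, ray-casting point-in-polygon, and per-(entity, fence) state for ENTER/EXIT events. In-memory dicts stand in for the spatial index and the Redis state store; polygons are lists of (lat, lng) vertices, and the fence name is illustrative:

```python
def point_in_polygon(lat, lng, polygon):
    """Ray casting: cast a ray east from the point, count edge crossings."""
    inside = False
    for i in range(len(polygon)):
        lat1, lng1 = polygon[i]
        lat2, lng2 = polygon[(i + 1) % len(polygon)]
        if (lat1 > lat) != (lat2 > lat):  # edge spans the point's latitude
            cross_lng = lng1 + (lat - lat1) * (lng2 - lng1) / (lat2 - lat1)
            if lng < cross_lng:
                inside = not inside
    return inside

def bounding_box(polygon):
    lats, lngs = [p[0] for p in polygon], [p[1] for p in polygon]
    return min(lats), max(lats), min(lngs), max(lngs)

state = {}  # (entity_id, fence_id) -> last known inside status (Redis in prod)

def process_update(entity_id, lat, lng, fences):
    """fences: fence_id -> (polygon, bbox). Returns ENTER/EXIT events."""
    # Fast pre-filter: fences whose bounding box contains the point...
    candidates = {
        fid for fid, (_, (lo_lat, hi_lat, lo_lng, hi_lng)) in fences.items()
        if lo_lat <= lat <= hi_lat and lo_lng <= lng <= hi_lng
    }
    # ...plus fences the entity was last inside, so EXIT still fires
    # when the point leaves the bounding box entirely.
    candidates |= {fid for (eid, fid), inside in state.items()
                   if eid == entity_id and inside}
    events = []
    for fid in candidates:
        now_inside = point_in_polygon(lat, lng, fences[fid][0])
        was_inside = state.get((entity_id, fid), False)
        if now_inside != was_inside:
            events.append(("ENTER" if now_inside else "EXIT", fid))
        state[(entity_id, fid)] = now_inside
    return events

midtown = [(40.74, -74.00), (40.76, -74.00), (40.76, -73.98), (40.74, -73.98)]
fences = {"midtown": (midtown, bounding_box(midtown))}
print(process_update("u1", 40.75, -73.99, fences))  # fires ENTER
print(process_update("u1", 40.70, -73.99, fences))  # fires EXIT
```

Note the second candidate set: without re-checking fences the entity was last inside, a point that jumps out of the bounding box would never generate its EXIT event.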
Location History and Privacy
Storing fine-grained location history (every GPS update) creates privacy risks and storage costs.
- Storage architecture: raw location stream → Kafka → location consumer, which (1) updates the current location in Redis (TTL = 30s, for real-time features), (2) downsamples to 1 update per minute and writes to a time-series table (TimescaleDB or InfluxDB), and (3) summarizes to “visited places” (significant location stays, anonymized) for long-term retention.
- Data retention: raw 1-minute samples for 7 days; summarized visit data for 1 year; aggregated/anonymized analytics indefinitely.
- GDPR compliance: users can request deletion of all location data. Implement a “right to erasure” pipeline that deletes raw samples, summarized data, and cached locations within 72 hours.
- Location data is PII: encrypt at rest and restrict access by role.
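The consumer’s first two steps can be sketched with in-memory dicts standing in for Redis and the time-series store; update_location() is a hypothetical handler invoked once per GPS sample arriving from Kafka:

```python
import time

DOWNSAMPLE_SECS = 60

current_location = {}  # user_id -> (lat, lng, ts); Redis with TTL=30s in prod
last_written = {}      # user_id -> timestamp of last time-series write
timeseries = []        # downsampled rows; TimescaleDB/InfluxDB in prod

def update_location(user_id, lat, lng, ts=None):
    ts = time.time() if ts is None else ts
    # (1) always refresh the real-time current location
    current_location[user_id] = (lat, lng, ts)
    # (2) write at most one sample per minute to the time-series store
    if user_id not in last_written or ts - last_written[user_id] >= DOWNSAMPLE_SECS:
        timeseries.append((user_id, lat, lng, ts))
        last_written[user_id] = ts

# Five samples 20s apart -> only two time-series rows (t=0 and t=60),
# while the current location always reflects the latest sample
for t in (0, 20, 40, 60, 80):
    update_location("u1", 40.7484, -73.9857, ts=t)
```

Keeping the real-time path (step 1) unconditional while gating only the durable write (step 2) means live features never see stale data, while storage grows at a bounded 1 row/user/minute.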
See also: Uber Interview Prep
See also: DoorDash Interview Prep
See also: Lyft Interview Prep