Problem Statement
Design a search autocomplete (typeahead) system like Google Search suggestions or Amazon product search. As a user types, the system returns the top 5 completions ranked by query frequency/relevance within 100ms. Handle 100M unique queries per day, billions of historical queries for ranking, and real-time updates to suggestions.
Core Data Structure: Trie
from dataclasses import dataclass, field
import heapq

@dataclass
class TrieNode:
    children: dict = field(default_factory=dict)  # char -> TrieNode
    is_end: bool = False
    frequency: int = 0  # total count of the complete query ending at this node

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str, frequency: int = 1):
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
        node.is_end = True
        node.frequency += frequency

    def get_suggestions(self, prefix: str, top_k: int = 5) -> list[tuple[int, str]]:
        """Return top_k (frequency, word) tuples for the given prefix."""
        node = self.root
        for char in prefix:
            if char not in node.children:
                return []  # Prefix not found
            node = node.children[char]
        # DFS from the prefix node to collect all completions
        results = []
        self._dfs(node, prefix, results)
        # Return the top-k by frequency (nlargest keeps a size-k heap internally)
        return heapq.nlargest(top_k, results, key=lambda x: x[0])

    def _dfs(self, node: TrieNode, current: str, results: list):
        if node.is_end:
            results.append((node.frequency, current))
        for char, child in node.children.items():
            self._dfs(child, current + char, results)
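A quick end-to-end check of the naive trie (restated compactly here so the snippet runs standalone; the queries and counts are made-up sample data):

```python
from dataclasses import dataclass, field
import heapq

@dataclass
class TrieNode:
    children: dict = field(default_factory=dict)
    is_end: bool = False
    frequency: int = 0

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str, frequency: int = 1):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_end = True
        node.frequency += frequency

    def get_suggestions(self, prefix: str, top_k: int = 5):
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return []
            node = node.children[ch]
        # Iterative DFS over the subtree under the prefix node
        results, stack = [], [(node, prefix)]
        while stack:
            n, word = stack.pop()
            if n.is_end:
                results.append((n.frequency, word))
            for ch, child in n.children.items():
                stack.append((child, word + ch))
        return heapq.nlargest(top_k, results, key=lambda x: x[0])

trie = Trie()
for q, f in [("cat", 50), ("car", 30), ("card", 20), ("care", 10), ("dog", 5)]:
    trie.insert(q, f)

print(trie.get_suggestions("ca"))
# -> [(50, 'cat'), (30, 'car'), (20, 'card'), (10, 'care')]
```

Note the cost: answering "ca" walked every completion under that prefix, which is exactly the scaling problem the next section fixes.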
Problem with the naive trie: DFS from the prefix node visits every word sharing that prefix, so a short prefix like "a" can touch millions of nodes. With billions of historical queries this blows the 100ms budget. Solution: cache the top-k completions at each trie node.
Optimized Trie with Cached Top-K
@dataclass
class OptimizedTrieNode:
    children: dict = field(default_factory=dict)
    # Cache top-5 completions at this node: [(frequency, word), ...]
    top_k: list = field(default_factory=list)
    TOP_K = 5  # class-level constant, not a dataclass field

class OptimizedTrie:
    def __init__(self):
        self.root = OptimizedTrieNode()

    def insert(self, word: str, frequency: int):
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = OptimizedTrieNode()
            node = node.children[char]
            # Update the cached top-k at every prefix node along the path
            self._update_top_k(node, word, frequency)

    def _update_top_k(self, node: OptimizedTrieNode,
                      word: str, frequency: int):
        # Drop any stale entry for this word, add the new count, trim to TOP_K
        node.top_k = [(f, w) for f, w in node.top_k if w != word]
        node.top_k.append((frequency, word))
        node.top_k = heapq.nlargest(
            OptimizedTrieNode.TOP_K, node.top_k, key=lambda x: x[0]
        )

    def get_suggestions(self, prefix: str) -> list[str]:
        node = self.root
        for char in prefix:
            if char not in node.children:
                return []
            node = node.children[char]
        return [word for _, word in node.top_k]
def get_suggestions(self, prefix: str) -> list[str]:
node = self.root
for char in prefix:
if char not in node.children:
return []
node = node.children[char]
return [word for _, word in node.top_k]
# get_suggestions is now O(|prefix|) — just traverse the trie path
# No DFS needed — suggestions pre-cached at each node
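A short demonstration of the cached-top-k behavior (the class is restated compactly so this runs standalone; the sixth query is evicted from the cache, and re-inserting a word replaces its old count rather than duplicating it):

```python
from dataclasses import dataclass, field
import heapq

TOP_K = 5

@dataclass
class OptimizedTrieNode:
    children: dict = field(default_factory=dict)
    top_k: list = field(default_factory=list)  # [(frequency, word), ...]

class OptimizedTrie:
    def __init__(self):
        self.root = OptimizedTrieNode()

    def insert(self, word: str, frequency: int):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, OptimizedTrieNode())
            # Replace any stale entry for this word, then keep the 5 largest
            node.top_k = [(f, w) for f, w in node.top_k if w != word]
            node.top_k.append((frequency, word))
            node.top_k = heapq.nlargest(TOP_K, node.top_k, key=lambda x: x[0])

    def get_suggestions(self, prefix: str) -> list[str]:
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return []
            node = node.children[ch]
        return [w for _, w in node.top_k]

trie = OptimizedTrie()
for q, f in [("cat", 50), ("car", 40), ("card", 30),
             ("care", 20), ("cab", 10), ("cap", 5)]:
    trie.insert(q, f)        # "cap" (freq 5) is evicted from the "ca" cache
trie.insert("cap", 100)      # re-inserting updates the count, no duplicate

print(trie.get_suggestions("ca"))
# -> ['cap', 'cat', 'car', 'card', 'care']
```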
Distributed Architecture
A single trie can’t fit all queries for a global service. Distributed approach:
Data Collection Pipeline
import time

# `kafka` is assumed to be a producer client configured elsewhere.

class QueryLogger:
    """Logs search queries for aggregation."""

    def log_query(self, query: str, user_id: str):
        # Append to Kafka topic: autocomplete.raw_queries
        kafka.produce("autocomplete.raw_queries", {
            "query": query.lower().strip(),
            "user_id": user_id,
            "timestamp": time.time(),
        })
# Offline aggregation (Apache Spark, daily batch):
# 1. Read raw_queries from Kafka / data lake
# 2. Count frequency of each query in the last 7 days (recency-weighted)
# 3. Filter: remove profanity, spam, low-frequency queries (< 10 occurrences)
# 4. Build trie from aggregated (query, frequency) pairs
# 5. Serialize trie and distribute to autocomplete servers
# Recency weighting: queries from today count more than queries from 7 days ago
# weight = frequency * (1 / days_ago_factor)
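The text only specifies that recent queries should count more; one concrete choice for `days_ago_factor` (an assumption, not the only option) is exponential decay with a tunable half-life:

```python
from collections import defaultdict

HALF_LIFE_DAYS = 3.0  # assumed tuning knob: a query's weight halves every 3 days

def recency_weight(days_ago: float) -> float:
    """Exponential decay: 1.0 today, 0.5 after one half-life, and so on."""
    return 0.5 ** (days_ago / HALF_LIFE_DAYS)

def aggregate(daily_counts):
    """daily_counts: iterable of (query, days_ago, count) rows from the raw log."""
    weighted = defaultdict(float)
    for query, days_ago, count in daily_counts:
        weighted[query] += count * recency_weight(days_ago)
    return dict(weighted)

# Sample rows: 100 searches today beat 100 searches from six days ago
rows = [("taylor swift", 0, 100), ("taylor swift", 6, 100), ("weather", 3, 150)]
scores = aggregate(rows)
# scores["taylor swift"] == 100*1.0 + 100*0.25 = 125.0
# scores["weather"]      == 150*0.5            = 75.0
```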
Trie Partitioning
Partition the trie by prefix to distribute across servers:
- Server 1: a-m (all queries starting with letters a through m)
- Server 2: n-z (all queries starting with n through z)
- A ZooKeeper-backed config service maps first letter → server address
- API Gateway routes query to the correct server based on the first character of the prefix
Read Path
import json
import redis

class AutocompleteService:
    def __init__(self, trie_shards: dict, cache: redis.Redis):
        self.trie_shards = trie_shards  # first_char -> TrieServer
        self.cache = cache

    def get_suggestions(self, prefix: str) -> list[str]:
        prefix = prefix.lower().strip()
        if not prefix:
            return []
        # 1. Cache check (Redis, TTL=60s)
        cache_key = f"ac:{prefix}"
        cached = self.cache.get(cache_key)
        if cached:
            return json.loads(cached)
        # 2. Route to the appropriate trie shard
        shard = self.trie_shards.get(prefix[0], self.trie_shards['default'])
        suggestions = shard.get_suggestions(prefix)
        # 3. Cache the result
        self.cache.setex(cache_key, 60, json.dumps(suggestions))
        return suggestions
Real-Time Updates
The daily batch is too slow for trending queries (a celebrity news event should surface in minutes). Two-tier approach:
- Base trie: Built daily from 7 days of historical data. Stable, accurate frequency counts.
- Realtime layer: Kafka Streams or Flink consumes the raw query stream in real-time, maintaining a rolling 1-hour frequency count in Redis. Results merged with base trie at query time:
merged_score = base_freq + alpha * realtime_freq.
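The merge can be sketched as follows (`ALPHA` and the sample queries are assumptions for illustration; alpha up-weights the last hour's counts so a trending query can outrank stable historical ones):

```python
ALPHA = 4.0  # assumed boost for realtime counts; tuned per product

def merged_score(base_freq: int, realtime_freq: int, alpha: float = ALPHA) -> float:
    return base_freq + alpha * realtime_freq

def merge_suggestions(base: dict, realtime: dict, top_k: int = 5) -> list[str]:
    """base/realtime: query -> frequency maps; returns top_k queries by merged score."""
    scores = {
        q: merged_score(base.get(q, 0), realtime.get(q, 0))
        for q in set(base) | set(realtime)
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

base = {"oscars 2024": 500, "oscar wilde": 300}   # from the daily batch trie
realtime = {"oscars red carpet": 90}              # trending in the last hour
print(merge_suggestions(base, realtime))
# "oscars red carpet" scores 4.0*90 = 360, outranking "oscar wilde" (300)
```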
Filtering and Personalization
import re

class SuggestionFilter:
    def __init__(self, profanity_list: set, spam_patterns: list):
        self.blocked_words = profanity_list
        self.spam_patterns = [re.compile(p) for p in spam_patterns]

    def filter(self, suggestions: list[str],
               user_history: list[str]) -> list[str]:
        # 1. Remove blocked or spammy content
        clean = [s for s in suggestions
                 if not any(w in s for w in self.blocked_words)
                 and not any(p.search(s) for p in self.spam_patterns)]
        # 2. Boost suggestions that match the user's search history
        def score(suggestion: str) -> float:
            base = 1.0
            if suggestion in user_history:
                base *= 1.5  # Personalization boost
            return base
        return sorted(clean, key=score, reverse=True)[:5]
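A minimal usage sketch of the filter-and-boost step (the class is restated in reduced form, without spam patterns, so the snippet runs standalone; the word lists are made up):

```python
class SuggestionFilter:
    def __init__(self, blocked_words: set):
        self.blocked_words = blocked_words

    def filter(self, suggestions: list[str], user_history: list[str]) -> list[str]:
        # Drop anything containing a blocked word
        clean = [s for s in suggestions
                 if not any(w in s for w in self.blocked_words)]
        # Boost suggestions seen in the user's own history
        def score(s: str) -> float:
            return 1.5 if s in user_history else 1.0
        return sorted(clean, key=score, reverse=True)[:5]

f = SuggestionFilter(blocked_words={"spamword"})
ranked = f.filter(
    ["python tutorial", "spamword site", "python jobs"],
    user_history=["python jobs"],
)
print(ranked)  # -> ['python jobs', 'python tutorial']
```

Note that `sorted` is stable, so suggestions with equal scores keep their original (frequency-based) order from the trie.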
Interview Checklist
- Data structure: trie with top-k cached at each node for O(|prefix|) query time
- Scale: partition trie by first character across multiple servers
- Ranking: query frequency, recency weighting, personalization
- Pipeline: daily batch Spark job for base trie + real-time Flink for trending
- Cache: Redis at query layer (TTL 60s), reduces trie server load
- Filtering: profanity filter, spam detection before serving suggestions
- Latency budget: 100ms total — 5ms cache hit, 20ms trie lookup on miss