Requirements
A typeahead service returns query suggestions as the user types. Google Search autocomplete, Twitter’s hashtag completion, and Stripe’s dashboard search are examples. Requirements: return the top 5-10 suggestions for a given prefix within 100ms; rank suggestions by a combination of popularity and personalization; reflect trending queries within minutes. At Google scale: 5 billion searches per day, 10+ keystrokes per query = 50 billion autocomplete requests per day (roughly 600K requests per second on average, well over a million at peak). Even at medium scale (50M daily active users), keystroke-driven traffic reaches tens of thousands of requests per second.
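A quick back-of-the-envelope check of those rates (the per-user and per-query figures are assumptions taken from the requirements, not measurements):

```python
# Back-of-the-envelope request-rate check; inputs are assumptions from the
# requirements above, not measured values.
SECONDS_PER_DAY = 86_400

def avg_qps(requests_per_day: float) -> float:
    """Average requests per second for a given daily volume."""
    return requests_per_day / SECONDS_PER_DAY

# Google scale: 5B searches/day, ~10 keystrokes (autocomplete requests) per query.
google_per_day = 5e9 * 10
print(f"Google scale: {avg_qps(google_per_day):,.0f} QPS average")   # ~578,700

# Medium scale: 50M DAU, assuming ~10 searches/user/day and ~10 keystrokes/search.
medium_per_day = 50e6 * 10 * 10
print(f"Medium scale: {avg_qps(medium_per_day):,.0f} QPS average")   # ~57,870
```

Peak traffic typically runs 2-3x the average, which is where the million-plus QPS figure at Google scale comes from.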
Data Structure: Trie with Top-K Precomputation
A trie maps prefixes to matching completions. Naive trie: each leaf is a query string with a frequency count. Finding the top-K for a prefix requires DFS over the entire subtree — O(subtree size). Too slow for the serving path. Optimization: at each trie node, precompute and store the top K most popular completions passing through that node. Serving is then O(prefix length) — just traverse to the prefix node and return its stored top-K list. Tradeoff: inserting or updating a query’s frequency requires updating the top-K list at every ancestor node — O(L * K) per update, where L = query length and K = list size. This is acceptable because updates happen asynchronously (not in the serving path). The trie is built offline and served from memory. In-memory trie for 10M unique queries * avg 30 chars * ~40 bytes per node: ~12GB (an upper bound, since shared prefixes reduce the node count) — fits in a large server or can be partitioned across several.
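A minimal sketch of the structure, assuming a plain dict-based Python trie (names like `TrieNode`, `update`, and `suggest` are illustrative, not from any library):

```python
import heapq
from dataclasses import dataclass, field

K = 10  # completions kept per node

@dataclass
class TrieNode:
    children: dict = field(default_factory=dict)
    top_k: list = field(default_factory=list)   # (frequency, query) pairs

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def update(self, query: str, freq: int) -> None:
        """Async index path: O(len(query) * K), since every ancestor's
        top-K list is refreshed."""
        node = self.root
        self._merge(node, query, freq)              # empty prefix
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())
            self._merge(node, query, freq)

    @staticmethod
    def _merge(node: TrieNode, query: str, freq: int) -> None:
        # Drop any stale entry for this query, add the new count, keep the top K.
        node.top_k = [(f, q) for f, q in node.top_k if q != query]
        node.top_k.append((freq, query))
        node.top_k = heapq.nlargest(K, node.top_k)

    def suggest(self, prefix: str) -> list[str]:
        """Serving path: O(len(prefix)) traversal, then return the stored list."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        return [q for _, q in node.top_k]

trie = Trie()
trie.update("apple", 120)
trie.update("app store", 300)
print(trie.suggest("ap"))   # ['app store', 'apple']
```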
Architecture
Two paths: serving (real-time, read-only) and index building (async, write). Serving path: client types a character → request hits the autocomplete API → API queries the in-memory trie service → returns top-K suggestions in < 5ms. Client-side debouncing: send requests at most every 100ms (not on every keystroke). Client-side caching: cache results per prefix for 60 seconds. Together these two optimizations typically cut actual server requests by 80-90%. Index building path: query logs → batch aggregation job (Spark, hourly) → recompute the prefix-to-top-K mapping → serialize new trie → atomic swap (write the new trie to a new memory region, then swap the pointer). Zero-downtime updates: the old trie continues serving until the swap. Personalization: blend global popularity (80% weight) with the user's past queries (20% weight) to rerank the global top-K before returning to the client. Personalization happens at the API layer, not in the trie itself.
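Two pieces of that path sketched in code, assuming the trie interface above: an atomically swappable trie reference for zero-downtime index updates, and the API-layer rerank that blends global and personal scores. The 0.8/0.2 weights and helper names are illustrative, and the rerank assumes global and personal scores have been normalized to comparable ranges.

```python
import threading

class TrieService:
    """Holds the current in-memory trie; index builds swap in a replacement."""
    def __init__(self, trie):
        self._trie = trie
        self._lock = threading.Lock()

    def swap(self, new_trie) -> None:
        # The new trie is built fully off to the side; only the reference
        # changes here, so in-flight reads keep using the old trie.
        with self._lock:
            self._trie = new_trie

    def current(self):
        return self._trie

def personalize(global_top_k: list[tuple[str, float]],
                user_history: dict[str, float],
                k: int = 10,
                global_weight: float = 0.8,
                personal_weight: float = 0.2) -> list[str]:
    """Rerank the global top-K using the user's own past-query scores."""
    def score(item):
        query, global_score = item
        return global_weight * global_score + personal_weight * user_history.get(query, 0.0)
    return [q for q, _ in sorted(global_top_k, key=score, reverse=True)][:k]
```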
Trie Partitioning and Scale
A single trie server can handle hundreds of thousands of QPS (reads from memory are fast). For Google scale: partition the trie by prefix range. Shard by the first character or first two characters (26 or 676 shards). Each shard handles queries starting with its assigned prefixes. Query routing: the API layer routes based on the query prefix; the mapping is deterministic, so the prefix “ap” always goes to the same shard (via a static range map or consistent hashing on the prefix bucket). Replication: each shard has 3 replicas (active-active). Reads are load-balanced; writes go to all replicas. Hotspot: common prefixes like “a”, “b”, “the” are extremely hot. Solution: assign more servers to hot prefix ranges, or add a result cache (Redis) in front of the trie for the top 10,000 most frequent prefixes (covers ~60% of traffic via Zipf’s law).
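A sketch of the routing and hot-prefix cache at the API layer (single-character sharding for brevity; two-character sharding follows the same idea, and `cache` stands in for any get/set store such as a Redis wrapper):

```python
import string

NUM_SHARDS = 26  # one shard per first character; use two characters (676) at larger scale

def shard_for(prefix: str) -> int:
    """Deterministic routing: the same prefix always maps to the same shard."""
    first = prefix[0].lower()
    # Non-alphabetic first characters fall into shard 0 as a catch-all.
    return string.ascii_lowercase.index(first) if first in string.ascii_lowercase else 0

def lookup(prefix: str, cache, shards) -> list[str]:
    # Result cache in front of the trie for the hottest prefixes (Zipf's law).
    cached = cache.get(prefix)
    if cached is not None:
        return cached
    result = shards[shard_for(prefix)].suggest(prefix)
    cache.set(prefix, result, ttl=60)
    return result
```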
Trending Query Updates
A query that goes viral (breaking news, a meme) should appear in autocomplete within minutes, not hours. Real-time pipeline: query stream → Kafka → Flink sliding window (count queries per prefix per 5-minute window) → merge with historical counts → update trie. Sliding window merging: new_count = historical_count * decay_factor + realtime_count. Decay factor (e.g., 0.95 per hour) ensures recent queries get higher weight than old ones. Trie update is still async (not in serving path) but now refreshes every 5 minutes instead of hourly. Safety: new queries don’t appear until they’ve accumulated enough frequency (minimum count threshold) to avoid surfacing rare/offensive one-off queries. Blocklist: maintain a blocklist of phrases that should never appear in autocomplete (profanity, dangerous content). Applied as a filter layer after trie lookup, before returning to the client.
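A sketch of the count merge and the safety filters (the decay rate, minimum threshold, and blocklist contents are placeholder values):

```python
DECAY_PER_HOUR = 0.95   # historical counts lose ~5% weight per hour
MIN_COUNT = 50          # don't surface queries below this combined count
BLOCKLIST = {"example banned phrase"}

def merged_count(historical: float, realtime: int, hours_since_rebuild: float) -> float:
    """new_count = historical_count * decay_factor + realtime_count."""
    return historical * (DECAY_PER_HOUR ** hours_since_rebuild) + realtime

def safe_suggestions(candidates: list[tuple[str, float]]) -> list[str]:
    """Applied as a filter after trie lookup, before returning results to the client."""
    return [query for query, count in candidates
            if count >= MIN_COUNT and query.lower() not in BLOCKLIST]
```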
Interview Tips
- Client-side caching and debouncing are essential — mention them early to show you understand the full system, not just the backend.
- Trie with precomputed top-K per node is the standard answer; know the tradeoff with naive DFS.
- Separate the serving path (read-only, in-memory, < 5ms) from the index-building path (async, batch or stream, minutes latency).
Asked at: LinkedIn, Twitter/X, Snap, Shopify