Question 1

When should you use a Trie instead of a hash set for string problems?

Accepted Answer

A hash set answers "does this exact string exist?" in O(L) average time, where L is the string length (the cost of hashing the string). A Trie answers the same query in O(L) and also answers prefix queries: "does any word start with this prefix?" and "find all words with this prefix." A hash set has no prefix query capability: you would have to scan every stored string, which costs O(N * L) for N strings.

Use a Trie when:
(1) Prefix queries are needed (autocomplete, IP routing, spell check).
(2) Many words must be matched in a single pass over the input (Word Search II: a Trie of all target words lets the DFS stop early as soon as the current path is not a prefix of any word).
(3) Words must be found by wildcard pattern (a Trie DFS that branches into every child at a "." position and backtracks on failure).

Use a hash set when:
(1) Only exact lookups are needed.
(2) Memory is a concern (a hash set stores each string compactly; a Trie uses one node per character, each with its own child pointers).

Memory comparison: for 1000 words averaging 5 characters, a hash set stores about 5000 characters plus per-entry overhead, while a Trie holds up to 5000 nodes (fewer when words share prefixes), each carrying a child dict. The Trie usually takes more memory, but in exchange it checks whether a prefix exists in O(prefix_length).
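The contrast above can be sketched in a few lines of Python. This is a minimal illustration, not a tuned implementation: the names TrieNode, Trie, contains, starts_with, and words_with_prefix are assumptions chosen for this example, and children are kept in a plain dict.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # char -> TrieNode
        self.is_end = False   # True if a stored word ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_end = True

    def _walk(self, s):
        """Follow s from the root; return the final node, or None if the path breaks."""
        node = self.root
        for ch in s:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains(self, word):
        """Exact lookup, O(len(word))."""
        node = self._walk(word)
        return node is not None and node.is_end

    def starts_with(self, prefix):
        """Prefix existence check, O(len(prefix))."""
        return self._walk(prefix) is not None

    def words_with_prefix(self, prefix):
        """Collect every stored word beginning with prefix (iterative DFS)."""
        start = self._walk(prefix)
        if start is None:
            return []
        out, stack = [], [(start, prefix)]
        while stack:
            node, path = stack.pop()
            if node.is_end:
                out.append(path)
            for ch, child in node.children.items():
                stack.append((child, path + ch))
        return sorted(out)


words = ["car", "card", "care", "dog"]
trie = Trie()
for w in words:
    trie.insert(w)

# A hash set handles exact lookups, but a prefix query needs a full scan.
word_set = set(words)
set_prefix_matches = sorted(w for w in word_set if w.startswith("car"))
trie_prefix_matches = trie.words_with_prefix("car")
```

Both structures agree on the answer; the difference is that the set version touches every stored string, while the Trie version only touches nodes under the "car" prefix.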

Question 2

How do you implement autocomplete with O(prefix_length) lookup using a Trie?

Accepted Answer

Standard Trie: traverse to the prefix node in O(L), where L is the prefix length, then DFS the entire subtree below it to collect all words. That DFS visits every node under the prefix, so it costs at least O(N) for N matching words and is slow for large dictionaries.

Optimization: cache the top-K most popular words at every Trie node. Each node stores a list of the top-K (e.g., 10) most frequently searched completions that pass through it.

Insert: O(L * K log K) for a word of length L, because the top-K list is updated at each of the L nodes on the word's path.
Lookup: O(L) for a prefix of length L: traverse to the prefix node and return its cached top-K list.

The cache at each node is built by propagating word frequencies to every node on the word's path, either during insertion or during an offline build phase. Update frequency: when a user selects a suggestion, increment its count and update the top-K lists along its path if its rank changed. In practice, frequencies are usually applied in batch (for example, a daily rebuild) rather than on every selection, since a rebuild recomputes every ancestor's list from consistent counts.

This gives O(L) autocomplete at the cost of O(M * K) extra memory, where M is the number of Trie nodes (at most the total number of characters across the vocabulary).
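A minimal sketch of the cached top-K idea, assuming frequencies only ever increase (under that assumption the per-node lists stay exact without a rebuild). The class and method names (AutocompleteTrie, add, suggest) and the alphabetical tie-break are choices made for this example, not part of the original answer.

```python
class Node:
    def __init__(self):
        self.children = {}   # char -> Node
        self.top = []        # cached completions, best first, at most k entries


class AutocompleteTrie:
    def __init__(self, k=3):
        self.k = k
        self.root = Node()
        self.freq = {}       # word -> total frequency

    def add(self, word, count=1):
        """Add count to the word's frequency and refresh the cache on each path node."""
        self.freq[word] = self.freq.get(word, 0) + count
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = Node()
            node = node.children[ch]
            self._refresh(node, word)

    def _refresh(self, node, word):
        # The list holds at most k + 1 items here, so this sort is O(K log K).
        if word not in node.top:
            node.top.append(word)
        # Highest frequency first; ties broken alphabetically.
        node.top.sort(key=lambda w: (-self.freq[w], w))
        del node.top[self.k:]

    def suggest(self, prefix):
        """O(len(prefix)): walk to the prefix node and return its cached list."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        return list(node.top)


ac = AutocompleteTrie(k=2)
ac.add("car", 5)
ac.add("card", 3)
ac.add("care", 4)
ac.add("cat", 1)
before = ac.suggest("ca")   # top 2 by frequency under "ca"
ac.add("card", 5)           # "card" now has frequency 8
after = ac.suggest("ca")
```

Note that suggest never walks the subtree: all of the ranking work happens in add. If counts could also decrease, a word evicted from a node's list might need to come back, and a batch rebuild would be the simpler way to stay correct.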

Question 3

How does a Trie speed up Word Search II compared to checking each word individually?

Accepted Answer

Word Search II: given a board and a list of words, find all words present in the board.

Naive approach: for each word, run a DFS from every cell: O(W * R * C * 4^L), where W = number of words, R * C = board size, and L = maximum word length. For 100 words of 10 characters on a 10x10 board: 100 * 100 * 4^10 ≈ 10^10, about 10 billion operations in the worst case.

Trie optimization: build a Trie of all target words first, which costs O(W * L). Then run one DFS from each cell, walking the Trie in step with the board. At each DFS step, move to the child node for the current board character. If no child exists, prune immediately: no target word starts with this path. If node.is_end is set, record the word as found. This single DFS over the board checks all words simultaneously, and pruning cuts every branch where no word can match.

Effective complexity: O(R * C * 4^L) worst case for the search, independent of W, because the Trie shares the work across all words. Additional optimization: remove a word from the Trie once it is found (clear its end marker and drop nodes left without children). This avoids reporting the same word again from a different starting cell and shrinks the Trie as the search proceeds.
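The Trie-guided DFS can be sketched as follows. This is an illustrative version under a few assumptions: the board is a non-empty list of lists of lowercase letters, "#" is used as a temporary visited marker, and each node stores the full word in a hypothetical word attribute that plays the role of is_end. The function name find_words is also an assumption.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # char -> TrieNode
        self.word = None     # full word ending here (stands in for is_end)


def build_trie(words):
    root = TrieNode()
    for w in words:
        node = root
        for ch in w:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.word = w
    return root


def find_words(board, words):
    root = build_trie(words)
    rows, cols = len(board), len(board[0])
    found = []

    def dfs(r, c, parent):
        ch = board[r][c]
        node = parent.children.get(ch)
        if node is None:
            return                      # prune: no word continues with ch
        if node.word is not None:
            found.append(node.word)
            node.word = None            # report each word only once
        board[r][c] = "#"               # mark the cell as used on this path
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and board[nr][nc] != "#":
                dfs(nr, nc, node)
        board[r][c] = ch                # restore the cell on backtrack
        if not node.children:
            del parent.children[ch]     # drop exhausted branches from the Trie

    for r in range(rows):
        for c in range(cols):
            dfs(r, c, root)
    return found


board = [
    ["o", "a", "a", "n"],
    ["e", "t", "a", "e"],
    ["i", "h", "k", "r"],
    ["i", "f", "l", "v"],
]
result = sorted(find_words(board, ["oath", "pea", "eat", "rain"]))
```

The board is mutated during the search but restored before find_words returns. Storing the word on the node avoids rebuilding the path string at every step.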

Trie (Prefix Tree) Interview Patterns: Autocomplete, Word Search, Wildcard


Basic Trie Implementation

Pattern 1: Autocomplete with Top-K Cache

Pattern 2: Word Search II (Multiple Words)

Pattern 3: Replace Words (Prefix Replacement)

Pattern 4: Design Add and Search Words (Wildcard)
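A minimal sketch of wildcard search, where "." matches any single character: the DFS follows one child for a literal character and tries every child at a ".", backtracking when a branch fails. The class and method names (WordDictionary, add_word, search), the nested-dict node layout, and the "$" end marker are assumptions for this example; it also assumes words contain only lowercase letters.

```python
class WordDictionary:
    END = "$"   # marker key; assumes "$" never appears in a word or pattern

    def __init__(self):
        self.root = {}   # nested dicts: char -> child dict

    def add_word(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node[self.END] = True

    def search(self, pattern):
        def dfs(node, i):
            if i == len(pattern):
                return self.END in node
            ch = pattern[i]
            if ch == ".":
                # Wildcard: try every child; any() stops at the first match.
                return any(
                    dfs(child, i + 1)
                    for key, child in node.items()
                    if key != self.END
                )
            child = node.get(ch)
            return child is not None and dfs(child, i + 1)

        return dfs(self.root, 0)


wd = WordDictionary()
for w in ("bad", "dad", "mad"):
    wd.add_word(w)
```

A pattern with no wildcards costs O(L). Each "." can fan out to every child, so a pattern made entirely of "." characters may visit every Trie node down to depth L in the worst case.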

When to Use a Trie

Interview Tips