Trie (Prefix Tree) Interview Patterns
A Trie (prefix tree) is a tree where each node represents a character in a string, and paths from root to leaf represent words. It enables prefix-based search in O(L) time (L = string length) regardless of dictionary size, making it ideal for autocomplete, spell checking, and IP routing.
Basic Trie Implementation
class TrieNode:
def __init__(self):
self.children = {} # char → TrieNode
self.is_end = False # marks end of a word
class Trie:
def __init__(self):
self.root = TrieNode()
def insert(self, word):
node = self.root
for ch in word:
if ch not in node.children:
node.children[ch] = TrieNode()
node = node.children[ch]
node.is_end = True
def search(self, word):
node = self.root
for ch in word:
if ch not in node.children: return False
node = node.children[ch]
return node.is_end
def starts_with(self, prefix):
node = self.root
for ch in prefix:
if ch not in node.children: return False
node = node.children[ch]
return True
Time: O(L) for insert, search, and starts_with (L = length of word/prefix). Space: O(total characters across all inserted words). The dict-based children supports arbitrary alphabets; use a fixed array [None]*26 for lowercase English only (faster lookups).
Pattern 1: Autocomplete with Top-K Cache
class AutocompleteNode:
def __init__(self):
self.children = {}
self.top_k = [] # cached top-K suggestions for this prefix
def update_top_k(self, word, score, k=10):
self.top_k.append((score, word))
self.top_k.sort(reverse=True)
self.top_k = self.top_k[:k]
def get_suggestions(root, prefix):
node = root
for ch in prefix:
if ch not in node.children: return []
node = node.children[ch]
return [word for _, word in node.top_k]
Each node caches the top-K most popular words with the given prefix. Lookup is O(L) — traverse to the prefix node, return cached results. Update on insert propagates scores up to ancestors, O(L * K log K).
Pattern 2: Word Search II (Multiple Words)
def find_words(board, words):
trie = Trie()
for w in words: trie.insert(w)
result = set()
rows, cols = len(board), len(board[0])
def dfs(node, r, c, path):
ch = board[r][c]
if ch not in node.children: return
next_node = node.children[ch]
path += ch
if next_node.is_end: result.add(path)
board[r][c] = '#' # mark visited
for dr, dc in [(0,1),(0,-1),(1,0),(-1,0)]:
nr, nc = r+dr, c+dc
if 0 <= nr < rows and 0 <= nc < cols and board[nr][nc] != '#':
dfs(next_node, nr, nc, path)
board[r][c] = ch # restore
for r in range(rows):
for c in range(cols):
dfs(trie.root, r, c, '')
return list(result)
Using a Trie instead of a set for word lookup lets us prune the DFS early: if the current path prefix is not in the Trie, stop immediately. This is significantly faster than checking each word individually for large word lists.
Pattern 3: Replace Words (Prefix Replacement)
def replace_words(dictionary, sentence):
trie = Trie()
for root_word in dictionary:
trie.insert(root_word)
def find_root(word):
node = trie.root
for i, ch in enumerate(word):
if node.is_end: return word[:i]
if ch not in node.children: return word
node = node.children[ch]
return word
return ' '.join(find_root(w) for w in sentence.split())
Pattern 4: Design Add and Search Words (Wildcard)
class WordDictionary:
def __init__(self): self.root = TrieNode()
def add_word(self, word):
node = self.root
for ch in word:
node.children.setdefault(ch, TrieNode())
node = node.children[ch]
node.is_end = True
def search(self, word):
def dfs(node, i):
if i == len(word): return node.is_end
ch = word[i]
if ch == '.': # wildcard: try all children
return any(dfs(child, i+1) for child in node.children.values())
if ch not in node.children: return False
return dfs(node.children[ch], i+1)
return dfs(self.root, 0)
When to Use a Trie
- Prefix queries: autocomplete, IP routing (longest prefix match)
- Multiple word search on a board (Word Search II)
- Prefix replacement or word segmentation
- Wildcard / regex matching against a dictionary
Interview Tips
- Trie beats hash set for prefix queries — hash set requires O(N) scan for prefix match.
- Top-K cache at each node trades memory for O(L) autocomplete lookups.
- Word Search II: the early pruning from Trie is the key optimization over brute force.
{
“@context”: “https://schema.org”,
“@type”: “FAQPage”,
“mainEntity”: [
{
“@type”: “Question”,
“name”: “When should you use a Trie instead of a hash set for string problems?”,
“acceptedAnswer”: { “@type”: “Answer”, “text”: “A hash set answers "does this exact string exist?" in O(L) amortized (L=string length for hashing). A Trie answers the same query in O(L) but also answers prefix queries: "does any word start with this prefix?" and "find all words with this prefix." Hash set has no prefix query capability — you would need to iterate all stored strings, O(N*L). Use Trie when: (1) prefix queries are needed (autocomplete, IP routing, spell check). (2) Multiple words need to be searched simultaneously in a structure (Word Search II: building a Trie of all target words enables early DFS pruning when the current path prefix matches no word). (3) You need to find words by wildcard patterns (Trie DFS with backtracking at "." nodes). Hash set when: (1) only exact lookups needed. (2) Memory is a concern (hash set stores strings compactly; Trie uses a node per character with child pointers). Memory comparison: 1000 words of average 5 chars — hash set: ~5000 chars; Trie: up to 5000 nodes with child dicts — more memory but enables O(prefix_length) prefix queries.” }
},
{
“@type”: “Question”,
“name”: “How do you implement autocomplete with O(prefix_length) lookup using a Trie?”,
“acceptedAnswer”: { “@type”: “Answer”, “text”: “Standard Trie: traverse to the prefix node in O(L), then DFS the entire subtree to collect all words — O(N) for N words with that prefix. For large dictionaries, this is slow. Optimization: cache top-K popular words at every Trie node. Each node stores a list of the top-K (e.g., 10) most frequently searched completions that pass through it. Insert: O(L * K log K) — update the top-K list at each of the L ancestor nodes. Lookup: O(L) — traverse to the prefix node, return its cached top-K list. The cache at each node is built by propagating word frequencies up to ancestors during insertion or during an offline build phase. Update frequency: when a user selects a suggestion, increment its count and potentially update the top-K lists. In practice, frequencies are updated in batch (daily rebuild) rather than on every selection, since the rebuild propagates correctly through all ancestors. This gives O(L) autocomplete at the cost of O(V * K) extra memory (V = total vocabulary size).” }
},
{
“@type”: “Question”,
“name”: “How does a Trie speed up Word Search II compared to checking each word individually?”,
“acceptedAnswer”: { “@type”: “Answer”, “text”: “Word Search II: given a board and a list of words, find all words present in the board. Naive approach: for each word, run a DFS from every cell — O(W * R * C * 4^L) where W=words, R*C=board size, L=max word length. For 100 words and a 10×10 board with 10-character words: 100 * 100 * 4^10 ≈ 10 trillion operations. Trie optimization: build a Trie of all target words first. Run DFS from each cell once, traversing the Trie simultaneously. At each DFS step, move to the child node corresponding to the current board character. If no child exists: prune immediately — no target word starts with this path prefix. If node.is_end: record the word as found. This single DFS over the board checks all words simultaneously. Pruning eliminates branches where no word can match. Effective complexity: O(R * C * 4^L) regardless of W (number of words) — the Trie amortizes the work across all words. Additional optimization: delete words from the Trie after finding them to avoid re-finding the same word from a different starting cell.” }
}
]
}