Trie and String Matching Interview Patterns: Autocomplete, KMP, Rabin-Karp

Trie and String Matching Interview Patterns

Tries (prefix trees) and string matching algorithms are a distinct category in coding interviews. They appear in autocomplete, spell checker, word search, and IP routing problems. This guide covers the Trie data structure, its key operations, and the classic string matching algorithms you need to know.

Trie Data Structure

A Trie stores strings by their prefixes. Each node represents one character; the path from root to a node spells out a prefix.

class TrieNode:
    def __init__(self):
        self.children = {}   # char -> TrieNode
        self.is_end = False  # marks end of a word

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_end = True

    def search(self, word: str) -> bool:
        node = self.root
        for ch in word:
            if ch not in node.children: return False
            node = node.children[ch]
        return node.is_end

    def starts_with(self, prefix: str) -> bool:
        node = self.root
        for ch in prefix:
            if ch not in node.children: return False
            node = node.children[ch]
        return True

Time: O(m) for insert/search/prefix, where m = word length. Space: O(total characters across all words).

Autocomplete (Design Search Autocomplete System — LeetCode 642)

Each TrieNode stores the top-3 matching sentences for its prefix:

class AutocompleteSystem:
    def __init__(self, sentences, times):
        self.root = TrieNode()
        self.curr_input = []
        for s, t in zip(sentences, times):
            self._insert(s, t)

    def _insert(self, s, count):
        node = self.root
        for ch in s:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
            # keep top 3 by count at each node
            node.top3 = sorted(node.top3 + [(count, s)],
                               key=lambda x: (-x[0], x[1]))[:3]
        node.is_end = True

    def input(self, c):
        if c == '#':
            self._insert(''.join(self.curr_input), 1)
            self.curr_input = []
            return []
        self.curr_input.append(c)
        node = self.root
        for ch in self.curr_input:
            if ch not in node.children: return []
            node = node.children[ch]
        return [s for _, s in node.top3]

Word Search II (LeetCode 212) — Trie + Backtracking

Find all words from a dictionary that exist in a 2D grid. Build a Trie from the dictionary, then DFS from each cell:

def find_words(board, words):
    trie = Trie()
    for w in words: trie.insert(w)
    result = set()
    rows, cols = len(board), len(board[0])

    def dfs(node, r, c, path):
        if node.is_end: result.add(path)
        if not (0 <= r < rows and 0 <= c < cols): return
        ch = board[r][c]
        if ch not in node.children or ch == '#': return
        board[r][c] = '#'                  # mark visited
        next_node = node.children[ch]
        for dr, dc in [(0,1),(0,-1),(1,0),(-1,0)]:
            dfs(next_node, r+dr, c+dc, path+ch)
        board[r][c] = ch                   # restore

    for r in range(rows):
        for c in range(cols):
            dfs(trie.root, r, c, '')
    return list(result)
# Trie pruning avoids re-exploring dead-end prefixes

KMP (Knuth-Morris-Pratt) Pattern Matching

Find all occurrences of pattern P in text T in O(n+m) time (vs O(n*m) naive).

def kmp_search(text, pattern):
    def build_lps(pattern):
        # lps[i] = length of longest proper prefix of pattern[:i+1]
        # that is also a suffix
        lps = [0] * len(pattern)
        length, i = 0, 1
        while i < len(pattern):
            if pattern[i] == pattern[length]:
                length += 1
                lps[i] = length
                i += 1
            elif length:
                length = lps[length - 1]  # don't increment i
            else:
                lps[i] = 0
                i += 1
        return lps

    lps = build_lps(pattern)
    matches = []
    i = j = 0   # i = text index, j = pattern index
    while i < len(text):
        if text[i] == pattern[j]:
            i += 1; j += 1
        if j == len(pattern):
            matches.append(i - j)
            j = lps[j - 1]
        elif i < len(text) and text[i] != pattern[j]:
            if j: j = lps[j - 1]
            else: i += 1
    return matches

The LPS (Longest Proper Prefix which is also Suffix) array lets KMP skip redundant comparisons by reusing already-matched prefix information.

Rabin-Karp Rolling Hash

Use hashing to find pattern occurrences in O(n+m) average. Useful for multi-pattern search.

def rabin_karp(text, pattern):
    BASE, MOD = 26, 10**9 + 7
    n, m = len(text), len(pattern)
    if m > n: return []

    def char_val(c): return ord(c) - ord('a')

    # Compute hash of pattern and first window
    p_hash = t_hash = 0
    high = pow(BASE, m-1, MOD)
    for i in range(m):
        p_hash = (p_hash * BASE + char_val(pattern[i])) % MOD
        t_hash = (t_hash * BASE + char_val(text[i])) % MOD

    matches = []
    for i in range(n - m + 1):
        if t_hash == p_hash and text[i:i+m] == pattern:  # verify to avoid hash collision
            matches.append(i)
        if i < n - m:
            t_hash = (t_hash - char_val(text[i]) * high) % MOD
            t_hash = (t_hash * BASE + char_val(text[i+m])) % MOD
    return matches

IP Address Routing (Longest Prefix Match)

Routers use a bitwise Trie to find the longest matching prefix for an IP address. Each bit of the IP (0 or 1) is one level in the Trie. This is how CIDR routing tables work — lookup is O(32) for IPv4, O(128) for IPv6.

Common Trie Problems

  • LeetCode 208: Implement Trie
  • LeetCode 211: Design Add and Search Words (wildcard '.' matching)
  • LeetCode 212: Word Search II (Trie + backtracking)
  • LeetCode 642: Design Search Autocomplete System
  • LeetCode 648: Replace Words (prefix replacement)
  • LeetCode 720: Longest Word in Dictionary

Interview Tips

  • Always implement Trie with a dict of children, not a fixed 26-char array — handles Unicode, faster to write
  • For Word Search II, the key insight is using Trie instead of a set to enable prefix pruning
  • KMP is often asked conceptually — explain the LPS table intuition even if you can't code it perfectly under pressure
  • Rabin-Karp is the answer when you need to find multiple patterns simultaneously (K patterns, O(n+Km) total)
  • For autocomplete system design, mention storing top-K results at each Trie node to avoid DFS on every query

  • Airbnb Interview Guide
  • Uber Interview Guide
  • Atlassian Interview Guide
  • LinkedIn Interview Guide
  • Apple Interview Guide
  • Meta Interview Guide
  • Scroll to Top