Why Hash Maps Are Fundamental
Hash maps (dictionaries in Python, HashMaps in Java) provide O(1) average-case lookup, insertion, and deletion. They are the most frequently used data structure in coding interviews after arrays and strings. Mastering hash map patterns transforms O(n²) brute-force solutions into O(n) optimal ones. Nearly every frequency counting, lookup optimization, or complement-finding problem uses a hash map.
Two Sum Pattern (Complement Lookup)
The canonical example: given an array, find two elements that sum to a target. Brute force: O(n²). Hash map: O(n).
def twoSum(nums, target):
seen = {} # value → index
for i, num in enumerate(nums):
complement = target - num
if complement in seen:
return [seen[complement], i]
seen[num] = i
For each element, check if its complement was already seen — if yes, found the pair. If no, store it for future elements. Generalizations: Three Sum → fix one element, apply two-sum on the remaining array. Four Sum → fix two elements, apply two-sum.
Frequency Counting
Count occurrences of each element: defaultdict(int) or Counter. Applications:
- Valid Anagram: count character frequencies in both strings, compare maps. O(n).
- Group Anagrams: sort each string as a key; group strings with the same key. O(n × k log k) where k = max string length.
- Top K Frequent Elements: count frequencies, then heap of size k, or bucket sort by frequency (O(n)).
- Majority Element: Boyer-Moore voting is O(1) space, but frequency map is O(n) and more intuitive.
- Subarray Sum Equals K (LC 560): prefix sums + frequency map. prefix_count[prefix_sum – k] gives the number of subarrays ending at current index that sum to k.
Prefix Sum + Hash Map
A prefix sum array enables range sum queries in O(1). Combined with a hash map, it finds subarrays with specific sum properties in O(n).
def subarraySum(nums, k):
count = 0
prefix_sum = 0
prefix_count = {0: 1} # empty subarray has sum 0
for num in nums:
prefix_sum += num
count += prefix_count.get(prefix_sum - k, 0)
prefix_count[prefix_sum] = prefix_count.get(prefix_sum, 0) + 1
return count
Key insight: subarray sum from index i to j = prefix_sum[j] – prefix_sum[i-1]. If we want this to equal k, then prefix_sum[i-1] = prefix_sum[j] – k. For each prefix_sum[j], count how many previous prefix sums equal prefix_sum[j] – k.
Sliding Window with Hash Map
Variable-size sliding window problems often use a hash map to track the window’s state. Pattern: expand the right pointer; shrink the left pointer when the window violates the constraint.
Longest Substring Without Repeating Characters: hash map stores last seen index of each character. When a character is repeated, advance left pointer to max(left, last_seen[char] + 1). Window size = right – left + 1.
Minimum Window Substring (LC 76): hash map tracks character counts in the window and the target. Expand right until all target characters are covered; shrink left to minimize the window while keeping it valid.
Fruit Into Baskets (LC 904): longest subarray with at most 2 distinct values — sliding window with a frequency map. When distinct count > 2, shrink left until distinct count ≤ 2.
Counting Pairs / Difference
K-diff Pairs in an Array (LC 532): count pairs with absolute difference k. For k=0: find duplicates using frequency count. For k>0: for each num, check if (num + k) is in the hash set.
Find All Anagrams in a String (LC 438): fixed-size sliding window with character frequency comparison. Maintain a window frequency map; compare against pattern frequency map using a “match count” variable instead of dict comparison each step.
Hashing for Grouping
Group Anagrams (LC 49): key = tuple(sorted(word)). All anagrams have the same sorted representation → same key → same group.
Isomorphic Strings (LC 205): maintain two maps (s→t char mapping and t→s mapping). A bijection must exist — check both directions.
Word Pattern (LC 290): same pattern as isomorphic strings but mapping words to letters.
LRU Cache (LC 146): hash map (key → doubly linked list node) + doubly linked list (maintains access order). Hash map enables O(1) get; doubly linked list enables O(1) move to front on access and O(1) eviction from back.
Interview Tips
- When you need O(1) lookup of “have I seen X before” → hash set (not hash map — simpler)
- When you need to store a value associated with each seen element → hash map
- For complement problems: store elements seen so far, check complement on each new element
- For frequency problems: Counter(arr) gives a dict of frequencies in one line
- Prefix sum + hash map = O(n) for most “subarray with property X” problems
- defaultdict(int) avoids KeyError on increment; Counter has most_common() for top-K