Union-Find (Disjoint Set Union) Interview Patterns
Union-Find (DSU) solves the dynamic connectivity problem: given a graph where edges are added one at a time, efficiently answer “are nodes X and Y in the same connected component?” With path compression and union by rank, both operations run in nearly O(1) amortized time (technically O(α(n)) where α is the inverse Ackermann function — effectively constant for all practical inputs).
Implementation
class UnionFind:
def __init__(self, n: int):
self.parent = list(range(n)) # parent[i] = i initially (each node is its own root)
self.rank = [0] * n # rank[i] = upper bound on tree height
self.components = n # number of connected components
def find(self, x: int) -> int:
"""Find root of x's component with path compression."""
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x]) # path compression: point to root
return self.parent[x]
def union(self, x: int, y: int) -> bool:
"""
Merge components of x and y.
Returns True if they were in different components (new connection).
"""
root_x, root_y = self.find(x), self.find(y)
if root_x == root_y:
return False # already connected
# Union by rank: attach smaller tree under larger
if self.rank[root_x] bool:
return self.find(x) == self.find(y)
Time: O(α(n)) per operation. Space: O(n). The two optimizations work together: path compression flattens trees during find; union by rank keeps trees shallow during union.
Pattern 1: Number of Connected Components (LC 323, 547)
def count_components(n: int, edges: list[list[int]]) -> int:
uf = UnionFind(n)
for u, v in edges:
uf.union(u, v)
return uf.components # start at n, decremented on each successful union
Also solves LC 547 (Number of Provinces) and LC 684 (Redundant Connection). For graphs given as adjacency matrix: union connected pairs.
Pattern 2: Cycle Detection in Undirected Graph (LC 684)
def find_redundant_connection(edges: list[list[int]]) -> list[int]:
"""Return the last edge that creates a cycle."""
n = len(edges)
uf = UnionFind(n + 1) # 1-indexed
for u, v in edges:
if not uf.union(u, v): # union returns False = already connected = cycle
return [u, v]
return []
Cycle detected when union() returns False — both endpoints are already in the same component, so this edge creates a cycle.
Pattern 3: Kruskal’s Minimum Spanning Tree
MST connects all N nodes with N-1 edges of minimum total weight. Kruskal’s algorithm: sort all edges by weight, add each edge if it doesn’t create a cycle (union-find check).
def minimum_spanning_tree(n: int, edges: list[tuple]) -> int:
"""edges: list of (weight, u, v). Returns total MST weight."""
uf = UnionFind(n)
edges.sort() # sort by weight
total_weight = 0
edges_used = 0
for weight, u, v in edges:
if uf.union(u, v): # add edge if it connects two components
total_weight += weight
edges_used += 1
if edges_used == n - 1: # MST complete
break
return total_weight if edges_used == n - 1 else -1 # -1 if graph disconnected
LeetCode: 1584 (Min Cost to Connect All Points), 1135 (Connecting Cities With Minimum Cost).
Pattern 4: Accounts Merge (LC 721)
Group email accounts belonging to the same person. Two accounts are the same person if they share any email.
from collections import defaultdict
def accounts_merge(accounts: list[list[str]]) -> list[list[str]]:
email_to_id = {} # email -> first account index that owns it
uf = UnionFind(len(accounts))
for i, account in enumerate(accounts):
for email in account[1:]: # skip name at index 0
if email in email_to_id:
uf.union(i, email_to_id[email]) # same person
else:
email_to_id[email] = i
# Group emails by root account
root_to_emails = defaultdict(set)
for email, account_id in email_to_id.items():
root_to_emails[uf.find(account_id)].add(email)
# Build result
result = []
for root, emails in root_to_emails.items():
result.append([accounts[root][0]] + sorted(emails))
return result
Pattern 5: Number of Islands II (LC 305)
Dynamic connectivity: land cells are added one at a time; after each addition, report the count of islands.
def num_islands2(m: int, n: int, positions: list[list[int]]) -> list[int]:
uf = UnionFind(m * n)
grid = [[False] * n for _ in range(m)]
result = []
count = 0
for r, c in positions:
if grid[r][c]: # duplicate position
result.append(count)
continue
grid[r][c] = True
count += 1 # new island
for dr, dc in [(0,1),(0,-1),(1,0),(-1,0)]:
nr, nc = r + dr, c + dc
if 0 <= nr < m and 0 <= nc < n and grid[nr][nc]:
if uf.union(r * n + c, nr * n + nc):
count -= 1 # merged two islands
result.append(count)
return result
When to Use Union-Find vs BFS/DFS
| Scenario | Use |
|---|---|
| Static graph, single connectivity query | BFS/DFS — simpler code |
| Dynamic graph (edges added online) | Union-Find — O(1) per update |
| Minimum spanning tree | Union-Find (Kruskal’s) |
| Cycle detection in undirected graph | Union-Find — cleaner than DFS |
| Number of components after N edge additions | Union-Find — track component count |
{“@context”:”https://schema.org”,”@type”:”FAQPage”,”mainEntity”:[{“@type”:”Question”,”name”:”What are path compression and union by rank in Union-Find?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Two optimizations that together make Union-Find nearly O(1) amortized per operation: (1) Path compression: during find(x), after traversing up to the root, set parent[node] = root for every node on the path. Future finds on these nodes jump directly to the root in O(1). In code: if parent[x] != x: parent[x] = find(parent[x]) — this one recursive line implements path compression. (2) Union by rank: when merging two trees, attach the shorter tree under the root of the taller tree. Prevents degenerate cases where the tree becomes a linked list (O(n) find). Track rank[root] = upper bound on tree height. When two equal-rank trees merge, increment the new root’s rank. Together, path compression + union by rank achieve O(α(n)) amortized time — inverse Ackermann function, effectively O(1) for all practical input sizes (α(n) ≤ 4 for n ≤ 10^600).”}},{“@type”:”Question”,”name”:”How do you use Union-Find to detect cycles in an undirected graph?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Process edges one at a time using union(). Before merging: find the roots of both endpoints. If the roots are the same, both nodes are already in the same component — adding this edge creates a cycle. If the roots differ, merge the components (no cycle yet). In code: for each edge (u, v): if find(u) == find(v): return “cycle found” (or return this edge for “Redundant Connection” problems); else: union(u, v). This works because Union-Find tracks connected components. A cycle exists if and only if we try to add an edge between two nodes that are already connected. Time: O(E * α(V)) where E = edges, V = vertices — essentially O(E). This is simpler than DFS-based cycle detection (which needs to track visited status and parent nodes) and works naturally with edge-list input format.”}},{“@type”:”Question”,”name”:”How does Kruskal’s algorithm use Union-Find to build a Minimum Spanning Tree?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Kruskal’s algorithm builds the MST greedily: sort all edges by weight (ascending), then iterate through edges adding each to the MST if it doesn’t create a cycle. Union-Find provides the cycle check in O(α(n)). Algorithm: sort edges by weight. Initialize Union-Find with n nodes. For each edge (u, v, weight): if not uf.connected(u, v): uf.union(u, v); mst_weight += weight; mst_edges.append((u,v,weight)); if len(mst_edges) == n-1: break (MST is complete). The MST has exactly n-1 edges for a connected graph of n nodes. Correctness: Kruskal’s is a special case of the greedy matroid intersection algorithm — the cut property guarantees that the minimum-weight edge crossing any cut is in the MST. Time: O(E log E) dominated by sorting. Kruskal’s is preferred over Prim’s when the graph is sparse (E << V^2); Prim's with a Fibonacci heap is preferred for dense graphs."}}]}
🏢 Asked at: Meta Interview Guide 2026: Facebook, Instagram, WhatsApp Engineering
🏢 Asked at: Databricks Interview Guide 2026: Spark Internals, Delta Lake, and Lakehouse Architecture
🏢 Asked at: Apple Interview Guide 2026: iOS Systems, Hardware-Software Integration, and iCloud Architecture
🏢 Asked at: Coinbase Interview Guide
🏢 Asked at: Atlassian Interview Guide
Asked at: LinkedIn Interview Guide