Union-Find (Disjoint Set Union) Interview Patterns: Connected Components and Cycle Detection (2025)

Union-Find Data Structure

Union-Find (also called Disjoint Set Union, DSU) tracks which elements belong to the same connected component. Supports two operations in nearly O(1) amortized time: find(x): returns the representative (root) of x’s component. Two elements are in the same component if find(x) == find(y). union(x, y): merges the components containing x and y. Applications: detecting cycles in an undirected graph, finding connected components, Kruskal’s MST algorithm, network connectivity, grouping problems (accounts merge, friend circles).

Implementation with Path Compression and Union by Rank

class UnionFind:
    def __init__(self, n: int):
        self.parent = list(range(n))  # parent[i] = i (self-loop for roots)
        self.rank = [0] * n           # tree height upper bound
        self.components = n           # count of distinct components

    def find(self, x: int) -> int:
        # Path compression: flatten the tree
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # recursive compression
        return self.parent[x]

    def union(self, x: int, y: int) -> bool:
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False  # already in same component (cycle detected!)
        # Union by rank: attach smaller tree under larger
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        self.components -= 1
        return True  # successfully merged

Complexity: with both optimizations, find and union are O(alpha(n)) amortized — alpha is the inverse Ackermann function, effectively O(1) for all practical input sizes (alpha(n) <= 4 for n < 10^600).

Number of Connected Components (LC 323) and Friend Circles (LC 547)

def count_components(n: int, edges: list[list[int]]) -> int:
    uf = UnionFind(n)
    for u, v in edges:
        uf.union(u, v)
    return uf.components  # starts at n, decremented on each successful union

def find_circle_num(is_connected: list[list[int]]) -> int:
    n = len(is_connected)
    uf = UnionFind(n)
    for i in range(n):
        for j in range(i + 1, n):
            if is_connected[i][j]:
                uf.union(i, j)
    return uf.components

Detecting Cycles in an Undirected Graph

def has_cycle(n: int, edges: list[list[int]]) -> bool:
    uf = UnionFind(n)
    for u, v in edges:
        if not uf.union(u, v):
            return True  # u and v already in same component = cycle
    return False

Accounts Merge (LC 721)

Group accounts that share at least one email. Two accounts that share an email belong to the same person. Map each email to an account index. Union accounts that share an email.

def accounts_merge(accounts: list[list[str]]) -> list[list[str]]:
    uf = UnionFind(len(accounts))
    email_to_account = {}  # email -> first account index that owns it

    for i, account in enumerate(accounts):
        for email in account[1:]:
            if email in email_to_account:
                uf.union(i, email_to_account[email])
            else:
                email_to_account[email] = i

    # Group emails by root account
    from collections import defaultdict
    root_emails = defaultdict(set)
    for email, idx in email_to_account.items():
        root_emails[uf.find(idx)].add(email)

    result = []
    for root, emails in root_emails.items():
        result.append([accounts[root][0]] + sorted(emails))
    return result

Kruskal’s MST with Union-Find

def kruskal_mst(n: int, edges: list[tuple]) -> int:
    # edges: list of (weight, u, v)
    edges.sort()  # sort by weight
    uf = UnionFind(n)
    mst_cost = 0
    edges_used = 0
    for weight, u, v in edges:
        if uf.union(u, v):  # if not already connected
            mst_cost += weight
            edges_used += 1
            if edges_used == n - 1:
                break  # MST complete (n-1 edges)
    return mst_cost if edges_used == n - 1 else -1  # -1 if not connected

See also: Meta Interview Prep

See also: Atlassian Interview Prep

See also: Databricks Interview Prep

Scroll to Top