Question 1

What is path compression in Union-Find and why does it matter?

Accepted Answer

Path compression is an optimization in the find() operation that flattens the tree structure during each lookup. When find(x) is called, it follows parent pointers until it reaches the root of x's component. Without path compression, every node on the path from x to the root keeps its original parent, so future finds on the same path traverse the same chain. With path compression, every node visited on the path is updated to point directly to the root once the root is known: self.parent[x] = self.find(self.parent[x]). The recursion makes two passes (walk up to find the root, then repoint each node as the calls return), so a subsequent find on any node in that path takes O(1) until a later union places the root under another tree. Combined with union by rank (always attach the shallower tree under the deeper tree), the amortized time per operation becomes O(α(n)), where α is the inverse Ackermann function; this is effectively constant for all practical values of n (α(n) ≤ 4 for n < 10^600). Without these optimizations, a chain of union operations can degenerate into a linked list with O(n) find time.
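The two optimizations described above can be sketched as follows. This is a minimal illustration, assuming nodes are labeled 0..n-1; the class name UnionFind and the boolean return value of union() are choices made for this example, not something the answer prescribes.

```python
class UnionFind:
    """Disjoint-set forest with path compression and union by rank."""

    def __init__(self, n: int) -> None:
        # Assumption: nodes are the integers 0..n-1, each starting as its own root.
        self.parent = list(range(n))
        self.rank = [0] * n  # upper bound on the height of each root's tree

    def find(self, x: int) -> int:
        # Path compression: once the root is known, point x straight at it.
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x: int, y: int) -> bool:
        """Merge the sets containing x and y; return False if already merged."""
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        # Union by rank: attach the shallower tree under the deeper one.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True
```

Because union by rank keeps trees logarithmically shallow, the recursive find() stays well within Python's recursion limit in practice; an iterative loop is a reasonable alternative if the recursion is a concern.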

Question 2

How do you detect a cycle in an undirected graph using Union-Find?

Accepted Answer

Process edges one by one. For each edge (u, v), find the root of u and the root of v. If the roots are the same, u and v are already in the same connected component, so adding edge (u, v) creates a cycle: return True (cycle detected). If the roots differ, the endpoints are in different components, so merge them with union(u, v). If all edges are processed without finding same-root endpoints, the graph is acyclic. This is used in LC 684 (Redundant Connection) to find the specific edge that creates the cycle: return the first edge where find(u) == find(v). Time is O(E × α(V)), essentially O(E). On a single static graph, DFS cycle detection is also linear at O(V + E); Union-Find is the better fit when edges arrive incrementally or when cycle checks are interleaved with many connectivity queries. It is also the basis for Kruskal's Minimum Spanning Tree algorithm (which adds edges greedily in order of increasing weight, skipping any edge that would create a cycle).
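The edge-by-edge procedure might look like the sketch below. It assumes nodes are labeled 0..n-1 (LC 684 itself uses 1-based labels, so the sizes would need adjusting there), and the function names first_cycle_edge and has_cycle are hypothetical. For brevity it uses iterative path halving and union by size, which are common variants of path compression and union by rank rather than the exact forms described above.

```python
from typing import List, Optional, Tuple


def first_cycle_edge(n: int, edges: List[Tuple[int, int]]) -> Optional[Tuple[int, int]]:
    """Return the first edge whose endpoints are already connected, else None."""
    parent = list(range(n))
    size = [1] * n

    def find(x: int) -> int:
        # Path halving: each visited node is repointed to its grandparent.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return (u, v)  # same component already: this edge closes a cycle
        # Union by size: attach the smaller component under the larger one.
        if size[ru] < size[rv]:
            ru, rv = rv, ru
        parent[rv] = ru
        size[ru] += size[rv]
    return None


def has_cycle(n: int, edges: List[Tuple[int, int]]) -> bool:
    """True if the undirected graph on n nodes contains a cycle."""
    return first_cycle_edge(n, edges) is not None
```

Returning the offending edge rather than a bare boolean covers both uses mentioned above: plain cycle detection and the Redundant Connection style of question.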

Question 3

What is the difference between Union-Find and BFS/DFS for graph connectivity problems?

Accepted Answer

Union-Find and BFS/DFS both solve graph connectivity but have different strengths. Union-Find excels at: dynamic connectivity (edges are added over time and you need connectivity queries after each addition), multiple connectivity queries on a static graph (O(α(n)) per query after O(E × α(n)) preprocessing), and problems where you're merging components (accounts merge, minimum spanning tree). BFS/DFS excels at: finding the actual path between two nodes, computing shortest distances, detecting cycles in directed graphs (Union-Find handles only undirected edges), topological sorting, and bipartite checking. For a one-time "how many connected components" query on a static graph, both approaches run in effectively O(V + E) time, so use whichever is simpler to implement. For "is X connected to Y after adding edge Z?", Union-Find is clearly better: O(α(n)) per query vs O(V + E) per query for BFS/DFS on the updated graph.
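To make the dynamic-connectivity comparison concrete, the sketch below answers "is x connected to y?" after each edge is added. It assumes nodes are labeled 0..n-1; the function names connectivity_after_each_edge and bfs_connected are hypothetical, and the BFS version is included only as the baseline that would have to be rerun from scratch after every edge.

```python
from collections import deque
from typing import List, Tuple

Edge = Tuple[int, int]


def connectivity_after_each_edge(n: int, edges: List[Edge], x: int, y: int) -> List[bool]:
    """Union-Find: near-constant amortized work per added edge and per query."""
    parent = list(range(n))
    size = [1] * n

    def find(a: int) -> int:
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    answers = []
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru  # union by size
            size[ru] += size[rv]
        answers.append(find(x) == find(y))
    return answers


def bfs_connected(n: int, edges: List[Edge], x: int, y: int) -> bool:
    """BFS baseline: O(V + E) for every single query on the current graph."""
    adj: List[List[int]] = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = {x}
    queue = deque([x])
    while queue:
        node = queue.popleft()
        if node == y:
            return True
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

Calling bfs_connected on each growing prefix of the edge list gives the same answers, but repeats a full traversal per query, which is the cost difference the comparison above describes.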

Union-Find (Disjoint Set Union) Interview Patterns

What Is Union-Find?

Implementation with Path Compression + Union by Rank

Classic Problems

Weighted Union-Find

When to Use Union-Find vs BFS/DFS

Interview Tips

Companies That Ask This