Minimum Spanning Tree: Kruskal’s and Prim’s Algorithms
A Minimum Spanning Tree (MST) connects all vertices of a weighted graph with the minimum total edge weight, using exactly V-1 edges and no cycles. MST algorithms appear in network design, cluster analysis, and approximation of NP-hard problems like TSP.
When to Use MST
- Minimum cost to connect N cities with roads
- Network cable routing (minimize wire length)
- Clustering (remove maximum weight edges from MST)
- Approximate TSP (MST gives 2x approximation)
- Critical connections in a network
Kruskal’s Algorithm — Edge-Centric
Sort all edges by weight. Greedily add the cheapest edge that does not create a cycle. Use Union-Find to detect cycles in O(α(n)) ≈ O(1). Total time: O(E log E) dominated by sorting.
def kruskal(n, edges):
# edges: list of (weight, u, v)
edges.sort()
parent = list(range(n))
rank = [0] * n
def find(x):
if parent[x] != x:
parent[x] = find(parent[x]) # path compression
return parent[x]
def union(x, y):
px, py = find(x), find(y)
if px == py: return False # same component = cycle
if rank[px] < rank[py]: px, py = py, px
parent[py] = px
if rank[px] == rank[py]: rank[px] += 1
return True
mst_weight = 0
mst_edges = []
for w, u, v in edges:
if union(u, v):
mst_weight += w
mst_edges.append((u, v, w))
if len(mst_edges) == n - 1:
break # MST complete
return mst_weight, mst_edges
Prim’s Algorithm — Vertex-Centric
Start from any vertex. Greedily grow the MST by adding the cheapest edge that connects a new vertex to the current tree. Use a min-heap. Time: O((V+E) log V).
import heapq
def prim(n, adj):
# adj[u] = list of (weight, v)
visited = set()
heap = [(0, 0)] # (weight, vertex), start at vertex 0
mst_weight = 0
while heap and len(visited) < n:
w, u = heapq.heappop(heap)
if u in visited:
continue
visited.add(u)
mst_weight += w
for edge_w, v in adj[u]:
if v not in visited:
heapq.heappush(heap, (edge_w, v))
return mst_weight if len(visited) == n else -1 # -1 if disconnected
Kruskal vs. Prim Comparison
| Dimension | Kruskal’s | Prim’s |
|---|---|---|
| Approach | Edge-based: sort all edges | Vertex-based: grow from one node |
| Data structure | Union-Find | Min-heap |
| Time complexity | O(E log E) | O((V+E) log V) |
| Best for | Sparse graphs (E ≈ V) | Dense graphs (E ≈ V²) |
| Implementation | Easier with sorted edge list | Easier with adjacency list |
Classic MST Interview Problems
# Minimum Cost to Connect All Points (LeetCode 1584)
# Points on a 2D grid, edge weight = Manhattan distance
import heapq
def min_cost_connect_points(points):
n = len(points)
visited = set()
heap = [(0, 0)] # (cost, point_index)
total = 0
while len(visited) < n:
cost, i = heapq.heappop(heap)
if i in visited:
continue
visited.add(i)
total += cost
for j in range(n):
if j not in visited:
d = abs(points[i][0]-points[j][0]) + abs(points[i][1]-points[j][1])
heapq.heappush(heap, (d, j))
return total
Related: Critical Connections (Bridges)
A bridge is an edge whose removal disconnects the graph. Use Tarjan’s algorithm — DFS with discovery time and low-link values:
def critical_connections(n, connections):
from collections import defaultdict
graph = defaultdict(list)
for u, v in connections:
graph[u].append(v)
graph[v].append(u)
disc = [-1] * n
low = [0] * n
timer = [0]
result = []
def dfs(node, parent):
disc[node] = low[node] = timer[0]
timer[0] += 1
for neighbor in graph[node]:
if neighbor == parent: continue
if disc[neighbor] == -1:
dfs(neighbor, node)
low[node] = min(low[node], low[neighbor])
if low[neighbor] > disc[node]:
result.append([node, neighbor])
else:
low[node] = min(low[node], disc[neighbor])
dfs(0, -1)
return result
Interview Tips
- Kruskal’s is easier to explain: sort edges, union-find to check cycles
- Prim’s is more natural when you have an adjacency list (similar to Dijkstra)
- MST guarantee: greedy choice property — the minimum weight edge crossing any cut is always in some MST
- Minimum spanning forest: run MST on disconnected graphs — each component gets its own MST