What Is a Minimum Spanning Tree?
A Minimum Spanning Tree (MST) of a weighted undirected graph is a spanning tree (connects all V vertices) with the minimum total edge weight. Properties: always has exactly V-1 edges. May not be unique (if edge weights are equal). Cut property: the minimum weight edge crossing any cut is in some MST. Cycle property: the maximum weight edge in any cycle is not in any MST. Used for: network design (connect cities with minimum cable length), cluster analysis, image segmentation, approximation of TSP.
Kruskal Algorithm
Sort edges by weight. Use Union-Find to avoid cycles. For each edge (u,v,w) in sorted order: if find(u) != find(v): add to MST, union(u,v). Stop when V-1 edges are added. O(E log E) for sorting + O(E alpha(V)) for Union-Find = O(E log E). Preferred for sparse graphs (E close to V). Easy to implement with sorted edge list.
def kruskal(n, edges):
edges.sort(key=lambda e: e[2])
parent = list(range(n))
def find(x):
while parent[x] != x:
parent[x] = parent[parent[x]] # path halving
x = parent[x]
return x
def union(x, y):
rx, ry = find(x), find(y)
if rx == ry: return False
parent[rx] = ry; return True
cost = 0
for u, v, w in edges:
if union(u, v):
cost += w
return cost
Prim Algorithm
Greedy: grow the MST from a starting vertex. Priority queue of (weight, vertex). Initialize with (0, start). While queue: pop (w, u). If u is visited: skip. Mark visited, add w to MST cost. For each neighbor v of u: if not visited: push (weight(u,v), v). O((V+E) log V) with a binary heap. Preferred for dense graphs (adjacency matrix representation: O(V^2)). Conceptually similar to Dijkstra (replace distance with edge weight).
import heapq
def prim(graph, n):
visited = set()
heap = [(0, 0)] # (weight, node)
cost = 0
while heap and len(visited) < n:
w, u = heapq.heappop(heap)
if u in visited: continue
visited.add(u)
cost += w
for v, weight in graph[u]:
if v not in visited:
heapq.heappush(heap, (weight, v))
return cost
Maximum Spanning Tree
Find the spanning tree with maximum total weight. Two approaches: (1) Negate all edge weights, run Kruskal for MST, negate the result. (2) Run Kruskal in decreasing weight order (select the heaviest edge that does not form a cycle). Use cases: maximize reliability in a network (where edge weight = reliability), find the bottleneck spanning tree (max minimum-weight path between all pairs).
Common Interview Problems
- LC 1584 Min Cost to Connect All Points: MST on a complete graph (Prim with O(V^2) is optimal for dense graphs).
- LC 1135 Connecting Cities With Minimum Cost: Kruskal, same as classic MST.
- LC 1168 Optimize Water Distribution in a Village: add a virtual node for wells, run MST.
- LC 1489 Find Critical and Pseudo-Critical Edges: for each edge, check if it is in all MSTs (critical) or some MSTs (pseudo-critical).
- Second MST: remove each MST edge one at a time, find the min replacement edge — O(E^2).
Kruskal vs Prim
Kruskal: processes edges globally sorted by weight. Better for sparse graphs. Requires sorting (O(E log E)). Prim: grows from a vertex, processes neighbors. Better for dense graphs (O(V^2) with matrix). Both give the same MST result. In practice: Kruskal with Union-Find is simpler to code in interviews and covers most cases. Use Prim when given an adjacency matrix or when the graph is nearly complete (E close to V^2).
{
“@context”: “https://schema.org”,
“@type”: “FAQPage”,
“mainEntity”: [
{
“@type”: “Question”,
“name”: “What is the difference between Kruskal and Prim for minimum spanning tree?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “Kruskal: sorts all edges globally by weight, then greedily adds the cheapest edge that does not form a cycle (checked with Union-Find). O(E log E) time (dominated by sorting). Works on any representation. Best for sparse graphs (E close to V). Prim: grows the MST from a starting vertex. Uses a priority queue to always extend with the cheapest edge connecting the MST to a non-MST vertex. O((V+E) log V) with a binary heap. O(V^2) with an adjacency matrix (better for dense graphs). Best for dense graphs or when given an adjacency matrix. Both algorithms produce the same MST (or one of the possible MSTs if there are equal-weight edges). In interviews: Kruskal is simpler to implement with Union-Find and handles disconnected graphs naturally (produces a minimum spanning forest).”
}
},
{
“@type”: “Question”,
“name”: “How does Kruskal algorithm use Union-Find to avoid cycles?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “Kruskal adds edges in non-decreasing weight order. Adding an edge (u,v) creates a cycle if and only if u and v are already in the same connected component. Union-Find tracks connected components efficiently. For each candidate edge (u,v,w): find(u) and find(v) return the component roots. If roots are equal: u and v are connected — skip this edge (would form a cycle). If roots differ: this edge connects two different components — add to MST, call union(u,v) to merge the components. This is the cycle property of MSTs: the cheapest edge crossing any cut (between two components) is always in the MST. Union-Find with path compression and union by rank gives O(alpha(V)) amortized per operation. Total time: O(E log E) sorting + O(E alpha(V)) union-find = O(E log E).”
}
},
{
“@type”: “Question”,
“name”: “How do you find the minimum spanning tree of a complete graph efficiently?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “A complete graph has E = V*(V-1)/2 edges. Running Kruskal on all edges is O(V^2 log V). Prim with an adjacency matrix is O(V^2) — optimal for complete graphs. LC 1584 Min Cost to Connect All Points is a classic example: given V points on a 2D plane, connect all with minimum total Manhattan distance. There are V^2/2 potential edges. Prim O(V^2): maintain min_dist[v] = minimum edge weight from v to the current MST. In each of V iterations, find the unvisited vertex with minimum min_dist (O(V) scan), add it to the MST, and update min_dist for its neighbors. Total: O(V^2). This beats Kruskal O(V^2 log V) for dense graphs. For the specific case of Euclidean MST (points in a plane), there are O(V log V) algorithms using Delaunay triangulation.”
}
},
{
“@type”: “Question”,
“name”: “What is the cut property and why does it prove MST algorithms are correct?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “The cut property states: for any cut of the graph (partition of vertices into two non-empty sets S and V-S), the minimum weight edge crossing the cut is in some MST. Proof: suppose edge e = (u,v) is the minimum crossing edge but not in the MST T. T must have a path from u to v (since T is spanning). This path contains some edge e prime crossing the cut. Adding e to T creates a cycle; removing e prime breaks the cycle, giving a tree T prime with weight T – weight(e prime) + weight(e) < weight(T) (since e is cheaper than e prime). Contradiction — T was not the MST. Why it proves Kruskal: when Kruskal adds an edge (u,v), the cut is (MST so far, remaining vertices). The edge (u,v) is the minimum crossing edge for some cut. Why it proves Prim: at each step, Prim adds the minimum edge from the current tree to any non-tree vertex."
}
},
{
"@type": "Question",
"name": "What problems reduce to minimum spanning tree?",
"acceptedAnswer": {
"@type": "Answer",
"text": "Network design: connect N cities with minimum cable length (classic MST). Find the minimum cost to make a graph connected. Cluster analysis: run MST, remove the K-1 most expensive edges to get K clusters (single-linkage hierarchical clustering). LC 1168 Water Distribution: add a virtual well node with edge weight = well digging cost; MST gives the optimal combination of wells and pipes. Bottleneck spanning tree: a spanning tree that minimizes the maximum edge weight. The MST is also the bottleneck spanning tree (the min bottleneck path between any two vertices in the MST equals the max edge weight on the MST path — a property of MSTs). Max reliability spanning tree: maximize the product of edge reliabilities — equivalent to maximum spanning tree on log(reliability) weights. Approximation algorithm: the MST is a 2-approximation for metric TSP (MST weight <= 2 * optimal TSP)."
}
}
]
}
Asked at: Meta Interview Guide
Asked at: Apple Interview Guide
Asked at: Coinbase Interview Guide
Asked at: Databricks Interview Guide