Minimum Spanning Trees

Explores algorithms for finding minimum spanning trees (MSTs) in graphs, including Kruskal's algorithm and Prim's algorithm. Discusses their correctness and performance.


Minimum Spanning Tree (MST) Algorithms: Correctness and Proofs

Correctness of MST Algorithms: An Overview

A Minimum Spanning Tree (MST) of a connected, undirected graph is a spanning tree with the smallest possible total edge weight. The correctness of MST algorithms hinges on the properties that guarantee the inclusion of certain edges and the exclusion of others, ultimately leading to a tree with minimal total weight. Both Kruskal's and Prim's algorithms rely on fundamental graph theory principles related to cycles and cuts.

The core idea behind the correctness of these greedy algorithms is to iteratively add edges that are "safe". A safe edge is an edge that can be added to the current set of edges that will eventually form the MST, *without* violating the MST properties (i.e., maintaining a tree structure and minimizing the total weight).

Cycle and Cut Properties of MSTs

Cycle Property

The cycle property states: For any cycle in the graph, if the edge *e* with the largest weight in the cycle is *not* already in the current MST, then *e* can be safely removed from the graph without affecting the existence of an MST. Conversely, if the current edge set is part of an MST and you add an edge *e* that forms a cycle, then *e* must have the largest weight of edges forming that cycle.

Cut Property

The cut property states: Given any cut (a partition of the graph's vertices into two disjoint sets), the edge with the minimum weight that crosses the cut must be part of the MST. A crossing edge is an edge that connects a vertex in one set of the cut to a vertex in the other set.

Kruskal's Algorithm

Algorithm Description

  1. Sort all the edges in the graph in non-decreasing order of their weights.
  2. Initialize an empty set to store the edges of the MST.
  3. Iterate through the sorted edges:
    • For each edge (u, v), check if adding it to the MST set creates a cycle.
    • If adding (u, v) does *not* create a cycle, add it to the MST set.
  4. The MST set now contains the edges of the MST.

Kruskal's algorithm relies heavily on the cycle property. A disjoint set data structure (Union-Find) is commonly used to efficiently detect cycles.

Proof of Correctness (Kruskal's Algorithm)

We will prove the correctness of Kruskal's algorithm using induction on the number of edges added to the MST.

Base Case: Initially, the MST set is empty, which is trivially a subset of some MST.

Inductive Hypothesis: Assume that after *k* edges have been added to the MST set, the resulting set of edges, *A*, is a subset of some MST *T*.

Inductive Step: We need to show that adding the (k+1)th edge *e* = (u, v), chosen by Kruskal's algorithm, to *A* results in a set that is still a subset of some MST.

Kruskal's algorithm only adds *e* to *A* if it does not create a cycle. Suppose adding *e* creates a cycle in the graph. Since *e* was the lowest-weight edge that *could* have created the cycle *at this point* in the iteration, and the cycle must exist in order to create the condition of adding *e*; all other edges in that cycle have been selected by Kruskal before *e*.

Now consider two cases.

  • Case 1: If e=(u,v) *is* in T, then A ∪ {e} ⊆ T. The hypothesis holds.
  • Case 2: If e=(u,v) *is not* in T, adding *e* to *T* must create a cycle *C*. Since *e* is not in *T*, at least one edge *f* in *C* must *also* not be in *A*. By the cycle property, w(e) >= w(f). But since Kruskal's algorithm considered *e* before *f*, it must be that w(e) <= w(f). Therefore, w(e) = w(f). Now create a new spanning tree T' = T - {f} + {e}. T' is a spanning tree and w(T') = w(T) - w(f) + w(e) = w(T). Thus T' is also an MST. Also A ∪ {e} ⊆ T'. Therefore, the hypothesis holds.

Conclusion: By the principle of mathematical induction, Kruskal's algorithm correctly constructs an MST.

Prim's Algorithm

Algorithm Description

  1. Choose an arbitrary starting vertex.
  2. Initialize an empty set to store the edges of the MST.
  3. Maintain a set of visited vertices.
  4. While there are unvisited vertices:
    • Find the minimum-weight edge that connects a visited vertex to an unvisited vertex.
    • Add this edge to the MST set.
    • Mark the newly reached vertex as visited.
  5. The MST set now contains the edges of the MST.

Prim's algorithm directly grows a single tree from a starting vertex. A priority queue is commonly used to efficiently find the minimum-weight edge connecting visited and unvisited vertices.

Proof of Correctness (Prim's Algorithm)

We will prove the correctness of Prim's algorithm using induction on the number of vertices in the tree being constructed.

Base Case: The single starting vertex is trivially part of some MST.

Inductive Hypothesis: Assume that after *k* vertices are in the tree, the resulting tree *A* is a subtree of some MST *T*.

Inductive Step: We need to show that adding the (k+1)th vertex and its connecting edge *e* = (u, v), chosen by Prim's algorithm, to *A* results in a subtree that is still a subset of some MST. Here, *u* is a vertex in *A* and *v* is a vertex not in *A*.

Now consider two cases.

  • Case 1: If e=(u,v) *is* in T, then A ∪ {e} ⊆ T. The hypothesis holds.
  • Case 2: If e=(u,v) *is not* in T, adding *e* to *T* must create a cycle *C*. Since *e* connects a vertex in the subtree *A* to a vertex not in *A*, and since T is a spanning tree, at least one other edge *f* in *C* must also connect a vertex in *A* to a vertex not in *A*. Since Prim's algorithm chose *e*, it must be that w(e) <= w(f). Now create a new spanning tree T' = T - {f} + {e}. T' is a spanning tree and w(T') = w(T) - w(f) + w(e) <= w(T). Since T is an MST, T' must also be an MST, and has total weight the same as T. Therefore A ∪ {e} ⊆ T', so the hypothesis holds.

Conclusion: By the principle of mathematical induction, Prim's algorithm correctly constructs an MST.