Minimum Spanning Trees

Explores algorithms for finding minimum spanning trees (MSTs) in graphs, including Kruskal's algorithm and Prim's algorithm. Discusses their correctness and performance.


Prim's Algorithm for Finding Minimum Spanning Trees

Introduction

Prim's algorithm is a greedy algorithm used to find a minimum spanning tree (MST) for a weighted, undirected graph. A minimum spanning tree connects all vertices in the graph with the minimum possible total edge weight and forms a tree (no cycles).

Explanation of Prim's Algorithm

Prim's algorithm works by iteratively building the MST, starting from an arbitrary vertex. At each step, it adds the minimum-weight edge that connects a vertex already in the MST to a vertex not yet in the MST.

In simple terms:

  1. Initialization: Start with a single vertex (arbitrary).
  2. Iteration: Find the smallest-weight edge that connects a vertex in the current MST to a vertex *not* in the current MST.
  3. Addition: Add that edge and the connected vertex to the MST.
  4. Repeat: Repeat steps 2 and 3 until all vertices are in the MST.

Detailed Explanation of Prim's Algorithm

Let G = (V, E) be a weighted, undirected graph, where V is the set of vertices and E is the set of edges, and each edge (u, v) has a weight w(u, v).

The algorithm can be described as follows:

  1. Initialize:
    • Create a set `MST_vertices` to store the vertices currently in the MST. Initially, `MST_vertices` contains only a single, arbitrarily chosen start vertex `s`.
    • Create a set `MST_edges` to store the edges in the MST. Initially, `MST_edges` is empty.
    • Create a data structure (typically a priority queue) `PQ` to store all the edges that connect a vertex in `MST_vertices` to a vertex *not* in `MST_vertices`. The priority queue is ordered by edge weight (smallest weight at the top).
  2. Loop: While `MST_vertices` does not contain all vertices in V:
    • Extract the edge (u, v) with the minimum weight from `PQ`.
    • Check if both `u` and `v` are already in `MST_vertices`. If so, this edge forms a cycle and should be discarded. Discarding cycle forming edges is handled implicitly by the priority queue.
    • If one of `u` or `v` is in `MST_vertices` and the other is not:
      • Add the edge (u, v) to `MST_edges`.
      • Add the vertex that is not in `MST_vertices` to `MST_vertices`.
      • Update the `PQ`. For each vertex `x` that is now adjacent to the newly added vertex, if `x` is NOT in `MST_vertices`, and the edge connecting it to the MST has a lower weight than the current edge, add it to `PQ` or update its priority in the PQ.
  3. Output: The set `MST_edges` represents the edges in the minimum spanning tree.

Implementation Details Using Priority Queue

The efficiency of Prim's algorithm heavily relies on the data structure used to manage the edges that connect the MST to the rest of the graph. A priority queue (min-heap) is the optimal choice.

Priority Queue Operations:

  • `insert(edge, weight)`: Adds a new edge and its weight to the priority queue. This takes O(log E) time, where E is the number of edges in the graph, in the worst case.
  • `extract_min()`: Removes and returns the edge with the smallest weight from the priority queue. This takes O(log E) time.
  • `decrease_key(edge, new_weight)` (or `update_priority`): If the weight of an edge already in the priority queue is reduced, this operation updates its position in the priority queue to maintain the min-heap property. This can be implemented as delete + insert, also taking O(log E) time. Some priority queue implementations offer a direct decrease_key, which can potentially be faster.

Pseudocode Implementation:

 function primMST(graph, start_vertex):
  MST_vertices = {start_vertex}
  MST_edges = {}
  PQ = PriorityQueue()  // Priority queue of edges, ordered by weight

  // Add all edges connected to the start vertex to the PQ
  for each neighbor v of start_vertex:
    PQ.insert((start_vertex, v), weight(start_vertex, v))

  while MST_vertices.size < graph.vertices.size:
    if PQ is empty:
      //Graph is disconnected, so no MST exists
      return null

    (u, v), weight = PQ.extract_min()

    // If both u and v are in MST_vertices, this edge forms a cycle, skip it
    if u in MST_vertices and v in MST_vertices:
      continue

    // Determine which vertex is NOT in MST_vertices
    new_vertex = null
    if u not in MST_vertices:
      new_vertex = u
    else:
      new_vertex = v

    // Add the edge and vertex to the MST
    MST_edges.add((u, v))
    MST_vertices.add(new_vertex)

    // Update PQ with new edges
    for each neighbor x of new_vertex:
      if x not in MST_vertices:
        new_edge = (new_vertex, x)
        new_weight = weight(new_vertex, x)

        // Check if an edge to x is already in PQ with a higher weight
        edge_in_pq = null
        weight_in_pq = infinity
        for (existing_edge, existing_weight) in PQ:
          if (existing_edge[0] == new_vertex and existing_edge[1] == x) or (existing_edge[0] == x and existing_edge[1] == new_vertex):
            edge_in_pq = existing_edge
            weight_in_pq = existing_weight
            break

        if weight_in_pq > new_weight:
          if edge_in_pq != null:
            PQ.remove(edge_in_pq)  //Remove old edge
          PQ.insert(new_edge, new_weight)


  return MST_edges 

Complexity Analysis:

  • Time Complexity: O(E log V) where E is the number of edges and V is the number of vertices. The `while` loop runs at most V-1 times. Inside the loop, `extract_min` and `insert` take O(log E) time. In a dense graph, E can be closer to V2, so this could be considered O(E log V) or O(V2 log V).
  • Space Complexity: O(V + E) - Space for the priority queue and adjacency lists representing the graph.

Correctness Proof

The correctness of Prim's algorithm can be proven using mathematical induction.

Induction Hypothesis: At each step of the algorithm, the set of edges selected so far (the set `MST_edges`) is a subset of some MST of the graph.

Base Case: Initially, `MST_edges` is empty, which is trivially a subset of any MST.

Inductive Step: Assume that the induction hypothesis holds true after *k* iterations. That is, the edges in `MST_edges` (after *k* iterations) are a subset of some MST, let's call it T. We need to show that after the (k+1)th iteration, the new set of edges is also a subset of *some* MST.

Let (u, v) be the edge added to `MST_edges` in the (k+1)th iteration. There are two possibilities:

  1. (u, v) is in T: In this case, the new set of edges is still a subset of T, so the induction hypothesis holds.
  2. (u, v) is NOT in T: If (u, v) is not in T, adding (u, v) to T will create a cycle. This cycle must contain another edge (x, y) such that one endpoint (x or y) is in `MST_vertices` and the other endpoint is not in `MST_vertices`. (This is because we chose (u,v) to connect the MST to an unvisited vertex). Moreover, since Prim's algorithm chose (u, v) instead of (x, y), it must be that w(u, v) ≤ w(x, y).
  3. Now, create a new tree T' by removing (x, y) from T and adding (u, v). T' is also a spanning tree because removing (x, y) and adding (u, v) breaks the cycle and maintains connectivity.

    The weight of T' is w(T') = w(T) - w(x, y) + w(u, v). Since w(u, v) ≤ w(x, y), w(T') ≤ w(T). Since T is a minimum spanning tree, w(T') must be equal to w(T), so T' is also a minimum spanning tree.

    Furthermore, the new set of edges in `MST_edges` (including (u, v)) is a subset of T'. Therefore, the induction hypothesis holds.

Conclusion: By the principle of mathematical induction, the set of edges in `MST_edges` is always a subset of some MST. Since the algorithm terminates when all vertices are connected, the resulting set of edges `MST_edges` forms a minimum spanning tree.