Greedy Algorithms
Introduction to greedy algorithms. Discusses the characteristics of problems suitable for greedy solutions. Covers examples like Dijkstra's algorithm, Kruskal's algorithm, and Huffman coding.
Dijkstra's Algorithm
Introduction
Dijkstra's algorithm is a classic graph algorithm for finding the shortest paths from a single source node to all other nodes in a weighted graph with non-negative edge weights. It is a greedy algorithm: at each step it makes the locally optimal choice (visiting the closest unvisited vertex), and with non-negative weights this choice is guaranteed to yield the globally shortest paths. The algorithm efficiently computes the shortest distance, and the path achieving it, from a designated starting point to every reachable vertex.
Detailed Explanation of Dijkstra's Algorithm
Core Concepts
At its heart, Dijkstra's algorithm maintains a set of visited nodes (nodes for which the shortest distance from the source is already known) and a priority queue (or similar data structure) to efficiently select the next node to visit. The priority queue is typically keyed by the current shortest distance estimate to each node.
Algorithm Steps
- Initialization:
- Assign a distance value to each vertex in the graph:
- Set the distance to the source vertex to 0.
- Set the distance to all other vertices to infinity (or a sufficiently large number representing infinity).
- Create a set `visited` to keep track of visited vertices, initially empty.
- Create a priority queue `pq` (e.g., a min-heap) and insert the source vertex into it with its distance (0). The priority queue is ordered by the distance values.
- Iteration:
While the priority queue is not empty:
- Extract the vertex `u` with the smallest distance value from the priority queue. This is the current closest unvisited vertex.
- If `u` is already in `visited`, skip it and continue to the next iteration (the queue may contain stale duplicate entries for a vertex). Otherwise, mark `u` as visited by adding it to the `visited` set.
- For each neighbor `v` of `u` (i.e., each vertex directly connected to `u` by an edge):
- Calculate the distance to `v` through `u`: `dist[u] + weight(u, v)`, where `weight(u, v)` is the weight of the edge connecting `u` and `v`.
- If this distance is shorter than the current distance to `v` (i.e., `dist[u] + weight(u, v) < dist[v]`):
- Update the distance to `v`: `dist[v] = dist[u] + weight(u, v)`.
- Update the priority queue: If `v` is already in the priority queue, decrease its priority (distance) to the new value. If `v` is not in the priority queue, insert it with its new distance.
- Termination:
Once the priority queue is empty, the algorithm has visited all reachable vertices from the source, and `dist[v]` contains the shortest distance from the source to each vertex `v`.
Pseudocode
```
function dijkstra(graph, source):
    // Initialize distances
    dist = {}
    for each vertex v in graph:
        dist[v] = infinity
    dist[source] = 0

    // Initialize visited set
    visited = set()

    // Initialize priority queue
    pq = PriorityQueue()
    pq.insert(source, 0)  // (vertex, distance)

    while pq is not empty:
        u = pq.extract_min()  // vertex with smallest distance
        if u in visited:
            continue
        visited.add(u)
        for each neighbor v of u:
            weight_uv = weight of edge (u, v)
            if dist[u] + weight_uv < dist[v]:
                dist[v] = dist[u] + weight_uv
                if v is in pq:
                    pq.decrease_priority(v, dist[v])  // update priority in pq
                else:
                    pq.insert(v, dist[v])

    return dist  // shortest distances from source to all vertices
```
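The pseudocode can be translated into runnable Python with the standard-library `heapq` module. One caveat, as a sketch: `heapq` has no `decrease_priority` operation, so a common workaround (used below) is to push a duplicate entry with the smaller distance and skip stale entries when they are extracted.

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from source in a graph given as
    {vertex: {neighbor: weight, ...}, ...} with non-negative weights."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    visited = set()
    pq = [(0, source)]  # (distance, vertex) pairs, ordered by distance

    while pq:
        d, u = heapq.heappop(pq)   # vertex with smallest distance
        if u in visited:
            continue               # stale duplicate: u was already finalized
        visited.add(u)
        for v, weight_uv in graph[u].items():
            if d + weight_uv < dist[v]:
                dist[v] = d + weight_uv
                # no decrease_priority in heapq; push a duplicate instead
                heapq.heappush(pq, (dist[v], v))

    return dist
```

The duplicate-push approach keeps the heap simple at the cost of some extra entries; the `if u in visited: continue` check is what makes it correct.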
Example
Graph Representation
Consider the following weighted graph represented as an adjacency list (or adjacency matrix):
```
Graph:
A: {B: 4, C: 2}
B: {A: 4, C: 5, D: 10}
C: {A: 2, B: 5, E: 3}
D: {B: 10, E: 11, F: 11}
E: {C: 3, D: 11, F: 2}
F: {D: 11, E: 2}
```
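One natural encoding of this adjacency list in Python is a dict of dicts, where `graph[u][v]` is the weight of edge `(u, v)`. Since the graph is undirected, each edge appears in both directions (note in particular the D–E edge of weight 11, listed under both D and E):

```python
# The example graph as a dict of dicts: graph[u][v] = weight of edge (u, v).
graph = {
    "A": {"B": 4, "C": 2},
    "B": {"A": 4, "C": 5, "D": 10},
    "C": {"A": 2, "B": 5, "E": 3},
    "D": {"B": 10, "E": 11, "F": 11},
    "E": {"C": 3, "D": 11, "F": 2},
    "F": {"D": 11, "E": 2},
}
```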
Applying Dijkstra's Algorithm (Source: A)
Let's walk through the algorithm step-by-step, starting from vertex A.
- Initialization:
- dist[A] = 0
- dist[B] = infinity
- dist[C] = infinity
- dist[D] = infinity
- dist[E] = infinity
- dist[F] = infinity
- pq = {(A, 0)}
- visited = {}
- Iteration 1:
- u = A (extracted from pq)
- visited = {A}
- Neighbors of A: B (weight 4), C (weight 2)
- dist[B] = 4 (0 + 4 < infinity)
- dist[C] = 2 (0 + 2 < infinity)
- pq = {(C, 2), (B, 4)}
- Iteration 2:
- u = C (extracted from pq)
- visited = {A, C}
- Neighbors of C: A (weight 2), B (weight 5), E (weight 3)
- A is already visited.
- dist[B] stays 4 (2 + 5 = 7 is not an improvement)
- dist[E] = 5 (2 + 3 < infinity)
- pq = {(B, 4), (E, 5)}
- Iteration 3:
- u = B (extracted from pq)
- visited = {A, B, C}
- Neighbors of B: A (weight 4), C (weight 5), D (weight 10)
- A and C are already visited.
- dist[D] = 14 (4 + 10 < infinity)
- pq = {(E, 5), (D, 14)}
- Iteration 4:
- u = E (extracted from pq)
- visited = {A, B, C, E}
- Neighbors of E: C (weight 3), D (weight 11), F (weight 2)
- C is already visited.
- dist[D] stays 14 (5 + 11 = 16 is not an improvement)
- dist[F] = 7 (5 + 2 < infinity)
- pq = {(F, 7), (D, 14)}
- Iteration 5:
- u = F (extracted from pq)
- visited = {A, B, C, E, F}
- Neighbors of F: D (weight 11), E (weight 2)
- E is already visited.
- dist[D] stays 14 (7 + 11 = 18 is not an improvement)
- pq = {(D, 14)}
- Iteration 6:
- u = D (extracted from pq)
- visited = {A, B, C, D, E, F}
- All neighbors of D are already visited, and pq is now empty, so the algorithm terminates.
Final Result
After completing the algorithm, the shortest distances from A to each vertex are:
```
dist[A] = 0
dist[B] = 4
dist[C] = 2
dist[D] = 14
dist[E] = 5
dist[F] = 7
```
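These distances can be verified mechanically. The minimal, self-contained sketch below runs a `heapq`-based Dijkstra over the example graph; note that the shortest route to D is A → B → D, with total weight 4 + 10 = 14.

```python
import heapq

graph = {
    "A": {"B": 4, "C": 2},
    "B": {"A": 4, "C": 5, "D": 10},
    "C": {"A": 2, "B": 5, "E": 3},
    "D": {"B": 10, "E": 11, "F": 11},
    "E": {"C": 3, "D": 11, "F": 2},
    "F": {"D": 11, "E": 2},
}

def dijkstra(graph, source):
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    pq, visited = [(0, source)], set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in visited:
            continue  # skip stale duplicate entries
        visited.add(u)
        for v, w in graph[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

print(dijkstra(graph, "A"))
# {'A': 0, 'B': 4, 'C': 2, 'D': 14, 'E': 5, 'F': 7}
```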
Time Complexity Analysis
The time complexity of Dijkstra's algorithm depends on the data structure used for the priority queue.
- Using an Array or List (Naive Implementation):
- Finding the minimum distance vertex takes O(V) time, where V is the number of vertices.
- Updating the distances takes O(E) time in total, where E is the number of edges.
- Overall, the time complexity is O(V² + E), which simplifies to O(V²) for dense graphs (where E is close to V²).
- Using a Binary Heap (Priority Queue):
- Extracting the minimum element takes O(log V) time.
- Decreasing the key (updating the priority) takes O(log V) time.
- In the worst case, each edge might cause a priority update.
- Overall, the time complexity is O((V + E) log V). For connected graphs this is usually written O(E log V), since E ≥ V − 1.
- Using a Fibonacci Heap (Advanced):
- Fibonacci heaps provide amortized O(1) time for decreasing the key.
- The overall time complexity becomes O(E + V log V).
- While theoretically better, Fibonacci heaps are often more complex to implement and can have higher constant factors, making them less practical for smaller graphs.
In practice, using a binary heap (or similar priority queue implementation) is often the preferred choice due to its balance between performance and ease of implementation. For very large and sparse graphs, Fibonacci heaps *might* provide a performance advantage, but this requires careful consideration and benchmarking.
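For contrast with the heap-based version, a sketch of the array-based O(V²) variant described above: the priority queue's `extract_min` is replaced by a linear scan over the unvisited vertices, which is why dense graphs do not suffer from this approach.

```python
def dijkstra_naive(graph, source):
    """O(V^2) Dijkstra: find the closest unvisited vertex by linear scan."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    unvisited = set(graph)

    while unvisited:
        # O(V) scan replaces the priority queue's extract_min
        u = min(unvisited, key=lambda v: dist[v])
        if dist[u] == float("inf"):
            break  # remaining vertices are unreachable from the source
        unvisited.remove(u)
        for v, w in graph[u].items():
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w  # no queue update needed

    return dist
```

Because there is no priority queue to keep consistent, the relaxation step is simpler here; the cost is the O(V) scan per iteration, giving O(V²) overall.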