Kahn’s Algorithm: Understanding and Implementation

Kahn’s Algorithm, also known as Kahn’s topological sorting algorithm, is a fundamental graph theory algorithm used to find a topological ordering of directed acyclic graphs (DAGs). This algorithm is widely used in various computer science applications, including compilers, task scheduling, and dependency resolution. In this article, we will explore the concepts behind Kahn’s Algorithm, discuss its applications, and provide a step-by-step implementation in Python.

Understanding Kahn’s Algorithm

The main idea behind Kahn’s Algorithm is to iteratively remove nodes with zero in-degrees from the graph until all nodes are removed or it is determined that a cycle exists. Here are the key steps of Kahn’s Algo :

  • Initialize a queue to store nodes with zero in-degrees.
  • Initialize a list to store the topological ordering of nodes.
  • Calculate the in-degrees of all nodes in the graph.
  • Add all nodes with zero in-degrees to the queue.
  • While the queue is not empty:
    • Remove a node from the queue.
    • Add the node to the topological ordering list.
    • Decrement the in-degrees of adjacent nodes.
    • If an adjacent node has zero in-degrees, add it to the queue.

Applications of Kahn’s Algorithm

Kahn’s Algorithm has several practical applications in computer science and beyond:

  • Task Scheduling: It is used to schedule tasks with dependencies efficiently.
  • Compiler Construction: It helps in building the control flow graph for optimization and code generation phases.
  • Build Systems: Dependency resolution in build systems like Make or CMake.
  • Course Prerequisite Checking: Ensuring that students meet prerequisites before taking advanced courses.
  • Package Management: Resolving package dependencies in package managers like npm or pip.

Python Implementation

from collections import defaultdict, deque

def kahns_algorithm(graph):
    in_degree = defaultdict(int)
    top_order = []
    queue = deque()

    # Calculate in-degrees
    for u, v in graph:
        in_degree[v] += 1

    # Add nodes with zero in-degrees to the queue
    for node in in_degree:
        if in_degree[node] == 0:
            queue.append(node)

    while queue:
        u = queue.popleft()
        top_order.append(u)

        for v in graph:
            if u == v[0]:
                in_degree[v[1]] -= 1
                if in_degree[v[1]] == 0:
                    queue.append(v[1])

    # Check for cycles
    if len(top_order) != len(graph):
        return "Graph contains a cycle"
    else:
        return top_order

# Example usage
graph = [(0, 1), (0, 2), (1, 3), (2, 3)]
result = kahns_algorithm(graph)
print(result)
Kahn's Algorithm: Understanding and Implementation
Kahn’s Algo : Understanding and Implementation

Another example :

import numpy as np

def forward_algorithm(observations, A, B, pi):
    T = len(observations)
    N, M = B.shape
    alpha = np.zeros((T, N))

    # Initialize the alpha values for t = 0
    alpha[0] = pi * B[:, observations[0]]

    # Forward Pass
    for t in range(1, T):
        for j in range(N):
            alpha[t, j] = np.sum(alpha[t - 1] * A[:, j]) * B[j, observations[t]]

    return alpha

def backward_algorithm(observations, A, B):
    T = len(observations)
    N, M = B.shape
    beta = np.zeros((T, N))

    # Initialize the beta values for t = T - 1
    beta[T - 1] = 1

    # Backward Pass
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t, i] = np.sum(A[i, :] * B[:, observations[t + 1]] * beta[t + 1, :])

    return beta

def baum_welch(observations, N, M, max_iterations=100, tolerance=1e-4):
    T = len(observations)

    # Initialize random initial values for A, B, and pi
    A = np.random.rand(N, N)
    A /= A.sum(axis=1, keepdims=True)
    B = np.random.rand(N, M)
    B /= B.sum(axis=1, keepdims=True)
    pi = np.random.rand(N)
    pi /= pi.sum()

    prev_likelihood = -np.inf

    for iteration in range(max_iterations):
        # E-step: Compute forward and backward probabilities
        alpha = forward_algorithm(observations, A, B, pi)
        beta = backward_algorithm(observations, A, B)

        # Compute the likelihood of the observations
        likelihood = np.sum(alpha[-1])

        # Check for convergence
        if abs(likelihood - prev_likelihood) < tolerance:
            break

        prev_likelihood = likelihood

        # M-step: Update model parameters
        xi = np.zeros((T - 1, N, N))
        for t in range(T - 1):
            for i in range(N):
                for j in range(N):
                    xi[t, i, j] = (alpha[t, i] * A[i, j] * B[j, observations[t + 1]] * beta[t + 1, j]) / likelihood

        A = np.sum(xi, axis=0) / np.sum(alpha[:-1], axis=0)[:, np.newaxis]
        for k in range(M):
            mask = observations == k
            B[:, k] = np.sum(alpha[mask], axis=0) / np.sum(alpha, axis=0)

        pi = alpha[0] / np.sum(alpha[0])

    return A, B, pi

# Example usage
if __name__ == "__main__":
    # Define the HMM parameters
    N = 2  # Number of states
    M = 3  # Number of observation symbols
    T = 100  # Length of the observed sequence

    # Generate synthetic observations from a random HMM
    np.random.seed(0)
    true_A = np.array([[0.7, 0.3], [0.4, 0.6]])
    true_B = np.array([[0.1, 0.4, 0.5], [0.7, 0.2, 0.1]])
    true_pi = np.array([0.6, 0.4])

    observations = np.random.choice(M, size=T, p=true_B[0])
    states = [np.random.choice(N, p=true_pi)]
    for t in range(1, T):
        next_state = np.random.choice(N, p=true_A[states[-1]])
        states.append(next_state)

    # Run Baum-Welch to estimate the model parameters
    estimated_A, estimated_B, estimated_pi = baum_welch(observations, N, M)

    print("True Transition Matrix (A):")
    print(true_A)
    print("Estimated Transition Matrix (A):")
    print(estimated_A)

    print("\nTrue Emission Matrix (B):")
    print(true_B)
    print("Estimated Emission Matrix (B):")
    print(estimated_B)

    print("\nTrue Initial State Probabilities (pi):")
    print(true_pi)
    print("Estimated Initial State Probabilities (pi):")
    print(estimated_pi)

Links to Related Articles

If you want to explore more about graph theory and related algorithms, Wikipedia provides comprehensive resources on the subject. You can check out the Topological Sorting article for an in-depth discussion of topological ordering and algorithms like Kahn’s Algorithm. Additionally, the Directed Acyclic Graph (DAG) article offers insights into the properties and applications of directed acyclic graphs, which play a crucial role in understanding Kahn’s Algorithm.

Internal Links

Retour en haut