Evaluating Alternate Variations of Kosaraju Sharir Algorithm for Computing Strongly Connected Components

TL;DR In this blog post, we will explore examples illustrating why alternate variations of Kosaraju-Sharir’s algorithm for computing strongly connected components fail.

Prerequisites

I assume that reader is already familiar with the below topics:

Strongly connected component problem
Rough idea of Kosaraju Sharir Algorithm¹.

Finally, a handy reference to clrs book as we will be using the conceptual framework from this book.

Terminology

Discovery time of a vertex: When a vertex is first discovered. Denoted as v.discovery
finish time of a vertex: When the vertex is fully explored and recursive dfs function for the vertex returns. Denoted as v.finish

Conventions used:

An edge being explored is in bold red color and an edge in blue color is part of dfs tree
Time is monotonically increasing and it starts from 1
discovery_time/finish_time times appear below the node
Nodes are that fully processed are in blue color and currently being explored are in red color

Consider an example of running dfs from a vertex a

Algorithm

Sample Implementation

Run dfs on the given graph G to compute finish time v.finish for each vertex.
Compute the transpose \(G^T\) of the graph.
Process the vertices in the decreasing order of finish time as computed in step 1 and run dfs on \(G^T\). Dfs trees, thus generated are strongly connected components. When determining a tree, ignore the nodes that are already finished as part of previous trees.².

To understand why the above algorithm works, refer to clrs book as it does a better job than I could possibly ever do.

Alternate Versions

Do other variations in step 3 work? Like instead of choosing finish_time we choose discovery_time and run dfs in step 3 on original graph. In total, there are 8 possibilities (1 correct and 7 incorrect)³ .By using examples, we will see that remaining 7 options don’t work.

1. Discover time ascending order and Original graph

Here, we start dfs from node a because its discovery time is minimum and reach b via a->b edge and wrongly compute a,b must be belong to same component.

Wrongly computed components= [{a,b}]
Actual components= [{a}, {b}]

2. Discover time ascending order and Transpose graph

Here, we start dfs from node b because its discovery time is minumum

Wrongly computed components=[{b, a}]
Actual components= [{a}, {b}]

3. Discovery time descending order and Original graph

Here, we start dfs from node c because its discovery time is maximum.

Wrongly computed components=[{c,b}, {a}]
Actual components= [{a}, {b}, {c}]

4. Discovery time descending order and Transpose Graph

Here, we start dfs from node b because its discovery time is maximum.

Wrongly computed components=[{b,a}]
Actual components= [{a}, {b}]

5. Finish time ascending order and Original graph

Here, we start dfs from node c because its finish time is minimum.

Wrongly computed components=[{c,a,b}]
Actual components= [{a,c}, {b}]

6. Finish time ascending order and Transpose graph

Here, we start dfs from node b because its finish time is minimum

Wrongly computed components=[{b,a}]
Actual components= [{a}, {b}]

7. Finish time descending order and original graph

Here, we start dfs from node a because its finish time is maximum

Wrongly computed components=[{a,b}]
Actual components= [{a}, {b}]

8. Finish time descending order and transpose graph

This version is correct and refer to clrs for concrete proof.

The reason for not using wikipedia version of Kosaraju algorithm is because is I found it diffcult to understand. ↩
This is same as many online implementations adding the explored node at the beginning of queue and processing the queue from head. ↩
discovery or finish time = 2, original or transpose graph = 2, ascending or descending order = 2. Total 8 combinations. ↩