Editing Strongly connected component

{{Short description|Partition of a graph whose components are reachable from all vertices}}
[[Image:Scc-1.svg|thumb|Graph with strongly connected components marked]]
{{Graph connectivity sidebar}}
In the [[mathematics|mathematical]] theory of [[directed graph]]s, a graph is said to be '''strongly connected''' if every vertex is [[reachability|reachable]] from every other vertex. The '''strongly connected components''' of a directed graph form a [[partition of a set|partition]] into [[subgraph (graph theory)|subgraph]]s that are themselves strongly connected. It is possible to test the strong [[connectivity (graph theory)|connectivity]] of a graph, or to find its strongly connected components, in [[linear time]] (that is, Θ(''V''&nbsp;+&nbsp;''E'')).

==Definitions==
A [[directed graph]] is called '''strongly connected''' if there is a [[path (graph theory)|path]] in each direction between each pair of vertices of the graph. That is, a path exists from the first vertex in the pair to the second, and another path exists from the second vertex to the first.
In a directed graph ''G'' that may not itself be strongly connected, a pair of vertices ''u'' and ''v'' are said to be strongly connected to each other if there is a path in each direction between them.

The [[binary relation]] of being strongly connected is an [[equivalence relation]], and the [[induced subgraph]]s of its [[equivalence class]]es are called '''strongly connected components'''.
Equivalently, a '''strongly connected component''' of a directed graph ''G'' is a subgraph that is strongly connected, and is [[maximal element|maximal]] with this property: no additional edges or vertices from ''G'' can be included in the subgraph without breaking its property of being strongly connected. The collection of strongly connected components forms a partition of the set of vertices of ''G''. A strongly connected component ''C'' is called ''trivial'' when ''C'' consists of a single vertex which is not connected to itself with an edge, and ''non-trivial'' otherwise.<ref>{{citation
 | last1 = Nuutila | first1 = Esko
 | last2 = Soisalon-Soininen | first2 = Eljas
 | doi = 10.1016/0020-0190(94)90047-7
 | issue = 1
 | journal = Information Processing Letters
 | pages = 9–14
 | title = On finding the strongly connected components in a directed graph
 | volume = 49
 | year = 1994}}</ref>

[[File:Graph Condensation.svg|thumb|upright=1.5|The yellow [[directed acyclic graph]] is the condensation of the blue directed graph. It is formed by contracting each strongly connected component of the blue graph into a single yellow vertex.]]
If each strongly connected component is [[vertex contraction|contracted]] to a single vertex, the resulting graph is a [[directed acyclic graph]], the '''condensation''' of ''G''. A directed graph is acyclic if and only if it has no strongly connected subgraphs with more than one vertex, because a [[directed cycle]] is strongly connected and every non-trivial strongly connected component contains at least one directed cycle.

==Algorithms==
===DFS-based linear-time algorithms===
Several [[algorithm]]s based on [[depth-first search]] compute strongly connected components in linear time.
*[[Kosaraju's algorithm]] uses two passes of depth-first search. The first, in the original graph, is used to choose the order in which the outer loop of the second depth-first search tests vertices for having been visited already and [[recursion|recursively]] explores them if not. The second depth-first search is on the [[transpose graph]] of the original graph, and each recursive exploration finds a single new strongly connected component.<ref>[[Thomas H. Cormen]], [[Charles E. Leiserson]], [[Ronald L. Rivest]], and [[Clifford Stein]]. ''[[Introduction to Algorithms]]'', Second Edition. MIT Press and McGraw-Hill, 2001. {{isbn|0-262-03293-7}}. Section 22.5, pp.&nbsp;552&ndash;557.</ref><ref name="hong2013fast" /> It is named after [[S. Rao Kosaraju]], who described it (but did not publish his results) in 1978; [[Micha Sharir]] later published it in 1981.<ref>{{citation|doi=10.1016/0898-1221(81)90008-0|title=A strong-connectivity algorithm and its applications in data flow analysis|year=1981|last1=Sharir|first1=Micha|author-link1=Micha Sharir|journal=Computers & Mathematics with Applications|volume=7|pages=67–72|doi-access=free}}</ref>
*[[Tarjan's strongly connected components algorithm]], published by [[Robert Tarjan]] in 1972,<ref>{{citation|first=R. E.|last=Tarjan|author-link=Robert Tarjan|title=Depth-first search and linear graph algorithms|journal=[[SIAM Journal on Computing]]|volume=1|year=1972|issue=2|pages=146–160|doi=10.1137/0201010|s2cid=16467262 }}</ref> performs a single pass of depth-first search. It maintains a [[Stack (abstract data type)|stack]] of vertices that have been explored by the search but not yet assigned to a component, and calculates "low numbers" of each vertex (an index number of the highest ancestor reachable in one step from a descendant of the vertex) which it uses to determine when a set of vertices should be popped off the stack into a new component.
*The [[path-based strong component algorithm]] uses a depth-first search, like Tarjan's algorithm, but with two stacks. One of the stacks is used to keep track of the vertices not yet assigned to components, while the other keeps track of the current path in the depth-first search tree. The first linear time version of this algorithm was published by [[Edsger W. Dijkstra]] in 1976.<ref>{{citation
 | last = Dijkstra | first = Edsger | author-link = Edsger Dijkstra
 | location = NJ
 | at = Ch.&nbsp;25
 | publisher = Prentice Hall
 | title = A Discipline of Programming
 | year = 1976}}.</ref>
Although Kosaraju's algorithm is conceptually simple, Tarjan's and the path-based algorithm require only one depth-first search rather than two.

===Reachability-based algorithms===

Previous linear-time algorithms are based on depth-first search which is generally considered hard to parallelize.  Fleischer et al.<ref name="Fleischer">{{citation|doi=10.1007/3-540-45591-4_68|isbn=978-3-540-67442-9|chapter=On Identifying Strongly Connected Components in Parallel|title=Parallel and Distributed Processing|series=Lecture Notes in Computer Science|year=2000|last1=Fleischer|first1=Lisa K.|last2=Hendrickson|first2=Bruce|last3=Pınar|first3=Ali|volume=1800|pages=505–511|chapter-url=https://cfwebprod.sandia.gov/cfdocs/CompResearch/docs/irreg00.pdf}}</ref> in 2000 proposed a [[Divide and conquer algorithm|divide-and-conquer]] approach based on [[reachability]] queries, and such algorithms are usually called reachability-based SCC algorithms.  The idea of this approach is to pick a random pivot vertex and apply forward and backward reachability queries from this vertex.  The two queries partition the vertex set into 4 subsets: vertices reached by both, either one, or none of the searches.  One can show that a strongly connected component has to be contained in one of the subsets.  The vertex subset reached by both searches forms a strongly connected component, and the algorithm then recurses on the other 3 subsets.

The expected sequential running time of this algorithm is shown to be O(''n''&nbsp;log&thinsp;''n''), a factor of O(log&thinsp;''n'') more than the classic algorithms.  The parallelism comes from: (1) the reachability queries can be parallelized more easily (e.g. by a [[breadth-first search]] (BFS), and it can be fast if the [[diameter (graph theory)|diameter]] of the graph is small); and (2) the independence between the subtasks in the divide-and-conquer process.
This algorithm performs well on real-world graphs,<ref name="hong2013fast">{{citation|doi=10.1145/2503210.2503246|isbn=9781450323789|chapter=On fast parallel detection of strongly connected components (SCC) in small-world graphs|title=Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis - SC '13|year=2013|last1=Hong|first1=Sungpack|last2=Rodia|first2=Nicole C.|last3=Olukotun|first3=Kunle|pages=1–11|s2cid=2156324 |chapter-url=https://ppl.stanford.edu/papers/sc13-hong.pdf}}</ref> but does not have theoretical guarantee on the parallelism (consider if a graph has no edges, the algorithm requires O(''n'') levels of recursions).

Blelloch et al.<ref name="Parallel">{{citation|doi=10.1145/2935764.2935766|isbn=9781450342100|chapter=Parallelism in Randomized Incremental Algorithms|title=Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures - SPAA '16|year=2016|last1=Blelloch|first1=Guy E.|last2=Gu|first2=Yan|last3=Shun|first3=Julian|last4=Sun|first4=Yihan|pages=467–478|chapter-url=https://people.csail.mit.edu/guyan/paper/SPAA16/Incremental.pdf|doi-access=free|hdl=1721.1/146176|hdl-access=free}}.</ref> in 2016 shows that if the reachability queries are applied in a random order, the cost bound of O(''n''&nbsp;log&thinsp;''n'') still holds.  Furthermore, the queries then can be batched in a prefix-doubling manner (i.e. 1, 2, 4, 8 queries) and run simultaneously in one round.  The overall [[Analysis of parallel algorithms|span]] of this algorithm is log<sub>2</sub>&thinsp;''n'' reachability queries, which is probably the optimal parallelism that can be achieved using the reachability-based approach.

===Generating random strongly connected graphs===

Peter M. Maurer describes an algorithm for generating random strongly connected graphs,<ref>{{Citation|author=Maurer, P. M.|title=Generating strongly connected random graphs|publisher=CSREA Press|isbn=978-1-60132-465-8|series=Int'l Conf. Modeling, Sim. and Vis. Methods MSV'17|date=February 2018 |access-date=December 27, 2019|url=https://csce.ucmss.com/cr/books/2017/LFS/CSREA2017/MSV3359.pdf}}</ref> based on a modification of an algorithm for [[strong connectivity augmentation]], the problem of adding as few edges as possible to make a graph strongly connected.  When used in conjunction with the Gilbert or Erdős-Rényi models with node relabelling, the algorithm is capable of generating any strongly connected graph on ''n'' nodes, without restriction on the kinds of structures that can be generated.

==Applications==
Algorithms for finding strongly connected components may be used to solve [[2-satisfiability]] problems (systems of Boolean variables with constraints on the values of pairs of variables): as {{harvtxt|Aspvall|Plass|Tarjan|1979}} showed, a 2-satisfiability instance is unsatisfiable if and only if there is a variable ''v'' such that ''v'' and its negation are both contained in the same strongly connected component of the [[implication graph]] of the instance.<ref>{{citation
 | last1 = Aspvall | first1 = Bengt
 | last2 = Plass | first2 = Michael F.
 | author-link3 = Robert Tarjan | last3 = Tarjan | first3 = Robert E.
 | title = A linear-time algorithm for testing the truth of certain quantified boolean formulas
 | journal = Information Processing Letters
 | volume = 8 | issue = 3 | pages = 121–123 | year = 1979
 | doi = 10.1016/0020-0190(79)90002-4}}.</ref>

Strongly connected components are also used to compute the [[Dulmage–Mendelsohn decomposition]], a classification of the edges of a [[bipartite graph]], according to whether or not they can be part of a [[perfect matching]] in the graph.<ref>{{citation |title=Coverings of bipartite graphs |first1=A. L. |last1=Dulmage |author-link2=Nathan Mendelsohn |first2=N. S. |last2=Mendelsohn |name-list-style=amp |journal=Can. J. Math. |year=1958 |volume=10 |pages=517–534 |doi=10.4153/cjm-1958-052-0|s2cid=123363425 |doi-access=free }}.</ref>

==Related results==
A directed graph is strongly connected if and only if it has an [[ear decomposition]], a partition of the edges into a sequence of directed paths and cycles such that the first subgraph in the sequence is a cycle, and each subsequent subgraph is either a cycle sharing one vertex with previous subgraphs, or a path sharing its two endpoints with previous subgraphs.

According to [[Robbins' theorem]], an undirected graph may be [[graph orientation|oriented]] in such a way that it becomes strongly connected, if and only if it is [[k-edge-connected graph|2-edge-connected]]. One way to prove this result is to find an ear decomposition of the underlying undirected graph and then orient each ear consistently.<ref>{{citation
 | last = Robbins | first = H. E. | author-link = Herbert Robbins
 | journal = [[American Mathematical Monthly]]
 | jstor = 2303897
 | pages = 281–283
 | title = A theorem on graphs, with an application to a problem on traffic control
 | volume = 46
 | issue = 5 | year = 1939
 | doi=10.2307/2303897 }}.</ref>

==See also==
* [[Clique (graph theory)]]
* [[Connected component (graph theory)]]
* [[Modular decomposition]]
* [[Weak component]]

== References ==
{{reflist}}

== External links ==
* [http://code.google.com/p/jbpt/ Java implementation for computation of strongly connected components] in the jBPT library (see StronglyConnectedComponents class).
* [http://www.geeksforgeeks.org/tarjan-algorithm-find-strongly-connected-components/ C++ implementation of Strongly Connected Components]

[[Category:Graph connectivity]]
[[Category:Directed graphs]]