Minimum spanning tree
{{Short description|Least-weight tree connecting graph vertices}} {{Use American English|date = April 2019}} [[File:Minimum spanning tree.svg|thumb|300px|right|A [[planar graph]] and its minimum spanning tree. Each edge is labeled with its weight, which here is roughly proportional to its length.]] A '''minimum spanning tree''' ('''MST''') or '''minimum weight spanning tree''' is a subset of the edges of a [[connected graph|connected]], edge-weighted undirected [[Graph (discrete mathematics)|graph]] that connects all the [[Vertex (graph theory)|vertices]] together, without any [[cycle (graph theory)|cycle]]s and with the minimum possible total edge weight.<ref name="Numpy and Scipy Documentation — Numpy and Scipy documentation">{{cite web | title=scipy.sparse.csgraph.minimum_spanning_tree - SciPy v1.7.1 Manual | website=Numpy and Scipy Documentation — Numpy and Scipy documentation | url=https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csgraph.minimum_spanning_tree.html | access-date=2021-12-10 | quote=A minimum spanning tree is a graph consisting of the subset of edges which together connect all connected nodes, while minimizing the total sum of weights on the edges.}}</ref> That is, it is a [[spanning tree]] whose sum of edge weights is as small as possible.<ref name="NetworkX 2.6.2 documentation">{{cite web | title=networkx.algorithms.tree.mst.minimum_spanning_edges | website=NetworkX 2.6.2 documentation | url=https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.tree.mst.minimum_spanning_edges.html | access-date=2021-12-13 | quote=A minimum spanning tree is a subgraph of the graph (a tree) with the minimum sum of edge weights. 
A spanning forest is a union of the spanning trees for each connected component of the graph.}}</ref> More generally, any edge-weighted undirected graph (not necessarily connected) has a '''minimum spanning forest''', which is a union of the minimum spanning trees for its [[connected component (graph theory)|connected components]]. There are many use cases for minimum spanning trees. One example is a telecommunications company trying to lay cable in a new neighborhood. If it is constrained to bury the cable only along certain paths (e.g. roads), then there would be a graph containing the points (e.g. houses) connected by those paths. Some of the paths might be more expensive, because they are longer, or require the cable to be buried deeper; these paths would be represented by edges with larger weights. Currency is an acceptable unit for edge weight – there is no requirement for edge lengths to obey normal rules of geometry such as the [[triangle inequality]]. A ''spanning tree'' for that graph would be a subset of those paths that has no cycles but still connects every house; there might be several spanning trees possible. A ''minimum spanning tree'' would be one with the lowest total cost, representing the least expensive path for laying the cable. ==Properties== ===Possible multiplicity=== If there are {{mvar|n}} vertices in the graph, then each spanning tree has {{math|''n'' − 1}} edges. [[File:Multiple minimum spanning trees.svg|thumb|This figure shows there may be more than one minimum spanning tree in a graph. In the figure, the two trees below the graph are two possibilities of minimum spanning tree of the given graph.]] There may be several minimum spanning trees of the same weight; in particular, if all the edge weights of a given graph are the same, then every spanning tree of that graph is minimum. ===Uniqueness=== ''If each edge has a distinct weight then there will be only one, unique minimum spanning tree''. 
This is true in many realistic situations, such as the telecommunications company example above, where it's unlikely any two paths have ''exactly'' the same cost. This generalizes to spanning forests as well.

Proof:
# [[Proof by contradiction|Assume the contrary]], that there are two different MSTs {{mvar|A}} and {{mvar|B}}.
# Since {{mvar|A}} and {{mvar|B}} differ despite containing the same nodes, there is at least one edge that belongs to one but not the other. Among such edges, let {{math|''e''{{sub|1}}}} be the one with least weight; this choice is unique because the edge weights are all distinct. Without loss of generality, assume {{math|''e''{{sub|1}}}} is in {{mvar|A}}.
# As {{mvar|B}} is an MST, {{math|{''e''{{sub|1}}} ∪ ''B''}} must contain a cycle {{mvar|C}} with {{math|''e''{{sub|1}}}}.
# As a tree, {{mvar|A}} contains no cycles, therefore {{mvar|C}} must have an edge {{math|''e''{{sub|2}}}} that is not in {{mvar|A}}.
# Since {{math|''e''{{sub|1}}}} was chosen as the unique lowest-weight edge among those belonging to exactly one of {{mvar|A}} and {{mvar|B}}, the weight of {{math|''e''{{sub|2}}}} must be greater than the weight of {{math|''e''{{sub|1}}}}.
# As {{math|''e''{{sub|1}}}} and {{math|''e''{{sub|2}}}} are part of the cycle {{mvar|C}}, replacing {{math|''e''{{sub|2}}}} with {{math|''e''{{sub|1}}}} in {{mvar|B}} therefore yields a spanning tree with a smaller weight.
# This contradicts the assumption that {{mvar|B}} is an MST.
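The multiplicity and uniqueness statements above can be checked by brute force on small graphs. The following is a minimal sketch, not an efficient method: it enumerates every set of {{math|''n'' − 1}} edges, keeps those that are acyclic (and hence spanning trees), and returns the ones of least total weight. The function names and the <code>(u, v, weight)</code> edge representation are assumptions made for this illustration.

```python
from itertools import combinations


def spanning_trees(n, edges):
    """Yield every spanning tree of a graph on vertices 0..n-1.

    Edges are (u, v, weight) triples. A set of n - 1 edges with no
    cycle is necessarily a spanning tree, so only acyclicity is tested.
    """
    for subset in combinations(edges, n - 1):
        parent = list(range(n))  # fresh union-find structure per candidate

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        acyclic = True
        for u, v, _w in subset:
            ru, rv = find(u), find(v)
            if ru == rv:  # u and v already connected: this edge closes a cycle
                acyclic = False
                break
            parent[ru] = rv
        if acyclic:
            yield subset


def minimum_spanning_trees(n, edges):
    """Return all spanning trees of minimum total weight (exponential time)."""
    trees = list(spanning_trees(n, edges))
    best = min(sum(w for _, _, w in t) for t in trees)
    return [t for t in trees if sum(w for _, _, w in t) == best]


# Distinct weights: exactly one minimum spanning tree.
distinct = [(0, 1, 1), (1, 2, 2), (0, 2, 3), (2, 3, 4), (1, 3, 5)]
print(len(minimum_spanning_trees(4, distinct)))

# Equal weights on a triangle: every spanning tree is minimum.
triangle = [(0, 1, 1), (1, 2, 1), (0, 2, 1)]
print(len(minimum_spanning_trees(3, triangle)))
```

With distinct weights the enumeration finds a single tree; on the equal-weight triangle all three spanning trees are returned.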
More generally, if the edge weights are not all distinct then only the (multi-)set of weights in minimum spanning trees is certain to be unique; it is the same for all minimum spanning trees.<ref>{{cite web|url=https://cs.stackexchange.com/q/2204 |title=Do the minimum spanning trees of a weighted graph have the same number of edges with a given weight?|website=cs.stackexchange.com|access-date=4 April 2018}}</ref> ===Minimum-cost subgraph=== If the weights are ''positive'', then a minimum spanning tree is, in fact, a minimum-cost [[Glossary of graph theory#Subgraphs|subgraph]] connecting all vertices, since if a subgraph contains a [[Path (graph theory)|cycle]], removing any edge along that cycle will decrease its cost and preserve connectivity. ===Cycle property=== ''For any cycle {{mvar|C}} in the graph, if the weight of an edge {{mvar|e}} of {{mvar|C}} is larger than any of the individual weights of all other edges of {{mvar|C}}, then this edge cannot belong to an MST.'' Proof: [[Proof by contradiction|Assume the contrary]], i.e. that {{mvar|e}} belongs to an MST {{math|''T''{{sub|1}}}}. Then deleting {{mvar|e}} will break {{math|''T''{{sub|1}}}} into two subtrees with the two ends of {{mvar|e}} in different subtrees. The remainder of {{mvar|C}} reconnects the subtrees, hence there is an edge {{mvar|f}} of {{mvar|C}} with ends in different subtrees, i.e., it reconnects the subtrees into a tree {{math|''T''{{sub|2}}}} with weight less than that of {{math|''T''{{sub|1}}}}, because the weight of {{mvar|f}} is less than the weight of {{mvar|e}}. ===Cut property=== [[File:Msp-the-cut-correct.svg|thumb|400px|This figure shows the cut property of MSTs. {{mvar|T}} is the only MST of the given graph. If {{math|1=''S'' = {''A'',''B'',''D'',''E''},}} thus {{math|1=''V'' – ''S'' = {''C'',''F''},}} then there are 3 possibilities of the edge across the cut {{math|(''S'', ''V'' – ''S'')}}, they are edges {{mvar|BC}}, {{mvar|EC}}, {{mvar|EF}} of the original graph. 
Then {{mvar|e}} is one of the minimum-weight edges for the cut, and therefore {{math|''S'' ∪ {''e''} }} is part of the MST {{mvar|T}}.]]
''For any [[cut (graph theory)|cut]] {{mvar|C}} of the graph, if the weight of an edge {{mvar|e}} in the cut-set of {{mvar|C}} is strictly smaller than the weights of all other edges of the cut-set of {{mvar|C}}, then this edge belongs to all MSTs of the graph.''

Proof: [[Reductio ad absurdum|Assume]] that there is an MST {{mvar|T}} that does not contain {{mvar|e}}. Adding {{mvar|e}} to {{mvar|T}} will produce a cycle that crosses the cut once at {{mvar|e}} and crosses back at another edge {{mvar|e'}}. Deleting {{mvar|e'}} we get a spanning tree {{math|''T''∖{''e' ''} ∪ {''e''} }} of strictly smaller weight than {{mvar|T}}. This contradicts the assumption that {{mvar|T}} is an MST.

By a similar argument, if more than one edge is of minimum weight across a cut, then each such edge is contained in some minimum spanning tree.

===Minimum-cost edge===
''If the minimum cost edge {{mvar|e}} of a graph is unique, then this edge is included in any MST.''

Proof: if {{mvar|e}} were not included in the MST, adding {{mvar|e}} to the MST would form a cycle, and removing any of the (larger cost) edges in that cycle would yield a spanning tree of smaller weight.

===Contraction===
If {{mvar|T}} is a tree of MST edges, then we can ''contract'' {{mvar|T}} into a single vertex while maintaining the invariant that the MST of the contracted graph plus {{mvar|T}} gives the MST for the graph before contraction.<ref name=PettieRamachandran2002/>

==Algorithms==
In all of the algorithms below, {{mvar|m}} is the number of edges in the graph and {{mvar|n}} is the number of vertices.

=== Classic algorithms ===
The first algorithm for finding a minimum spanning tree was developed by Czech scientist [[Otakar Borůvka]] in 1926 (see [[Borůvka's algorithm]]). Its purpose was an efficient electrical coverage of [[Moravia]]. The algorithm proceeds in a sequence of stages.
In each stage, called a ''Borůvka step'', it identifies a forest {{mvar|F}} consisting of the minimum-weight edge incident to each vertex in the graph {{mvar|G}}, then forms the graph {{math|1=''G''{{sub|1}} = ''G'' \ ''F''}} as the input to the next step. Here {{math|''G'' \ ''F''}} denotes the graph derived from {{mvar|G}} by contracting edges in {{mvar|F}} (by the [[#Cut property|Cut property]], these edges belong to the MST). Each Borůvka step takes linear time. Since the number of vertices is reduced by at least half in each step, Borůvka's algorithm takes {{math|''O''(''m'' log ''n'')}} time.<ref name=PettieRamachandran2002/>

A second algorithm is [[Prim's algorithm]], which was invented by [[Vojtěch Jarník]] in 1930 and rediscovered by [[Robert C. Prim|Prim]] in 1957 and [[Edsger W. Dijkstra|Dijkstra]] in 1959. It grows the MST ({{mvar|T}}) one edge at a time. Initially, {{mvar|T}} contains an arbitrary vertex. In each step, {{mvar|T}} is augmented with a least-weight edge {{math|(''x'',''y'')}} such that {{mvar|x}} is in {{mvar|T}} and {{mvar|y}} is not yet in {{mvar|T}}. By the [[#Cut property|Cut property]], all edges added to {{mvar|T}} are in the MST. Its run-time is either {{math|''O''(''m'' log ''n'')}} or {{math|''O''(''m'' + ''n'' log ''n'')}}, depending on the data structures used.

A third algorithm commonly in use is [[Kruskal's algorithm]], which also takes {{math|''O''(''m'' log ''n'')}} time.

A fourth algorithm, not as commonly used, is the [[reverse-delete algorithm]], which is the reverse of Kruskal's algorithm. Its runtime is {{math|O(''m'' log ''n'' (log log ''n''){{sup|3}})}}.

All four of these are [[greedy algorithm]]s. Since they run in polynomial time, the problem of finding such trees is in '''[[FP (complexity)|FP]]''', and related [[decision problem]]s such as determining whether a particular edge is in the MST or determining if the minimum total weight exceeds a certain value are in '''[[P (complexity)|P]]'''.
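As an illustration of the greedy approach, the following is a minimal sketch of Kruskal's and Prim's algorithms. It assumes vertices numbered {{math|0}} to {{math|''n'' − 1}}, edges given as <code>(u, v, weight)</code> triples, and (for Prim's algorithm) a connected graph; the function names are chosen for this example only. Kruskal's algorithm here uses a simple union-find structure with path compression, and Prim's algorithm uses a binary heap, which corresponds to the {{math|''O''(''m'' log ''n'')}} variant rather than the Fibonacci-heap variant.

```python
import heapq


def kruskal(n, edges):
    """Scan edges by increasing weight; keep each edge that joins two components."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression (halving)
            x = parent[x]
        return x

    tree = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:  # edge connects two different components
            parent[ru] = rv
            tree.append((u, v, w))
    return tree


def prim(n, edges, start=0):
    """Grow a tree from `start`, always adding a least-weight edge leaving the tree."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((w, u, v))
        adj[v].append((w, v, u))

    in_tree = [False] * n
    in_tree[start] = True
    heap = list(adj[start])  # candidate edges (weight, x in tree, y)
    heapq.heapify(heap)
    tree = []
    while heap and len(tree) < n - 1:
        w, x, y = heapq.heappop(heap)
        if in_tree[y]:
            continue  # stale entry: both ends are already in the tree
        in_tree[y] = True
        tree.append((x, y, w))
        for item in adj[y]:
            if not in_tree[item[2]]:
                heapq.heappush(heap, item)
    return tree


# Hypothetical example graph on 7 vertices.
example = [(0, 1, 7), (0, 3, 5), (1, 2, 8), (1, 3, 9), (1, 4, 7), (2, 4, 5),
           (3, 4, 15), (3, 5, 6), (4, 5, 8), (4, 6, 9), (5, 6, 11)]
print(sum(w for _, _, w in kruskal(7, example)))
print(sum(w for _, _, w in prim(7, example)))
```

Both functions return a list of {{math|''n'' − 1}} edges on a connected input, and the two totals agree even when the trees themselves differ because of ties.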
=== Faster algorithms === Several researchers have tried to find more computationally-efficient algorithms. In a comparison model, in which the only allowed operations on edge weights are pairwise comparisons, {{harvtxt|Karger|Klein|Tarjan|1995}} found a [[Expected linear time MST algorithm|linear time randomized algorithm]] based on a combination of Borůvka's algorithm and the reverse-delete algorithm.<ref>{{citation |last1=Karger |first1=David R. |title=A randomized linear-time algorithm to find minimum spanning trees |journal=[[Journal of the Association for Computing Machinery]] |volume=42 |issue=2 |pages=321–328 |year=1995 |doi=10.1145/201019.201022 |mr=1409738 |s2cid=832583 |last2=Klein |first2=Philip N. |last3=Tarjan |first3=Robert E. |author1-link=David Karger |author-link2=Philip N. Klein |author3-link=Robert Tarjan |doi-access=free}}</ref><ref>{{citation | last1 = Pettie | first1 = Seth | last2 = Ramachandran | first2 = Vijaya | author2-link = Vijaya Ramachandran | contribution = Minimizing randomness in minimum spanning tree, parallel connectivity, and set maxima algorithms | location = San Francisco, California | pages = 713–722 | title = Proc. 
13th ACM-SIAM Symposium on Discrete Algorithms (SODA '02) | contribution-url = http://portal.acm.org/citation.cfm?id=545477 | year = 2002| isbn = 9780898715132 }}.</ref> The fastest non-randomized comparison-based algorithm with known complexity, by [[Bernard Chazelle]], is based on the [[soft heap]], an approximate priority queue.<ref name=Chazelle2000>{{citation | last = Chazelle | first = Bernard | author-link = Bernard Chazelle | doi = 10.1145/355541.355562 | mr = 1866456 | issue = 6 | journal = [[Journal of the Association for Computing Machinery]] | pages = 1028–1047 | title = A minimum spanning tree algorithm with inverse-Ackermann type complexity | volume = 47 | year = 2000| s2cid = 6276962 | doi-access = free }}.</ref><ref>{{citation | last = Chazelle | first = Bernard | author-link = Bernard Chazelle | doi = 10.1145/355541.355554 | mr = 1866455 | issue = 6 | journal = [[Journal of the Association for Computing Machinery]] | pages = 1012–1027 | title = The soft heap: an approximate priority queue with optimal error rate | volume = 47 | year = 2000| s2cid = 12556140| doi-access = free }}.</ref> Its running time is {{math|''[[Big O notation|O]]''(''m'' α(''m'',''n''))}}, where {{math|α}} is the classical functional [[Ackermann function#Inverse|inverse of the Ackermann function]]. The function {{math|α}} grows extremely slowly, so that for all practical purposes it may be considered a constant no greater than 4; thus Chazelle's algorithm takes very close to linear time. === Linear-time algorithms in special cases === ==== Dense graphs ==== If the graph is dense (i.e. {{math|''m''/''n'' ≥ log log log ''n'')}}, then a deterministic algorithm by Fredman and Tarjan finds the MST in time {{math|O(''m'')}}.<ref>{{Cite journal | doi = 10.1145/28869.28874| title = Fibonacci heaps and their uses in improved network optimization algorithms| journal = Journal of the ACM| volume = 34| issue = 3| pages = 596| year = 1987| last1 = Fredman | first1 = M. L. 
| last2 = Tarjan | first2 = R. E. | s2cid = 7904683| doi-access = free}}</ref> The algorithm executes a number of phases. Each phase executes [[Prim's algorithm]] many times, each for a limited number of steps. The run-time of each phase is {{math|O(''m'' + ''n'')}}. If the number of vertices before a phase is {{mvar|n'}}, the number of vertices remaining after a phase is at most <math>\tfrac{n'}{2^{m/n'}}</math>. Hence, at most {{math|log*''n''}} phases are needed, which gives a linear run-time for dense graphs.<ref name=PettieRamachandran2002/> There are other algorithms that work in linear time on dense graphs.<ref name=Chazelle2000/><ref>{{Cite journal | doi = 10.1007/bf02579168| title = Efficient algorithms for finding minimum spanning trees in undirected and directed graphs| journal = Combinatorica| volume = 6| issue = 2| pages = 109| year = 1986| last1 = Gabow | first1 = H. N. | author1-link = Harold N. Gabow | last2 = Galil | first2 = Z. | last3 = Spencer | first3 = T. | last4 = Tarjan | first4 = R. E. | s2cid = 35618095}}</ref> ==== Integer weights ==== If the edge weights are integers represented in binary, then deterministic algorithms are known that solve the problem in {{math|''O''(''m'' + ''n'')}} integer operations.<ref>{{citation | last1 = Fredman | first1 = M. L. | author1-link = Michael Fredman | last2 = Willard | first2 = D. E. | author2-link = Dan Willard | doi = 10.1016/S0022-0000(05)80064-9 | mr = 1279413 | issue = 3 | journal = [[Journal of Computer and System Sciences]] | pages = 533–551 | title = Trans-dichotomous algorithms for minimum spanning trees and shortest paths | volume = 48 | year = 1994| doi-access = free }}.</ref> Whether the problem can be solved ''deterministically'' for a ''general graph'' in ''linear time'' by a comparison-based algorithm remains an open question. 
=== Decision trees === Given graph {{mvar|G}} where the nodes and edges are fixed but the weights are unknown, it is possible to construct a binary [[decision tree]] (DT) for calculating the MST for any permutation of weights. Each internal node of the DT contains a comparison between two edges, e.g. "Is the weight of the edge between {{mvar|x}} and {{mvar|y}} larger than the weight of the edge between {{mvar|w}} and {{mvar|z}}?". The two children of the node correspond to the two possible answers "yes" or "no". In each leaf of the DT, there is a list of edges from {{mvar|G}} that correspond to an MST. The runtime complexity of a DT is the largest number of queries required to find the MST, which is just the depth of the DT. A DT for a graph {{mvar|G}} is called ''optimal'' if it has the smallest depth of all correct DTs for {{mvar|G}}. For every integer {{mvar|r}}, it is possible to find optimal decision trees for all graphs on {{mvar|r}} vertices by [[brute-force search]]. This search proceeds in two steps. '''A. Generating all potential DTs''' * There are <math>2^{r \choose 2}</math> different graphs on {{mvar|r}} vertices. * For each graph, an MST can always be found using {{math|''r''(''r'' – 1)}} comparisons, e.g. by [[Prim's algorithm]]. * Hence, the depth of an optimal DT is less than {{math|''r''{{sup|2}}}}. * Hence, the number of internal nodes in an optimal DT is less than <math>2^{r^2}</math>. * Every internal node compares two edges. The number of edges is at most {{math|''r''{{sup|2}}}} so the different number of comparisons is at most {{math|''r''{{sup|4}}}}. * Hence, the number of potential DTs is less than <math>{(r^4)}^{(2^{r^2})} = r^{2^{(r^2+2)}}.</math> '''B. Identifying the correct DTs''' To check if a DT is correct, it should be checked on all possible permutations of the edge weights. * The number of such permutations is at most {{math|(''r''{{sup|2}})!}}. 
* For each permutation, solve the MST problem on the given graph using any existing algorithm, and compare the result to the answer given by the DT. * The running time of any MST algorithm is at most {{math|''r''{{sup|2}}}}, so the total time required to check all permutations is at most {{math|(''r''{{sup|2}} + 1)!}}. Hence, the total time required for finding an optimal DT for ''all'' graphs with {{mvar|r}} vertices is:<ref name=PettieRamachandran2002/> :<math>2^{r \choose 2} \cdot r^{2^{(r^2+2)}} \cdot (r^2+1)!,</math> which is less than :<math>2^{2^{r^2+o(r)}}.</math> {{See also|Decision tree model}} === Optimal algorithm === [[Seth Pettie]] and [[Vijaya Ramachandran]] have found a {{not a typo|provably}} optimal deterministic comparison-based minimum spanning tree algorithm.<ref name=PettieRamachandran2002>{{citation | last1 = Pettie | first1 = Seth | last2 = Ramachandran | first2 = Vijaya | doi = 10.1145/505241.505243 | mr = 2148431 | issue = 1 | journal = [[Journal of the Association for Computing Machinery]] | pages = 16–34 | title = An optimal minimum spanning tree algorithm | url = https://web.eecs.umich.edu/~pettie/papers/jacm-optmsf.pdf | volume = 49 | year = 2002| s2cid = 5362916 }}.</ref> The following is a simplified description of the algorithm. # Let {{math|1=''r'' = log log log ''n''}}, where {{mvar|n}} is the number of vertices. Find all optimal decision trees on {{mvar|r}} vertices. This can be done in time {{math|''O''(''n'')}} (see [[#Decision trees|Decision trees]] above). # Partition the graph to components with at most {{mvar|r}} vertices in each component. This partition uses a [[soft heap]], which "corrupts" a small number of the edges of the graph. # Use the optimal decision trees to find an MST for the uncorrupted subgraph within each component. 
# Contract each connected component spanned by the MSTs to a single vertex, and apply any algorithm which works on [[#Dense graphs|dense graphs]] in time {{math|''O''(''m'')}} to the contraction of the uncorrupted subgraph.
# Add back the corrupted edges to the resulting forest to form a subgraph guaranteed to contain the minimum spanning tree, and smaller by a constant factor than the starting graph. Apply the optimal algorithm recursively to this graph.

The runtime of all steps in the algorithm is {{math|''O''(''m'')}}, ''except for the step of using the decision trees''. The runtime of this step is unknown, but it has been proved that it is optimal: no algorithm can do better than the optimal decision tree. Thus, this algorithm has the peculiar property that it is ''{{not a typo|provably}} optimal'' although its runtime complexity is ''unknown''.

=== Parallel and distributed algorithms ===
{{further|Parallel algorithms for minimum spanning trees}}
Research has also considered [[parallel algorithm]]s for the minimum spanning tree problem.
With a linear number of processors it is possible to solve the problem in {{math|''O''(log ''n'')}} time.<ref>{{citation | last1 = Chong | first1 = Ka Wong | last2 = Han | first2 = Yijie | last3 = Lam | first3 = Tak Wah | doi = 10.1145/375827.375847 | mr = 1868718 | issue = 2 | journal = [[Journal of the Association for Computing Machinery]] | pages = 297–323 | title = Concurrent threads and optimal parallel minimum spanning trees algorithm | volume = 48 | year = 2001| s2cid = 1778676 }}.</ref><ref>{{citation | last1 = Pettie | first1 = Seth | last2 = Ramachandran | first2 = Vijaya | doi = 10.1137/S0097539700371065 | mr = 1954882 | issue = 6 | journal = [[SIAM Journal on Computing]] | pages = 1879–1895 | title = A randomized time-work optimal parallel algorithm for finding a minimum spanning forest | volume = 31 | year = 2002| url = http://www.eecs.umich.edu/~pettie/papers/sicomp-randmst.pdf }}.</ref>

The problem can also be approached in a [[distributed computing|distributed manner]]. If each node is considered a computer and no node knows anything except its own connected links, one can still calculate the [[distributed minimum spanning tree]].

==MST on complete graphs with random weights==
[[Alan M. Frieze]] showed that, given a [[complete graph]] on ''n'' vertices with edge weights that are independent identically distributed random variables with distribution function <math>F</math> satisfying <math>F'(0) > 0</math>, the expected weight of the MST approaches <math>\zeta(3)/F'(0)</math> as ''n'' approaches [[Extended real number line|+∞]], where <math>\zeta</math> is the [[Riemann zeta function]] (more specifically, <math>\zeta(3)</math> is [[Apéry's constant]]). Frieze and [[J. Michael Steele|Steele]] also proved convergence in probability. [[Svante Janson]] proved a [[central limit theorem]] for the weight of the MST.
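Frieze's limit can be explored numerically. The following is a minimal Monte Carlo sketch, assuming weights drawn uniformly from <math>[0,1]</math>, so that <math>F'(0) = 1</math> and the predicted limit is <math>\zeta(3) \approx 1.202</math>. It uses the dense {{math|''O''(''n''{{sup|2}})}} form of Prim's algorithm on a weight matrix; the function names, the number of trials and the seed are choices made for this illustration.

```python
import random


def random_mst_weight(n, rng):
    """Weight of the MST of a complete graph on n vertices with uniform [0, 1] weights."""
    # Symmetric matrix of independent edge weights.
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w[i][j] = w[j][i] = rng.random()

    # Dense Prim: best[j] is the cheapest edge from the current tree to vertex j.
    best = w[0][:]
    in_tree = [False] * n
    in_tree[0] = True
    total = 0.0
    for _ in range(n - 1):
        v = min((j for j in range(n) if not in_tree[j]), key=lambda j: best[j])
        total += best[v]
        in_tree[v] = True
        for j in range(n):
            if not in_tree[j] and w[v][j] < best[j]:
                best[j] = w[v][j]
    return total


def estimate_expected_weight(n, trials, seed=0):
    """Average MST weight over `trials` independent random complete graphs."""
    rng = random.Random(seed)
    return sum(random_mst_weight(n, rng) for _ in range(trials)) / trials


for n in (2, 3, 10, 50):
    print(n, estimate_expected_weight(n, 2000))
```

For three vertices the MST consists of the two lightest of three independent uniform edges, so its expected weight is 1/4 + 2/4 = 3/4, which gives a simple sanity check on the estimate. For larger ''n'' the estimates should drift toward <math>\zeta(3)</math>, subject to sampling error.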
For uniform random weights in <math>[0,1]</math>, the exact expected weight of the minimum spanning tree has been computed for small complete graphs.<ref>{{citation | last = Steele | first = J. Michael | author-link = J. Michael Steele | contribution = Minimal spanning trees for graphs with random edge lengths | mr = 1940139 | location = Basel | pages = 223–245 | publisher = Birkhäuser | series = Trends Math. | title = Mathematics and computer science, II (Versailles, 2002) | year = 2002}}</ref>
{| class="wikitable"
|-
!Vertices
!Expected weight
!Approximate expected weight
|-
|2
|{{center|{{sfrac|1|2}}}}
|0.5
|-
|3
|{{center|{{sfrac|3|4}}}}
|0.75
|-
|4
|{{center|{{sfrac|31|35}}}}
|0.8857143
|-
|5
|{{center|{{sfrac|893|924}}}}
|0.9664502
|-
|6
|{{center|{{sfrac|278|273}}}}
|1.0183151
|-
|7
|{{center|{{sfrac|30739|29172}}}}
|1.053716
|-
|8
|{{center|{{sfrac|199462271|184848378}}}}
|1.0790588
|-
|9
|{{center|{{sfrac|126510063932|115228853025}}}}
|1.0979027
|}

== Fractional variant{{Anchor|fractional}} ==
There is a fractional variant of the MST, in which each edge is allowed to appear "fractionally". Formally, a '''fractional spanning set''' of a graph (''V'',''E'') is a nonnegative function ''f'' on ''E'' such that, for every non-trivial subset ''W'' of ''V'' (i.e., ''W'' is neither empty nor equal to ''V''), the sum of ''f''(''e'') over all edges connecting a node of ''W'' with a node of ''V''\''W'' is at least 1. Intuitively, ''f''(''e'') represents the fraction of ''e'' that is contained in the spanning set. A '''minimum fractional spanning set''' is a fractional spanning set for which the sum <math>\sum_{e\in E} f(e)\cdot w(e)</math> is as small as possible.

If the fractions ''f''(''e'') are forced to be in {0,1}, then the set ''T'' of edges with ''f''(''e'') = 1 is a spanning set, as every node or subset of nodes is connected to the rest of the graph by at least one edge of ''T''.
Moreover, if ''f'' minimizes<math>\sum_{e\in E} f(e)\cdot w(e)</math>, then the resulting spanning set is necessarily a tree, since if it contained a cycle, then an edge could be removed without affecting the spanning condition. So the minimum fractional spanning set problem is a relaxation of the MST problem, and can also be called the '''fractional MST problem.''' The fractional MST problem can be solved in polynomial time using the [[ellipsoid method]].<ref name=":1">{{Cite Geometric Algorithms and Combinatorial Optimization}}</ref>{{Rp|page=248}} However, if we add a requirement that ''f''(''e'') must be half-integer (that is, ''f''(''e'') must be in {0, 1/2, 1}), then the problem becomes [[NP-hard]],<ref name=":1" />{{Rp|page=248}} since it includes as a special case the [[Hamiltonian cycle problem]]: in an <math>n</math>-vertex unweighted graph, a half-integer MST of weight <math>n/2</math> can only be obtained by assigning weight 1/2 to each edge of a Hamiltonian cycle. ==Other variants== {{regular_polygon_minimum_spanning_tree.svg}} * The '''[[Steiner tree]]''' of a subset of the vertices is the minimum tree that spans the given subset. Finding the Steiner tree is [[NP-complete]].<ref>{{Garey-Johnson}}. ND12</ref> * The '''[[k-minimum spanning tree|''k''-minimum spanning tree]] (''k''-MST)''' is the tree that spans some subset of ''k'' vertices in the graph with minimum weight. * A set of '''k-smallest spanning trees''' is a subset of ''k'' spanning trees (out of all possible spanning trees) such that no spanning tree outside the subset has smaller weight.<ref>{{citation | last = Gabow | first = Harold N. | author-link = Harold N. 
Gabow | mr = 0441784 | issue = 1 | journal = [[SIAM Journal on Computing]] | pages = 139–150 | title = Two algorithms for generating weighted spanning trees in order | volume = 6 | year = 1977 | doi=10.1137/0206011}}.</ref><ref>{{citation | last = Eppstein | first = David | author-link = David Eppstein | doi = 10.1007/BF01994879 | mr = 1172188 | issue = 2 | journal = BIT | pages = 237–248 | title = Finding the ''k'' smallest spanning trees | volume = 32 | year = 1992| s2cid = 121160520 }}.</ref><ref>{{citation | last = Frederickson | first = Greg N. | doi = 10.1137/S0097539792226825 | mr = 1438526 | issue = 2 | journal = [[SIAM Journal on Computing]] | pages = 484–538 | title = Ambivalent data structures for dynamic 2-edge-connectivity and ''k'' smallest spanning trees | volume = 26 | year = 1997| url = http://docs.lib.purdue.edu/cgi/viewcontent.cgi?article=1888&context=cstech }}.</ref> (Note that this problem is unrelated to the ''k''-minimum spanning tree.) * The '''[[Euclidean minimum spanning tree]]''' is a spanning tree of a graph with edge weights corresponding to the Euclidean distance between vertices which are points in the plane (or space). * The '''[[rectilinear minimum spanning tree]]''' is a spanning tree of a graph with edge weights corresponding to the [[rectilinear distance]] between vertices which are points in the plane (or space). * The '''[[distributed minimum spanning tree]]''' is an extension of MST to the [[distributed computing|distributed model]], where each node is considered a computer and no node knows anything except its own connected links. The mathematical definition of the problem is the same but there are different approaches for a solution. * The '''[[capacitated minimum spanning tree]]''' is a tree that has a marked node (origin, or root) and each of the subtrees attached to the node contains no more than ''c'' nodes. ''c'' is called a tree capacity. 
Solving CMST optimally is [[NP-hard]],<ref>{{citation | last1 = Jothi | first1 = Raja | last2 = Raghavachari | first2 = Balaji | title = Approximation Algorithms for the Capacitated Minimum Spanning Tree Problem and Its Variants in Network Design | journal = ACM Trans. Algorithms | volume = 1 | issue = 2 | pages = 265–282 | year = 2005 | doi=10.1145/1103963.1103967 | s2cid = 8302085 }}</ref> but good heuristics such as Esau-Williams and Sharma produce solutions close to optimal in polynomial time. * The [[degree-constrained spanning tree|'''degree-constrained minimum spanning tree''']] is a MST in which each vertex is connected to no more than ''d'' other vertices, for some given number ''d''. The case ''d'' = 2 is a special case of the [[traveling salesman problem]], so the degree constrained minimum spanning tree is [[NP-hard]] in general. * An [[Arborescence (graph theory)|'''arborescence''']] is a variant of MST for [[directed graph]]s. It can be solved in <math>O(E + V \log V)</math> time using the [[Chu–Liu/Edmonds algorithm]]. * A '''maximum spanning tree''' is a spanning tree with weight greater than or equal to the weight of every other spanning tree. Such a tree can be found with algorithms such as Prim's or Kruskal's after multiplying the edge weights by -1 and solving the MST problem on the new graph. A path in the maximum spanning tree is the [[widest path problem|widest path]] in the graph between its two endpoints: among all possible paths, it maximizes the weight of the minimum-weight edge.<ref>{{citation | last = Hu | first = T. C. | author-link = T. C. 
Hu | issue = 6 | journal = Operations Research | pages = 898–900 | title = The maximum capacity route problem | volume = 9 | year = 1961 | jstor=167055 | doi=10.1287/opre.9.6.898| doi-access = }}.</ref> Maximum spanning trees find applications in [[parsing]] algorithms for [[Natural language processing|natural languages]]<ref>{{cite conference | last1 = McDonald | first1 = Ryan | last2 = Pereira | first2 = Fernando | last3 = Ribarov | first3 = Kiril | last4 = Hajič | first4 = Jan | title = Non-projective dependency parsing using spanning tree algorithms | book-title = Proc. HLT/EMNLP | year = 2005 | url = http://www.seas.upenn.edu/~strctlrn/bib/PDF/nonprojectiveHLT-EMNLP2005.pdf}}</ref> and in training algorithms for [[conditional random field]]s. * The '''dynamic MST''' problem concerns the update of a previously computed MST after an edge weight change in the original graph or the insertion/deletion of a vertex.<ref>{{citation | last1 = Spira | first1 = P. M. | last2 = Pan | first2 = A. | mr = 0378466 | issue = 3 | journal = SIAM Journal on Computing | pages = 375–380 | title = On finding and updating spanning trees and shortest paths | volume = 4 | year = 1975 | doi=10.1137/0204032| url = http://www.ics.forth.gr/~lourakis/10.1137.0204032.pdf }}.</ref><ref>{{citation | last1 = Holm | first1 = Jacob | last2 = de Lichtenberg | first2 = Kristian | last3 = Thorup | first3 = Mikkel | author3-link = Mikkel Thorup | doi = 10.1145/502090.502095 | mr = 2144928 | issue = 4 | journal = [[Journal of the Association for Computing Machinery]] | pages = 723–760 | title = Poly-logarithmic deterministic fully dynamic algorithms for connectivity, minimum spanning tree, 2-edge, and biconnectivity | volume = 48 | year = 2001| s2cid = 7273552 }}.</ref><ref>{{citation | last1 = Chin | first1 = F. | last2 = Houck | first2 = D. 
| journal = [[Journal of Computer and System Sciences]] | volume = 16 | issue = 3 | pages = 333–344 | title = Algorithms for updating minimal spanning trees | year = 1978 | doi=10.1016/0022-0000(78)90022-3| doi-access = }}.</ref>
* The '''minimum labeling spanning tree problem''' is to find a spanning tree that uses the fewest distinct labels, where each edge in the graph is associated with a label from a finite label set instead of a weight.<ref>{{citation | last1 = Chang | first1 = R.S. | last2 = Leu | first2 = S.J. | journal = [[Information Processing Letters]] | volume = 63 | issue = 5 | pages = 277–282 | title = The minimum labeling spanning trees | year = 1997 | doi=10.1016/s0020-0190(97)00127-0}}.</ref>
* A '''bottleneck edge''' is the highest weighted edge in a spanning tree. A spanning tree is a '''[[minimum bottleneck spanning tree]]''' (or '''MBST''') if the graph does not contain a spanning tree with a smaller bottleneck edge weight. An MST is necessarily an MBST ({{not a typo|provable}} by the [[#Cut property|cut property]]), but an MBST is not necessarily an MST.<ref>{{cite web|url=http://flashing-thoughts.blogspot.ru/2010/06/everything-about-bottleneck-spanning.html|title=Everything about Bottleneck Spanning Tree|website=flashing-thoughts.blogspot.ru|date=5 June 2010 |access-date=4 April 2018}}</ref><ref>{{Cite web |url=http://pages.cpsc.ucalgary.ca/~dcatalin/413/t4.pdf |title=Archived copy |access-date=2014-07-02 |archive-date=2013-06-12 |archive-url=https://web.archive.org/web/20130612080859/http://pages.cpsc.ucalgary.ca/~dcatalin/413/t4.pdf |url-status=dead }}</ref>
* A '''[[minimum-cost spanning tree game]]''' is a cooperative game in which the players must share among themselves the cost of constructing the optimal spanning tree.
* The '''[[optimal network design]]''' problem is the problem of computing a set of edges, subject to a budget constraint, that contains a spanning tree and minimizes the sum of shortest-path lengths over all pairs of nodes.
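The maximum spanning tree and bottleneck edge items above can be illustrated with a minimal sketch. It assumes a small example graph given as <code>(weight, u, v)</code> tuples with vertices numbered from 0; the function names <code>kruskal</code>, <code>maximum_spanning_tree</code> and <code>bottleneck_weight</code> are hypothetical helpers chosen for this illustration, not part of any library. Kruskal's algorithm is used here as one possible MST routine; Prim's would serve equally well.

```python
def kruskal(num_vertices, edges):
    """Return the edges of a minimum spanning tree (or forest).

    edges: iterable of (weight, u, v) with vertices 0..num_vertices-1.
    """
    parent = list(range(num_vertices))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for weight, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # edge joins two components, so it closes no cycle
            parent[root_u] = root_v
            tree.append((weight, u, v))
    return tree


def maximum_spanning_tree(num_vertices, edges):
    """Negate the weights, solve the MST problem, then negate back."""
    negated = [(-w, u, v) for w, u, v in edges]
    return [(-w, u, v) for w, u, v in kruskal(num_vertices, negated)]


def bottleneck_weight(tree):
    """Weight of the heaviest (bottleneck) edge of a spanning tree."""
    return max(w for w, _, _ in tree)


# Illustrative graph on 4 vertices (an assumption made for this example).
example_edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
mst = kruskal(4, example_edges)
max_tree = maximum_spanning_tree(4, example_edges)
```

On this graph the MST has total weight 6 and bottleneck weight 3, while the maximum spanning tree has total weight 12. Its lightest edge has weight 3, which is the width of the widest path between vertices 0 and 1.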
==Applications==
Minimum spanning trees have direct applications in the design of networks, including [[computer network]]s, [[telecommunications network]]s, [[transport network|transportation network]]s, [[water supply network]]s, and [[electrical grid]]s (for which they were first invented, as mentioned above).<ref>{{citation |last1=Graham |first1=R. L. |title=On the history of the minimum spanning tree problem |journal=Annals of the History of Computing |volume=7 |issue=1 |pages=43–57 |year=1985 |doi=10.1109/MAHC.1985.10011 |mr=783327 |s2cid=10555375 |last2=Hell |first2=Pavol |author1-link=Ronald Graham |author2-link=Pavol Hell}}</ref> They are invoked as subroutines in algorithms for other problems, including the [[Christofides algorithm]] for approximating the [[traveling salesman problem]],<ref>[[Nicos Christofides]], [https://apps.dtic.mil/dtic/tr/fulltext/u2/a025602.pdf Worst-case analysis of a new heuristic for the travelling salesman problem], Report 388, Graduate School of Industrial Administration, CMU, 1976.</ref> approximating the multi-terminal minimum cut problem (which is equivalent in the single-terminal case to the [[maximum flow problem]]),<ref>{{cite journal |last1=Dahlhaus |first1=E. |last2=Johnson |first2=D. S. |author2-link=David S. Johnson |last3=Papadimitriou |first3=C. H. |author3-link=Christos Papadimitriou |last4=Seymour |first4=P. D. |author4-link=Paul Seymour (mathematician) |last5=Yannakakis |first5=M.
|author5-link=Mihalis Yannakakis |date=August 1994 |title=The complexity of multiterminal cuts |url=http://akpublic.research.att.com/~dsj/papers/3way.pdf |url-status=dead |journal=[[SIAM Journal on Computing]] |volume=23 |issue=4 |pages=864–894 |doi=10.1137/S0097539792225297 |archive-url=https://web.archive.org/web/20040824184059/http://akpublic.research.att.com/~dsj/papers/3way.pdf |archive-date=24 August 2004 |access-date=17 December 2012}}</ref> and approximating the minimum-cost weighted perfect [[matching (graph theory)|matching]].<ref>{{cite conference |last1=Supowit |first1=Kenneth J. |last2=Plaisted |first2=David A. |last3=Reingold |first3=Edward M. |year=1980 |title=Heuristics for weighted perfect matching |url=http://dl.acm.org/citation.cfm?id=804689 |conference=12th Annual ACM Symposium on Theory of Computing (STOC '80) |location=New York, NY, USA |publisher=ACM |pages=398–419 |doi=10.1145/800141.804689}}</ref>

Other practical applications based on minimum spanning trees include:
* [[Taxonomy (general)|Taxonomy]].<ref>{{cite journal |last=Sneath |first=P. H. A. |date=1 August 1957 |title=The Application of Computers to Taxonomy |journal=Journal of General Microbiology |volume=17 |issue=1 |pages=201–226 |doi=10.1099/00221287-17-1-201 |pmid=13475686 |doi-access=free}}</ref>
* [[Cluster analysis]]: clustering points in the plane,<ref>{{cite conference |last1=Asano |first1=T. |author1-link=Tetsuo Asano |last2=Bhattacharya |first2=B. |last3=Keil |first3=M. |last4=Yao |first4=F. |author4-link=Frances Yao |year=1988 |title=Clustering algorithms based on minimum and maximum spanning trees |conference=Fourth Annual Symposium on Computational Geometry (SCG '88) |volume=1 |pages=252–257 |doi=10.1145/73393.73419}}</ref> [[single-linkage clustering]] (a method of [[hierarchical clustering]]),<ref>{{cite journal |last1=Gower |first1=J. C. |last2=Ross |first2=G. J. S.
|year=1969 |title=Minimum Spanning Trees and Single Linkage Cluster Analysis |journal=Journal of the Royal Statistical Society |series=C (Applied Statistics) |volume=18 |issue=1 |pages=54–64 |doi=10.2307/2346439 |jstor=2346439}}</ref> graph-theoretic clustering,<ref>{{cite journal |last=Päivinen |first=Niina |date=1 May 2005 |title=Clustering with a minimum spanning tree of scale-free-like structure |journal=Pattern Recognition Letters |volume=26 |issue=7 |pages=921–930 |bibcode=2005PaReL..26..921P |doi=10.1016/j.patrec.2004.09.039}}</ref> and clustering [[gene expression]] data.<ref>{{cite journal |last1=Xu |first1=Y. |last2=Olman |first2=V. |last3=Xu |first3=D. |date=1 April 2002 |title=Clustering gene expression data using a graph-theoretic approach: an application of minimum spanning trees |journal=Bioinformatics |volume=18 |issue=4 |pages=536–545 |doi=10.1093/bioinformatics/18.4.536 |pmid=12016051 |doi-access=free}}</ref>
* Constructing trees for [[broadcasting (networking)|broadcasting]] in computer networks.<ref>{{cite journal |last1=Dalal |first1=Yogen K. |last2=Metcalfe |first2=Robert M. |date=1 December 1978 |title=Reverse path forwarding of broadcast packets |journal=Communications of the ACM |volume=21 |issue=12 |pages=1040–1048 |doi=10.1145/359657.359665 |s2cid=5638057 |doi-access=free}}</ref>
* [[Image registration]]<ref>{{cite conference |last1=Ma |first1=B. |last2=Hero |first2=A. |last3=Gorman |first3=J. |last4=Michel |first4=O. |year=2000 |title=Image registration with minimum spanning tree algorithm |url=http://web.eecs.umich.edu/~hero/Preprints/MinimumSpanningTree.pdf |conference=International Conference on Image Processing |volume=1 |pages=481–484 |doi=10.1109/ICIP.2000.901000 |archive-url=https://ghostarchive.org/archive/20221009/http://web.eecs.umich.edu/~hero/Preprints/MinimumSpanningTree.pdf |archive-date=2022-10-09 |url-status=live}}</ref> and [[Image segmentation|segmentation]]<ref>P. Felzenszwalb, D.
Huttenlocher: Efficient Graph-Based Image Segmentation. IJCV 59(2) (September 2004)</ref> – see [[minimum spanning tree-based segmentation]].
* Curvilinear [[feature extraction]] in [[computer vision]].<ref>{{cite journal |last1=Suk |first1=Minsoo |last2=Song |first2=Ohyoung |date=1 June 1984 |title=Curvilinear feature extraction using minimum spanning trees |journal=Computer Vision, Graphics, and Image Processing |volume=26 |issue=3 |pages=400–411 |doi=10.1016/0734-189X(84)90221-4}}</ref>
* [[Handwriting recognition]] of mathematical expressions.<ref>{{cite book |last1=Tapia |first1=Ernesto |title=Graphics Recognition. Recent Advances and Perspectives |last2=Rojas |first2=Raúl |publisher=Springer-Verlag |year=2004 |isbn=978-3540224785 |series=Lecture Notes in Computer Science |volume=3088 |location=Berlin Heidelberg |pages=329–340 |chapter=Recognition of On-line Handwritten Mathematical Expressions Using a Minimum Spanning Tree Construction and Symbol Dominance |chapter-url=http://page.mi.fu-berlin.de/rojas/2003/grec03.pdf |archive-url=https://ghostarchive.org/archive/20221009/http://page.mi.fu-berlin.de/rojas/2003/grec03.pdf |archive-date=2022-10-09 |url-status=live}}</ref>
* [[Circuit design]]: implementing efficient multiple constant multiplications, as used in [[finite impulse response]] filters.<ref>{{cite conference |last1=Ohlsson |first1=H. |year=2004 |title=Implementation of low complexity FIR filters using a minimum spanning tree |conference=12th IEEE Mediterranean Electrotechnical Conference (MELECON 2004) |volume=1 |pages=261–264 |doi=10.1109/MELCON.2004.1346826}}</ref>
* [[Regionalisation]] of socio-geographic areas, the grouping of areas into homogeneous, contiguous regions.<ref>{{cite journal |last=Assunção |first=R. M. |author2=M. C. Neves |author3=G. Câmara |author4=C.
Da Costa Freitas |year=2006 |title=Efficient regionalization techniques for socio-economic geographical units using minimum spanning trees |url=https://zenodo.org/record/3832352 |journal=International Journal of Geographical Information Science |volume=20 |issue=7 |pages=797–811 |doi=10.1080/13658810600665111 |bibcode=2006IJGIS..20..797A |s2cid=2530748}}</ref>
* Comparing [[ecotoxicology]] data.<ref>{{cite journal |last1=Devillers |first1=J. |last2=Dore |first2=J.C. |date=1 April 1989 |title=Heuristic potency of the minimum spanning tree (MST) method in toxicology |journal=Ecotoxicology and Environmental Safety |volume=17 |issue=2 |pages=227–235 |doi=10.1016/0147-6513(89)90042-0 |pmid=2737116|bibcode=1989EcoES..17..227D }}</ref>
* Topological [[observability]] in power systems.<ref>{{cite journal |last1=Mori |first1=H. |last2=Tsuzuki |first2=S. |date=1 May 1991 |title=A fast method for topological observability analysis using a minimum spanning tree technique |journal=IEEE Transactions on Power Systems |volume=6 |issue=2 |pages=491–500 |bibcode=1991ITPSy...6..491M |doi=10.1109/59.76691}}</ref>
* Measuring homogeneity of two-dimensional materials.<ref>{{cite journal |last1=Filliben |first1=James J. |last2=Kafadar |first2=Karen |author2-link=Karen Kafadar |last3=Shier |first3=Douglas R. |date=1 January 1983 |title=Testing for homogeneity of two-dimensional surfaces |journal=Mathematical Modelling |volume=4 |issue=2 |pages=167–189 |doi=10.1016/0270-0255(83)90026-X |doi-access=}}</ref>
* Minimax [[process control]].<ref>{{citation |last1=Kalaba |first1=Robert E. |title=Graph Theory and Automatic Control |year=1963 |url=http://www.dtic.mil/dtic/tr/fulltext/u2/297072.pdf |archive-url=https://web.archive.org/web/20160221191747/http://www.dtic.mil/dtic/tr/fulltext/u2/297072.pdf |archive-date=February 21, 2016 |url-status=dead}}</ref>
* Minimum spanning trees can also be used to describe financial markets.<ref>Mantegna, R. N. (1999).
[[arxiv:cond-mat/9802256|Hierarchical structure in financial markets]]. The European Physical Journal B-Condensed Matter and Complex Systems, 11(1), 193–197.</ref><ref>Djauhari, M., & Gan, S. (2015). [https://www.researchgate.net/profile/Maman_Djauhari3/publication/277250327_Optimality_problem_of_network_topology_in_stocks_market_analysis/links/59ebdb7d4585151983cb7795/Optimality-problem-of-network-topology-in-stocks-market-analysis.pdf Optimality problem of network topology in stocks market analysis]. Physica A: Statistical Mechanics and Its Applications, 419, 108–114.</ref> A correlation matrix can be created by calculating the correlation coefficient between each pair of stocks. This matrix can be represented topologically as a complex network, and a minimum spanning tree can be constructed from it to visualize the relationships.

==References==
{{Reflist|30em}}

==Further reading==
* [http://citeseer.ist.psu.edu/nesetril00otakar.html Otakar Boruvka on Minimum Spanning Tree Problem (translation of both 1926 papers, comments, history) (2000)] [[Jaroslav Nešetřil]], Eva Milková, Helena Nesetrilová. (Section 7 gives his algorithm, which looks like a cross between Prim's and Kruskal's.)
* [[Thomas H. Cormen]], [[Charles E. Leiserson]], [[Ronald L. Rivest]], and [[Clifford Stein]]. ''[[Introduction to Algorithms]]'', Second Edition. MIT Press and McGraw-Hill, 2001. {{isbn|0-262-03293-7}}. Chapter 23: Minimum Spanning Trees, pp. 561–579.
* Eisner, Jason (1997). [http://www.cs.jhu.edu/~jason/papers/eisner.mst-tutorial.pdf State-of-the-art algorithms for minimum spanning trees: A tutorial discussion]. Manuscript, University of Pennsylvania, April. 78 pp.
* Kromkowski, John David. "Still Unmelted after All These Years", in Annual Editions, Race and Ethnic Relations, 17/e (2009 McGraw Hill) (Using minimum spanning tree as method of demographic analysis of ethnic diversity across the United States).
==External links==
{{commons category|Minimum spanning trees}}
* [http://www.boost.org/libs/graph/doc/table_of_contents.html Implemented in BGL, the Boost Graph Library]
* [http://www.cs.sunysb.edu/~algorith/files/minimum-spanning-tree.shtml The Stony Brook Algorithm Repository - Minimum Spanning Tree codes]
* [https://web.archive.org/web/20100317031339/http://www.codeplex.com/quickgraph Implemented in QuickGraph for .Net]

[[Category:Spanning tree]]
[[Category:Polynomial-time problems]]