Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Euclidean minimum spanning tree
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
==Computational complexity== For points in any dimension, the minimum spanning tree can be constructed in time <math>O(n^2)</math> by constructing a [[complete graph]] with an edge between every pair of points, weighted by [[Euclidean distance]], and then applying a graph minimum spanning tree algorithm such as the [[Prim's algorithm|Prim–Dijkstra–Jarník algorithm]] or [[Borůvka's algorithm]] on it. These algorithms can be made to take time <math>O(n^2)</math> on complete graphs, unlike another common choice, [[Kruskal's algorithm]], which is slower because it involves sorting all distances.{{r|eppstein}} For points in low-dimensional spaces, the problem may be solved more quickly, as detailed below. Computing Euclidean distances involves a [[square root]] calculation. In any comparison of edge weights, comparing the squares of the Euclidean distances, instead of the distances themselves, yields the same ordering, and so does not change the rest of the tree's computation. This shortcut speeds up calculation and allows a minimum spanning tree for points with integer coordinates to be constructed using only integer arithmetic.{{r|yao82}} ===Two dimensions=== A faster approach to finding the minimum spanning tree of planar points uses the property that it is a subgraph of the Delaunay triangulation: # Compute the Delaunay triangulation, which can be done in <math>O(n\log n)</math> time. Because the Delaunay triangulation is a [[planar graph]], it has at most <math>3n-6</math> edges. # Label each edge with its (squared) length. # Run a graph minimum spanning tree algorithm. Since there are <math>O(n)</math> edges, this requires <math>O(n\log n)</math> time using any of the standard minimum spanning tree algorithms. The result is an algorithm taking <math>O(n\log n)</math> time,{{r|shahoe}} optimal in certain models of computation (see [[#Lower bound|below]]). If the input coordinates are integers and can be used as [[array index|array indices]], faster algorithms are possible: the Delaunay triangulation can be constructed by a [[randomized algorithm]] in <math>O(n\log\log n)</math> expected time.{{r|bm09}} Additionally, since the Delaunay triangulation is a [[planar graph]], its minimum spanning tree can be found in [[linear time]] by a variant of Borůvka's algorithm that removes all but the cheapest edge between each pair of components after each stage of the algorithm.{{r|eppstein|mares}} Therefore, the total expected time for this algorithm is <math>O(n\log\log n)</math>.{{r|bm09}} In the other direction, the Delaunay triangulation can be constructed from the minimum spanning tree in the near-linear time bound <math>O(n\log^* n)</math>, where <math>\log^*</math> denotes the [[iterated logarithm]].{{r|devillers}} ===Higher dimensions=== The problem can also be generalized to <math>n</math> points in the <math>d</math>-dimensional space <math>\R^d</math>. In higher dimensions, the connectivity determined by the Delaunay triangulation (which, likewise, partitions the [[convex hull]] into <math>d</math>-dimensional [[simplex|simplices]]) contains the minimum spanning tree; however, the triangulation might contain the complete graph.{{r|aesw}} Therefore, finding the Euclidean minimum spanning tree as a spanning tree of the complete graph or as a spanning tree of the Delaunay triangulation both take <math>O(dn^2)</math> time. For three dimensions the minimum spanning tree can be found in time <math>O\bigl((n\log n)^{4/3}\bigr)</math>, and in any greater dimension, in time <math display=block>O\left(n^{2-\frac{2}{\lceil d/2\rceil+1}+\varepsilon}\right)</math> for any <math>\varepsilon>0</math>—faster than the quadratic time bound for the complete graph and Delaunay triangulation algorithms.{{r|aesw}} The optimal time complexity for higher-dimensional minimum spanning trees remains unknown,{{r|topp}} but is closely related to the complexity of computing ''bichromatic closest pairs''. In the bichromatic closest pair problem, the input is a set of points, given two different colors (say, red and blue). The output is a pair of a red point and a blue point with the minimum possible distance. This pair always forms one of the edges in the minimum spanning tree. Therefore, the bichromatic closest pair problem can be solved in the amount of time that it takes to construct a minimum spanning tree and scan its edges for the shortest red–blue edge. Conversely, for any red–blue coloring of any subset of a given set of points, the bichromatic closest pair produces one edge of the minimum spanning tree of the subset. By carefully choosing a sequence of colorings of subsets, and finding the bichromatic closest pair of each subproblem, the minimum spanning tree may be found in time proportional to the optimal time for finding bichromatic closest pairs for the same number of points, whatever that optimal time turns out to be.{{r|aesw|kln}} For uniformly random point sets in any bounded dimension, the [[Yao graph]]{{r|yao82}} or Delaunay triangulation have linear expected numbers of edges, are guaranteed to contain the minimum spanning tree, and can be constructed in linear expected time.{{r|bwy|clarkson|dwyer}} From these graphs, the minimum spanning tree itself may be constructed in linear time, by using a [[Expected linear time MST algorithm|randomized linear time algorithm for graph minimum spanning trees]].{{r|kkt}} However, the poor performance of these methods on inputs coming from clustered data has led [[algorithm engineering]] researchers to develop methods with a somewhat slower <math>O(n\log n)</math> time bound, for random inputs or inputs whose distances and clustering resemble those of random data, while exhibiting better performance on real-world data.{{r|nzz|cck|mrg}} A [[well-separated pair decomposition]] is a family of pairs of subsets of the given points, so that every pair of points belong to one of these pairs of subsets, and so that all pairs of points coming from the same pair of subsets have approximately the same length. It is possible to find a well-separated pair decomposition with a linear number of subsets, and a representative pair of points for each subset, in time <math>O(n\log n)</math>. The minimum spanning tree of the graph formed by these representative pairs is then an approximation to the minimum spanning tree. Using these ideas, a <math>(1+\varepsilon)</math>-approximation to the minimum spanning tree may be found in <math>O(n\log n)</math> time, for constant <math>\varepsilon</math>. More precisely, by choosing each representative pair to approximate the closest pair in its equivalence class, and carefully varying the quality of this approximation for different pairs, the dependence on <math>\varepsilon</math> in the time bound can be given as <math>O(n \log n + (\varepsilon^{-2} \log ^2 \tfrac{1}{\varepsilon})n),</math> for any fixed dimension.{{r|arymou}} ===Dynamic and kinetic=== The Euclidean minimum spanning tree has been generalized in many different ways to systems of moving or changing points: *If a set of points undergoes a sequence of dynamic insertions or deletions of points, each of these updates induces a bounded amount of change to the minimum spanning tree of the points. When the update sequence is known in advance, for points in the plane, the change after each insertion or deletion can be found in time <math>O(\log^2 n)</math> per insertion or deletion.{{r|offline}} When the updates must be handled in an [[Online algorithm|online]] manner, a slower (but still poly-logarithmic) <math>O(\log^{10} n)</math> time bound is known.{{r|chadyn}} For higher-dimensional versions of the problem the time per update is slower, but still sublinear.{{r|extrema}} *For <math>n</math> points moving linearly with constant speed, or with more general algebraic motions, the minimum spanning tree will change by a sequence of swaps, in which one edge is removed and another replaces it at a point in time where both have equal length.{{r|kti}} For linear motions, the number of changes is at most slightly larger than <math>n^{25/9}</math>.{{r|chalev}} For more general algebraic motions, there is a near-cubic upper bound on the number of swaps, based on the theory of [[Davenport–Schinzel sequence]]s.{{r|rzcomb}} *The ''minimum moving spanning tree problem'' again concerns points moving linearly with constant speed, over an interval of time, and seeks a single tree that minimizes the maximum sum of weights occurring at any instant during this interval. It is [[NP-hard]] to compute exactly, but can be approximated to within a factor of two in polynomial time.{{r|abb}} *The [[kinetic Euclidean minimum spanning tree]] problem asks for a [[kinetic data structure]] that can maintain the minimum spanning tree as its points undergo both continuous motions and insertions and deletions. Several papers have studied such structures,{{r|bgz|aegh|rzkin|rakwz|msvw}} and a kinetic structure for algebraically moving points with near-cubic total time, nearly matching the bound on the number of swaps, is known.{{r|rakwz}} ===Lower bound=== An asymptotic lower bound of <math>\Omega(n\log n)</math> of the Euclidean minimum spanning tree problem can be established in restricted models of computation. These include the [[algebraic decision tree]] and [[algebraic computation tree]] models, in which the algorithm has access to the input points only through certain restricted primitives that perform simple algebraic computations on their coordinates. In these models, the [[closest pair of points problem]] requires <math>\Omega(n\log n)</math> time, but the closest pair is necessarily an edge of the minimum spanning tree, so the minimum spanning tree also requires this much time. Therefore, algorithms for constructing the planar minimum spanning tree in time <math>O(n\log n)</math> within this model, for instance by using the Delaunay triangulation, are optimal.{{r|yao89}} However, these lower bounds do not apply to models of computation with integer point coordinates, in which [[bitwise operation]]s and [[Random access|table indexing]] operations on those coordinates are permitted. In these models, faster algorithms are possible, as described above.{{r|bm09}}
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)