Disjoint-set data structure
=== Merging two sets ===
[[File:Dsu disjoint sets init.svg|thumb|360px|<code>MakeSet</code> creates 8 singletons.]]
[[File:Dsu disjoint sets final.svg|thumb|360px|After some operations of <code>Union</code>, some sets are grouped together.]]

The operation <code>Union(''x'', ''y'')</code> replaces the set containing {{mvar|x}} and the set containing {{mvar|y}} with their union. <code>Union</code> first uses <code>Find</code> to determine the roots of the trees containing {{mvar|x}} and {{mvar|y}}. If the roots are the same, there is nothing more to do. Otherwise, the two trees must be merged. This is done by either setting the parent pointer of {{mvar|x}}'s root to {{mvar|y}}'s, or setting the parent pointer of {{mvar|y}}'s root to {{mvar|x}}'s.

The choice of which node becomes the parent has consequences for the complexity of future operations on the tree. If it is done carelessly, trees can become excessively tall. For example, suppose that <code>Union</code> always made the tree containing {{mvar|x}} a subtree of the tree containing {{mvar|y}}. Begin with a forest that has just been initialized with elements <math>1, 2, 3, \ldots, n,</math> and execute <code>{{math|Union(1, 2)}}</code>, <code>{{math|Union(2, 3)}}</code>, ..., <code>{{math|Union(''n'' - 1, ''n'')}}</code>. The resulting forest contains a single tree whose root is {{mvar|n}}, and the path from 1 to {{mvar|n}} passes through every node in the tree. For this forest, the time to run <code>Find(1)</code> is {{math|''O''(''n'')}}.

In an efficient implementation, tree height is controlled using '''union by size''' or '''union by rank'''. Both of these require a node to store information besides just its parent pointer. This information is used to decide which root becomes the new parent. Both strategies ensure that trees do not become too deep.

====Union by size====
In the case of union by size, a node stores its size, which is simply its number of descendants (including the node itself).
When the trees with roots {{mvar|x}} and {{mvar|y}} are merged, the node with more descendants becomes the parent. If the two nodes have the same number of descendants, then either one can become the parent. In both cases, the size of the new parent node is set to its new total number of descendants.

 '''function''' Union(''x'', ''y'') '''is'''
     ''// Replace nodes by roots''
     ''x'' := Find(''x'')
     ''y'' := Find(''y'')
     '''if''' ''x'' = ''y'' '''then'''
         '''return'''  ''// x and y are already in the same set''
     '''end if'''
     ''// If necessary, swap variables to ensure that''
     ''// x has at least as many descendants as y''
     '''if''' ''x''.size < ''y''.size '''then'''
         (''x'', ''y'') := (''y'', ''x'')
     '''end if'''
     ''// Make x the new root''
     ''y''.parent := ''x''
     ''// Update the size of x''
     ''x''.size := ''x''.size + ''y''.size
 '''end function'''

The size of a node never exceeds {{mvar|n}}, so the number of bits necessary to store the size is the number of bits necessary to store {{mvar|n}}. This adds a constant factor to the forest's required storage.

====Union by rank====
For union by rank, a node stores its {{em|rank}}, which is an upper bound for its height. When a node is initialized, its rank is set to zero. To merge trees with roots {{mvar|x}} and {{mvar|y}}, first compare their ranks. If the ranks are different, then the root with the larger rank becomes the parent, and the ranks of {{mvar|x}} and {{mvar|y}} do not change. If the ranks are the same, then either one can become the parent, but the new parent's rank is incremented by one. While the rank of a node is related to its height, storing ranks is more efficient than storing heights. The height of a node can change during a <code>Find</code> operation, so storing ranks avoids the extra effort of keeping the height correct.
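The remark that heights can change during <code>Find</code> can be illustrated with a small Python sketch. It assumes a <code>Find</code> that performs path compression, and it builds by hand a four-node tree of the shape that union by rank could produce. The names <code>Node</code>, <code>find</code>, <code>depth</code> and <code>tree_height</code> are illustrative only and are not part of the pseudocode in this section.

```python
class Node:
    """Forest node that stores a rank but never stores its height."""

    def __init__(self, name):
        self.name = name
        self.parent = self  # a root is its own parent
        self.rank = 0


def find(x):
    """Return the root of x's tree, compressing the path (assumed variant of Find)."""
    if x.parent is not x:
        x.parent = find(x.parent)
    return x.parent


def depth(x):
    """Number of parent links between x and its root."""
    d = 0
    while x.parent is not x:
        x = x.parent
        d += 1
    return d


def tree_height(nodes):
    """Height of the tree, assuming all given nodes belong to one tree."""
    return max(depth(v) for v in nodes)


# Hand-built tree: a is the root, b and c are children of a, d is a child of b.
a, b, c, d = (Node(name) for name in "abcd")
c.parent = a
d.parent = b
b.parent = a
a.rank, b.rank = 2, 1  # the ranks union by rank would have assigned

nodes = [a, b, c, d]
height_before = tree_height(nodes)  # 2: the path d -> b -> a
root = find(d)                      # compression re-points d directly at a
height_after = tree_height(nodes)   # 1: every non-root node is a child of a

print(height_before, height_after, a.rank)  # prints "2 1 2"
```

After the call, the true height of the root has dropped from 2 to 1 while its stored rank is still 2. The rank remains a valid upper bound without any update, whereas a stored height would have had to be recomputed.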
In pseudocode, union by rank is:

 '''function''' Union(''x'', ''y'') '''is'''
     ''// Replace nodes by roots''
     ''x'' := Find(''x'')
     ''y'' := Find(''y'')
     '''if''' ''x'' = ''y'' '''then'''
         '''return'''  ''// x and y are already in the same set''
     '''end if'''
     ''// If necessary, rename variables to ensure that''
     ''// x has rank at least as large as that of y''
     '''if''' ''x''.rank < ''y''.rank '''then'''
         (''x'', ''y'') := (''y'', ''x'')
     '''end if'''
     ''// Make x the new root''
     ''y''.parent := ''x''
     ''// If necessary, increment the rank of x''
     '''if''' ''x''.rank = ''y''.rank '''then'''
         ''x''.rank := ''x''.rank + 1
     '''end if'''
 '''end function'''

It can be shown that every node has rank <math>\lfloor \log_2 n \rfloor</math> or less.<ref name="Cormen2009"/> Consequently each rank can be stored in {{math|''O''(log log ''n'')}} bits and all the ranks can be stored in {{math|''O''(''n'' log log ''n'')}} bits. This makes the ranks an asymptotically negligible portion of the forest's size.

As the above implementations show, the size and rank of a node matter only while the node is the root of a tree. Once a node becomes a child, its size and rank are never accessed again.
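The two strategies, and the careless merge described at the start of this section, can be compared in a short Python sketch. It makes several assumptions that are not in the pseudocode: elements are the integers 0 to ''n'' − 1 stored in flat lists, <code>find</code> does no path compression (so that depths stay observable), and the class name <code>DisjointSet</code>, the <code>strategy</code> parameter and the <code>depth</code> helper are invented for the illustration.

```python
class DisjointSet:
    """Disjoint-set forest over 0..n-1 with a selectable merge strategy."""

    def __init__(self, n, strategy):
        self.parent = list(range(n))  # every element starts as its own root
        self.size = [1] * n           # meaningful only at roots
        self.rank = [0] * n           # meaningful only at roots
        self.strategy = strategy      # "size", "rank" or "naive"

    def find(self, x):
        # Plain walk to the root; no path compression in this sketch.
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, x, y):
        # Replace nodes by roots.
        x, y = self.find(x), self.find(y)
        if x == y:
            return  # already in the same set
        if self.strategy == "size":
            # Ensure x has at least as many descendants as y.
            if self.size[x] < self.size[y]:
                x, y = y, x
        elif self.strategy == "rank":
            # Ensure x has rank at least as large as that of y.
            if self.rank[x] < self.rank[y]:
                x, y = y, x
            if self.rank[x] == self.rank[y]:
                self.rank[x] += 1
        else:
            # Naive: the tree containing x always goes under y's root.
            x, y = y, x
        self.parent[y] = x
        # Size is kept up to date for every strategy; only "size" reads it.
        self.size[x] += self.size[y]

    def depth(self, x):
        """Number of parent links between x and its root."""
        d = 0
        while self.parent[x] != x:
            x = self.parent[x]
            d += 1
        return d


n = 8
max_depth = {}
for strategy in ("naive", "size", "rank"):
    ds = DisjointSet(n, strategy)
    for i in range(n - 1):
        ds.union(i, i + 1)  # the sequence Union(1, 2), ..., Union(n - 1, n), zero-based
    max_depth[strategy] = max(ds.depth(i) for i in range(n))

print(max_depth)  # prints {'naive': 7, 'size': 1, 'rank': 1}
```

On this input the naive merge produces a chain of depth ''n'' − 1, while both union by size and union by rank keep every node directly under the root. Other union sequences give deeper trees under the two controlled strategies, but never deeper than the logarithmic bound.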