Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Van Emde Boas tree
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
{{Short description|Tree data structure}} {{tone|date=May 2021}} {{Infobox data structure | name = van Emde Boas tree | type = Non-binary [[tree data structure|tree]] | invented_by = [[Peter van Emde Boas]] | invented_year = 1975 | space_avg = <math>O(M)</math> | space_worst = <math>O(M)</math> | search_avg = <math>O(\log \log M)</math> | search_worst = <math>O(\log \log M)</math> | insert_avg = <math>O(\log \log M)</math> | insert_worst = <math>O(\log \log M)</math> | delete_avg = <math>O(\log \log M)</math> | delete_worst = <math>O(\log \log M)</math> }} A '''van Emde Boas tree''' ({{IPA|nl|vΙn ΛΙmdΙ ΛboΛΙs}}), also known as a '''vEB tree''' or '''van Emde Boas priority queue''', is a [[tree data structure]] which implements an [[associative array]] with {{math|''m''}}-bit integer keys. It was invented by a team led by Dutch computer scientist [[Peter van Emde Boas]] in 1975.<ref>[[Peter van Emde Boas]]: ''Preserving order in a forest in less than logarithmic time'' (''Proceedings of the 16th Annual Symposium on Foundations of Computer Science'' 10: 75-84, 1975)</ref> It performs all operations in {{math|[[Big-O notation|''O'']](log ''m'')}} time (assuming that an <math>m</math> bit operation can be performed in constant time), or equivalently in <math>O(\log \log M)</math> time, where <math>M = 2^m</math> is the largest element that can be stored in the tree. The parameter <math>M</math> is not to be confused with the actual number of elements stored in the tree, by which the performance of other tree data-structures is often measured. The standard vEB tree has inadequate space efficiency. For example, for storing 32-bit integers (i.e., when <math>m = 32</math>), it requires <math>M = 2^{32}</math> bits of storage. However, similar data structures with equally good time efficiency and space efficiency of <math>O(n)</math> (where <math>n</math> is the number of stored elements) exist, and vEB trees can be modified to require only <math>O(n \log M)</math> space. ==Supported operations== A vEB supports the operations of an ''ordered [[associative array]]'', which includes the usual associative array operations along with two more ''order'' operations, ''FindNext'' and ''FindPrevious'':<ref>[[Gudmund Skovbjerg Frandsen]]: ''[http://www.daimi.au.dk/~gudmund/dynamicF04/vEB.pdf Dynamic algorithms: Course notes on van Emde Boas trees (PDF)]'' ([[University of Aarhus]], Department of Computer Science) {{dead link|date=April 2025}}</ref> * ''Insert'': insert a key/value pair with an {{math|''m''}}-bit key * ''Delete'': remove the key/value pair with a given key * ''Lookup'': find the value associated with a given key * ''FindNext'': find the key/value pair with the smallest key which is greater than a given {{math|''k''}} * ''FindPrevious'': find the key/value pair with the largest key which is smaller than a given {{math|''k''}} A vEB tree also supports the operations ''Minimum'' and ''Maximum'', which return the minimum and maximum element stored in the tree respectively.<ref> [[Thomas H. Cormen]], [[Charles E. Leiserson]], [[Ronald L. Rivest]], and [[Clifford Stein]]. ''[[Introduction to Algorithms]]'', Third Edition. [[MIT Press]], 2009. {{ISBN|978-0-262-53305-8}}. Chapter 20: The van Emde Boas tree, pp. 531β560. </ref> These both run in <math>O(1)</math> time, since the minimum and maximum element are stored as attributes in each tree. ==Function== [[Image:VebDiagram.svg|thumb |alt=Example van Emde Boas tree |An example van Emde Boas tree with dimension 5 and the root's aux structure after 1, 2, 3, 5, 8 and 10 have been inserted.]] Let <math>\log_2 m = k</math> for some integer <math>k</math>. Define <math>M = 2^m</math>. A vEB tree {{mono|T}} over the universe <math>\{0, \ldots, M - 1\}</math> has a root node that stores an array {{mono|T.children}} of length <math>\sqrt{M}</math>. {{mono|T.children[i]}} is a pointer to a vEB tree that is responsible for the values <math>\{i \sqrt{M}, \ldots, (i + 1) \sqrt{M} - 1\}</math>. Additionally, ''T'' stores two values {{mono|T.min}} and {{mono|T.max}} as well as an auxiliary vEB tree {{mono|T.aux}}. Data is stored in a vEB tree as follows: The smallest value currently in the tree is stored in {{mono|T.min}} and largest value is stored in {{mono|T.max}}. Note that {{mono|T.min}} is not stored anywhere else in the vEB tree, while {{mono|T.max}} is. If ''T'' is empty then we use the convention that {{mono|1=T.max=β1}} and {{mono|1=T.min=M}}. Any other value <math>x</math> is stored in the subtree {{mono|T.children[i]}} where <math>i = \lfloor x / \sqrt{M} \rfloor</math>. The auxiliary tree {{mono|T.aux}} keeps track of which children are non-empty, so {{mono|T.aux}} contains the value <math>j</math> if and only if {{mono|T.children[j]}} is non-empty. ===FindNext=== The operation {{mono|FindNext(T, x)}} that searches for the successor of an element ''x'' in a vEB tree proceeds as follows: If {{mono|''x''<T.min}} then the search is complete, and the answer is {{mono|T.min}}. If {{mono|xβ₯T.max}} then the next element does not exist, return M. Otherwise, let <math>i = \lfloor x / \sqrt{M} \rfloor</math>. If {{mono|x < T.children[i].max}}, then the value being searched for is contained in {{mono|T.children[i]}}, so the search proceeds recursively in {{mono|T.children[i]}}. Otherwise, we search for the successor of the value ''i'' in {{mono|T.aux}}. This gives us the index ''j'' of the first subtree that contains an element larger than ''x''. The algorithm then returns {{mono|T.children[j].min}}. The element found on the children level needs to be composed with the high bits to form a complete next element. '''function''' FindNext(T, x) '''if''' x < T.min '''then''' '''return''' T.min '''if''' x β₯ T.max '''then''' ''// no next element'' '''return''' M i = floor(x/<math>\sqrt{M}</math>) lo = x mod <math>\sqrt{M}</math> '''if''' lo < T.children[i].max '''then''' '''return''' (<math>\sqrt{M}</math> i) + FindNext(T.children[i], lo) j = FindNext(T.aux, i) '''return''' (<math>\sqrt{M}</math> j) + T.children[j].min '''end''' Note that, in any case, the algorithm performs <math>O(1)</math> work and then possibly recurses on a subtree over a universe of size <math>M^{1/2}</math> (an <math>m/2</math> bit universe). This gives a recurrence for the running time of <math>T(m) = T(m/2) + O(1)</math>, which resolves to <math>O(\log m) = O(\log \log M)</math>. ===Insert=== The call {{mono|insert(T, x)}} that inserts a value {{mono|x}} into a vEB tree {{mono|T}} operates as follows: # If ''T'' is empty then we set {{mono|1=T.min = T.max = x}} and we are done. # Otherwise, if {{mono|x<T.min}} then we insert {{mono|T.min}} into the subtree {{mono|i}} responsible for {{mono|T.min}} and then set {{mono|1=T.min = x}}. If {{mono|T.children[i]}} was previously empty, then we also insert {{mono|i}} into {{mono|T.aux}} # Otherwise, if {{mono|x>T.max}} then we insert {{mono|x}} into the subtree {{mono|i}} responsible for {{mono|x}} and then set {{mono|1=T.max = x}}. If {{mono|T.children[i]}} was previously empty, then we also insert {{mono|i}} into {{mono|T.aux}} # Otherwise, {{mono|T.min< x < T.max}} so we insert {{mono|x}} into the subtree {{mono|i}} responsible for {{mono|x}}. If {{mono|T.children[i]}} was previously empty, then we also insert {{mono|i}} into {{mono|T.aux}}. In code: '''function''' Insert(T, x) '''if''' T.min == x || T.max == x '''then''' ''// x is already inserted'' '''return''' '''if''' T.min > T.max '''then''' ''// T is empty'' T.min = T.max = x; '''return''' '''if''' x < T.min '''then''' swap(x, T.min) '''if''' x > T.max '''then''' T.max = x i = floor(x / <math>\sqrt{M}</math>) lo = x mod <math>\sqrt{M}</math> Insert(T.children[i], lo) '''if''' T.children[i].min == T.children[i].max '''then''' Insert(T.aux, i) '''end''' The key to the efficiency of this procedure is that inserting an element into an empty vEB tree takes {{math|''O''(1)}} time. So, even though the algorithm sometimes makes two recursive calls, this only occurs when the first recursive call was into an empty subtree. This gives the same running time recurrence of {{tmath|1=T(m)=T(m/2) + O(1)}} as before. ===Delete=== Deletion from vEB trees is the trickiest of the operations. The call {{mono|Delete(T, x)}} that deletes a value ''x'' from a vEB tree T operates as follows: # If {{mono|1=T.min = T.max = x}} then ''x'' is the only element stored in the tree and we set {{mono|1=T.min = M}} and {{mono|1=T.max = β1}} to indicate that the tree is empty. # Otherwise, if {{mono|1=x == T.min}} then we need to find the second-smallest value ''y'' in the vEB tree, delete it from its current location, and set {{mono|1=T.min=y}}. The second-smallest value ''y'' is {{mono|T.children[T.aux.min].min}}, so it can be found in {{math|''O''(1)}} time. We delete ''y'' from the subtree that contains it. # If {{mono|xβ T.min}} and {{mono|xβ T.max}} then we delete x from the subtree {{mono|T.children[i]}} that contains ''x''. # If {{mono|1=x == T.max}} then we will need to find the second-largest value ''y'' in the vEB tree and set {{mono|1=T.max=y}}. We start by deleting x as in previous case. Then value ''y'' is either {{mono|T.min}} or {{mono|T.children[T.aux.max].max}}, so it can be found in {{math|''O''(1)}} time. # In any of the above cases, if we delete the last element ''x'' or ''y'' from any subtree {{mono|T.children[i]}} then we also delete ''i'' from {{mono|T.aux}}. In code: '''function''' Delete(T, x) '''if''' T.min == T.max == x '''then''' T.min = M T.max = β1 '''return''' '''if''' x == T.min '''then''' hi = T.aux.min * <math>\sqrt{M}</math> j = T.aux.min T.min = x = hi + T.children[j].min i = floor(x / <math>\sqrt{M}</math>) lo = x mod <math>\sqrt{M}</math> Delete(T.children[i], lo) '''if''' T.children[i] is empty '''then''' Delete(T.aux, i) '''if''' x == T.max '''then''' '''if''' T.aux is empty '''then''' T.max = T.min '''else''' hi = T.aux.max * <math>\sqrt{M}</math> j = T.aux.max T.max = hi + T.children[j].max '''end''' Again, the efficiency of this procedure hinges on the fact that deleting from a vEB tree that contains only one element takes only constant time. In particular, the second Delete call only executes if ''x'' was the only element in {{mono|T.children[i]}} prior to the deletion. ===In practice=== The assumption that {{math|log ''m''}} is an integer is unnecessary. The operations <math>x\sqrt{M}</math> and <math>x\bmod\sqrt{M}</math> can be replaced by taking only higher-order {{math|β''m''/2β}} and the lower-order {{math|β''m''/2β}} bits of {{mvar|x}}, respectively. On any existing machine, this is more efficient than division or remainder computations. In practical implementations, especially on machines with ''shift-by-k'' and ''find first zero'' instructions, performance can further be improved by switching to a [[bit array]] once {{mvar|m}} equal to the [[Word (data type)|word size]] (or a small multiple thereof) is reached. Since all operations on a single word are constant time, this does not affect the asymptotic performance, but it does avoid the majority of the pointer storage and several pointer dereferences, achieving a significant practical savings in time and space with this trick. An optimization of vEB trees is to discard empty subtrees. This makes vEB trees quite compact when they contain many elements, because no subtrees are created until something needs to be added to them. Initially, each element added creates about {{math|log(''m'')}} new trees containing about {{math|''m''/2}} pointers all together. As the tree grows, more and more subtrees are reused, especially the larger ones. <!--In a full tree of {{math|''M''}} elements, only {{math|O(''M'')}} space is used. Moreover, unlike a binary search tree, most of this space is being used to store data: even for billions of elements, the pointers in a full vEB tree number in the thousands.--> The implementation described above uses pointers and occupies a total space of {{math|''O''(''M'') {{=}} ''O''(2<sup>''m''</sup>)}}, proportional to the size of the key universe. This can be seen as follows. The recurrence is <math> S(M) = O( \sqrt{M}) + (\sqrt{M}+1) \cdot S(\sqrt{M}) </math>. Resolving that would lead to <math> S(M) \in (1 + \sqrt{M})^{\log \log M} + \log \log M \cdot O( \sqrt{M} )</math>. One can, fortunately, also show that {{math|''S''(''M'') {{=}} ''M''β2}} by induction.<ref>{{cite web|last=Rex|first=A|title=Determining the space complexity of van Emde Boas trees|url=https://mathoverflow.net/q/2245 |access-date=27 May 2011}}</ref> ==Similar structures== The {{math|''O''(''M'')}} space usage of vEB trees is an enormous overhead unless a large fraction of the universe of keys is being stored. This is one reason why vEB trees are not popular in practice. This limitation can be addressed by changing the array used to store children to another data structure. One possibility is to use only a fixed number of bits per level, which results in a [[trie]]. Alternatively, each array may be replaced by a [[hash table]], reducing the space to {{math|''O''(''n'' log log ''M'')}} (where {{mvar|n}} is the number of elements stored in the data structure) at the expense of making the data structure randomized. [[x-fast trie]]s and the more complicated [[y-fast trie]]s have comparable update and query times to vEB trees and use randomized hash tables to reduce the space used. x-fast tries use {{math|''O''(''n'' log ''M'')}} space while y-fast tries use {{math|''O''(''n'')}} space. [[Fusion tree|Fusion trees]] are another type of tree data structure that implements an [[associative array]] on w-bit integers on a finite universe. They use word-level parallelism and bit manipulation techniques to achieve {{math|''O''(log<sub>''w''</sub> ''n'')}} time for [[Predecessor problem|predecessor/successor queries]] and updates, where {{math|''w''}} is the word size.<ref>{{Cite web |date=4 April 2019 |title=Fusion Tree |url=https://iq.opengenus.org/fusion-tree/ |access-date=30 August 2023 |website=OpenGenus IQ: Computing Expertise & Legacy |language=en}}</ref> Fusion trees use {{math|''O''(''n'')}} space and can be made dynamic with hashing or exponential trees. ==Implementations== There is a verified implementation in [[Isabelle (proof assistant)]].<ref>{{cite web |last1=Ammer |first1=Thomas |last2=Lammich |first2=Peter |title=van Emde Boas Trees |url=https://www.isa-afp.org/entries/Van_Emde_Boas_Trees.html |website=Archive of Formal Proofs |date=23 November 2021 |access-date=26 November 2021}}</ref> Both functional correctness and time bounds are proved. Efficient imperative [[Standard ML]] code can be generated. ==See also== * [[Integer sorting]] * [[Tango tree]] ==References== {{reflist}} ===Further reading=== {{refbegin}} * Erik Demaine, Sam Fingeret, Shravas Rao, Paul Christiano. Massachusetts Institute of Technology. 6.851: Advanced Data Structures (Spring 2012). [http://courses.csail.mit.edu/6.851/spring12/scribe/L11.pdf Lecture 11 notes]. March 22, 2012. * {{cite journal |author-link1=Peter van Emde Boas |last1=van Emde Boas |first1=P. |last2=Kaas |first2=R. |last3=Zijlstra |first3=E. |year=1976 |title=Design and implementation of an efficient priority queue |journal=[[Mathematical Systems Theory]] |volume=10 |pages=99β127 |doi=10.1007/BF01683268}} {{refend}} {{CS trees}} {{Authority control}} [[Category:Computer science articles needing expert attention]] [[Category:Priority queues]] [[Category:Search trees]]
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)
Pages transcluded onto the current version of this page
(
help
)
:
Template:Ambox
(
edit
)
Template:Authority control
(
edit
)
Template:CS trees
(
edit
)
Template:Cite journal
(
edit
)
Template:Cite web
(
edit
)
Template:DMCA
(
edit
)
Template:Dead link
(
edit
)
Template:IPA
(
edit
)
Template:ISBN
(
edit
)
Template:Infobox data structure
(
edit
)
Template:Math
(
edit
)
Template:Mono
(
edit
)
Template:Mvar
(
edit
)
Template:Refbegin
(
edit
)
Template:Refend
(
edit
)
Template:Reflist
(
edit
)
Template:Short description
(
edit
)
Template:Tmath
(
edit
)
Template:Tone
(
edit
)