Scapegoat tree
===Insertion===
Insertion follows the same basic approach as insertion into an [[Binary search tree#Insertion|unbalanced binary search tree]], with a few significant changes.

When finding the insertion point, the depth of the new node must also be recorded. This is implemented with a simple counter that is incremented during each iteration of the lookup, effectively counting the number of edges between the root and the inserted node. If this node violates the α-height-balance property (defined above), a rebalance is required.

To rebalance, an entire subtree rooted at a '''scapegoat''' undergoes a balancing operation. The scapegoat is defined as an ancestor of the inserted node that is not α-weight-balanced. There will always be at least one such ancestor. Rebalancing any of them will restore the α-height-balanced property.

One way of finding a scapegoat is to climb from the new node back up to the root and select the first node that is not α-weight-balanced.

Climbing back up to the root requires <math>O(\log n)</math> storage space, usually allocated on the stack, or parent pointers. This can be avoided by pointing each child at its parent on the way down and repairing the pointers on the walk back up.

To determine whether a candidate node is a viable scapegoat, its α-weight-balanced property must be checked. By definition, a node is α-weight-balanced when:

 size(left) ≤ α*size(node)
 size(right) ≤ α*size(node)

However, a large optimisation is possible: during the climb two of the three sizes are already known, leaving only the third to be calculated. When climbing back up to the root:

 size(parent) = size(node) + size(sibling) + 1

The climb starts at the inserted node, for which:

 size(inserted node) = 1

so at every step the size of the current node is already known from the previous step, and the case reduces to:

 size[x+1] = size[x] + size(sibling) + 1

where x is the current node, x + 1 is its parent, and size(sibling) is the only function call actually required.
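The scapegoat search described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the <code>Node</code> class, the helper names, and the choice to pass the root-to-leaf path as a list (rather than using parent pointers) are all assumptions made for the example. The depth test assumes the α-height-balance bound is depth ≤ ⌊log<sub>1/α</sub>(n)⌋, where n is the number of nodes after the insertion.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    # Hypothetical minimal node type: no parent pointer, no cached size.
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def size(node: Optional[Node]) -> int:
    """Count the nodes in a subtree by full traversal (O(size) time)."""
    if node is None:
        return 0
    return 1 + size(node.left) + size(node.right)


def violates_height_balance(depth: int, n: int, alpha: float) -> bool:
    """Assumed depth test: True if depth exceeds floor(log_{1/alpha}(n))."""
    return depth > math.floor(math.log(n, 1.0 / alpha))


def find_scapegoat(path: List[Node], alpha: float) -> Tuple[Optional[Node], int]:
    """Climb from the new node towards the root along `path`.

    `path` lists the nodes from the root down to the newly inserted node.
    Returns the first ancestor that is not alpha-weight-balanced together
    with its index in `path`, or (None, -1) if every ancestor is balanced.
    """
    child_size = 1  # size(inserted node) = 1
    for i in range(len(path) - 2, -1, -1):
        parent, child = path[i], path[i + 1]
        sibling = parent.right if parent.left is child else parent.left
        sibling_size = size(sibling)  # the only traversal needed per step
        parent_size = child_size + sibling_size + 1
        if max(child_size, sibling_size) > alpha * parent_size:
            return parent, i
        child_size = parent_size  # size[x+1] becomes size[x] for the next step
    return None, -1
```

Passing the path as a list corresponds to the <math>O(\log n)</math> stack storage mentioned above; an implementation with parent pointers would follow those pointers instead of indexing into the list.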
Once the scapegoat is found, the subtree rooted at the scapegoat is completely rebuilt to be perfectly balanced.<ref name=galperin_rivest/> This can be done in <math>O(n)</math> time by traversing the nodes of the subtree to obtain their values in sorted order and recursively choosing the median as the root of each subtree.

Because a rebalance operation takes <math>O(n)</math> time (where <math>n</math> is the number of nodes in the rebuilt subtree), insertion has a worst-case performance of <math>O(n)</math> time. However, because these worst-case scenarios are spread out, insertion takes <math>O(\log n)</math> amortized time.

====Sketch of proof for cost of insertion====
Define the imbalance of a node ''v'' to be the absolute value of the difference between the sizes of its left and right subtrees, minus 1, or 0, whichever is greater. In other words:

<math>I(v) = \operatorname{max}(|\operatorname{left}(v) - \operatorname{right}(v)| - 1, 0) </math>

Immediately after rebuilding a subtree rooted at ''v'', I(''v'') = 0.

'''Lemma:''' Immediately before rebuilding the subtree rooted at ''v'', <br />
<math>I(v) \in \Omega (|v|) </math><br />
(<math>\Omega </math> is [[Big Omega notation]].)

Proof of lemma:

Let <math>v_0</math> be the root of a subtree immediately after rebuilding. <math>h(v_0) = \log(|v_0| + 1) </math>. If there are <math>\Omega (|v_0|)</math> degenerate insertions (that is, where each inserted node increases the height by 1), then <br />
<math>I(v) \in \Omega (|v_0|) </math>,<br />
<math>h(v) = h(v_0) + \Omega (|v_0|) </math> and<br />
<math>\log(|v|) \le \log(|v_0| + 1) + 1 </math>.

Since <math>I(v) \in \Omega (|v|)</math> before rebuilding, there were <math>\Omega (|v|)</math> insertions into the subtree rooted at <math>v</math> that did not result in rebuilding. Each of these insertions can be performed in <math>O(\log n)</math> time. The final insertion that causes rebuilding costs <math>O(|v|)</math>.
Using [[aggregate analysis]] it becomes clear that the amortized cost of an insertion is <math>O(\log n)</math>: <math>{\Omega (|v|) O(\log n) + O(|v|) \over \Omega (|v|)} = O(\log n) </math>
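
The <math>O(|v|)</math> rebuild charged to the final insertion in this analysis can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the <code>Node</code> class and the names <code>flatten</code>, <code>build_balanced</code> and <code>rebuild</code> are hypothetical, and the sketch uses an auxiliary list of <math>O(|v|)</math> extra space for simplicity instead of an in-place scheme.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical minimal node type for the example.
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def flatten(node: Optional[Node], out: List[Node]) -> None:
    """Append the nodes of the subtree to `out` in sorted (in-order) order."""
    if node is not None:
        flatten(node.left, out)
        out.append(node)
        flatten(node.right, out)


def build_balanced(nodes: List[Node], lo: int, hi: int) -> Optional[Node]:
    """Relink nodes[lo:hi] into a balanced subtree rooted at the median."""
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    root = nodes[mid]
    root.left = build_balanced(nodes, lo, mid)
    root.right = build_balanced(nodes, mid + 1, hi)
    return root


def rebuild(scapegoat: Node) -> Optional[Node]:
    """Rebuild the subtree rooted at `scapegoat`; returns the new subtree root.

    The caller must relink the returned root into the scapegoat's parent
    (or make it the tree root if the scapegoat was the root).
    """
    nodes: List[Node] = []
    flatten(scapegoat, nodes)
    return build_balanced(nodes, 0, len(nodes))
```

Both passes visit each node of the subtree once, which gives the linear rebuild cost used in the aggregate analysis above.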