Editing B+ tree (section)

==Algorithms==

===Search===
We are looking for a value {{mvar|k}} in the B+ Tree. This means that starting from the root, we are looking for the leaf which may contain the value {{mvar|k}}. At each node, we figure out which internal node we should follow. An internal B+ Tree node has at most <math> m \le b </math>  children, where every one of them represents a different sub-interval. We select the corresponding child via a linear search of the {{mvar|m}} entries, then when we finally get to a leaf, we do a linear search of its {{mvar|n}} elements for the desired key. Because we only traverse one branch of all the children at each rung of the tree, we achieve <math>O(\log N)</math> runtime, where {{mvar|N}} is the total number of keys stored in the leaves of the B+ tree.<ref name=algorithms-pdf>{{cite web |url=https://www.cs.helsinki.fi/u/mluukkai/tirak2010/B-tree.pdf|title="B+ trees"|last=Pollari-Malmi|first=Kerttu|website=Computer Science, Faculty of Science, University of Helsinki|page=3|archive-url=https://web.archive.org/web/20210414050947/https://www.cs.helsinki.fi/u/mluukkai/tirak2010/B-tree.pdf|archive-date=2021-04-14}}</ref>

 '''function''' search(''k'', ''root'') '''is'''
     '''let''' leaf = leaf_search(k, root)
     '''for''' leaf_key '''in''' leaf.keys():
         '''if''' k = leaf_key:
              '''return''' true
     '''return''' false

 '''function''' leaf_search(''k'', ''node'') '''is'''
     '''if''' node is a leaf:
         '''return''' node
     '''let''' p = node.children()
     '''let''' l = node.left_sided_intervals()
     '''assert''' <math>|p| = |l| + 1</math>
     '''let''' m = p.len()
     '''for''' i '''from''' 1 '''to''' m - 1:
         '''if''' <math>k \le l[i]</math>:
             '''return''' leaf_search(k, p[i])
     '''return''' leaf_search(k, p[m])

Note that this [[pseudocode]] uses 1-based array indexing.

===Insertion===
* Perform a search to determine which node the new record should go into.
* If the node is not full (at most <math> b - 1 </math> entries after the insertion), add the record.
* Otherwise, ''before'' inserting the new record
** Split the node.
*** original node has <math>\lceil (K+1)/2 \rceil</math> items
*** new node has <math>\lfloor (K+1)/2 \rfloor</math> items
** Copy <math>\lceil (K+1)/2 \rceil</math>-th key to the parent, and insert the new node to the parent.
** Repeat until a parent is found that need not split.
** Insert the new record into the new node.
* If the root splits, treat it as if it has an empty parent and split as outlined above.
B+ trees grow at the root and not at the leaves.<ref name=Navathe />

===Bulk-loading===
Given a collection of data records, we want to create a B+ tree index on some key field. One approach is to insert each record into an empty tree. However, it is quite expensive, because each entry requires us to start from the root and go down to the appropriate leaf page. An efficient alternative is to use bulk-loading.
* The first step is to sort the data entries according to a search key in ascending order.
* We allocate an empty page to serve as the root, and insert a pointer to the first page of entries into it.
* When the root is full, we split the root, and create a new root page.
* Keep inserting entries to the right most index page just above the leaf level, until all entries are indexed.

Note:
* when the right-most index page above the leaf level fills up, it is split;
* this action may, in turn, cause a split of the right-most index page one step closer to the root;
* splits only occur on the right-most path from the root to the leaf level.<ref>{{Cite web|url=http://web.cs.ucdavis.edu/~green/courses/ecs165b-s10/Lecture6.pdf|title=ECS 165B: Database System Implementation Lecture 6|date=April 9, 2010|website=UC Davis CS department|pages=21–23}}</ref>

=== Deletion ===
The purpose of the delete algorithm is to remove the desired entry node from the tree structure. We [[Recursive definition|recursively]] call the delete algorithm on the appropriate node until no node is found. For each function call, we traverse along, using the index to navigate until we find the node, remove it, and then work back up to the root.

At entry L that we wish to remove:
* If L is at least half-full, done
* If L has only d-1 entries, try to re-distribute, borrowing from sibling (adjacent node with same parent as L).{{pb}}After the re-distribution of two sibling nodes happens, the parent node must be updated to reflect this change. The index key that points to the second sibling must take the smallest value of that node to be the index key.
* If re-distribute fails, merge L and sibling. After merging, the parent node is updated by deleting the index key that point to the deleted entry. In other words, if merge occurred, must delete entry (pointing to L or sibling) from parent of L.

Note: merge could propagate to root, which means decreasing height.<ref>{{Cite book |last=Ramakrishnan |first=Raghu |title=Database management systems |date=2003 |publisher=McGraw-Hill |author2=Johannes Gehrke |isbn=0-07-246563-8 |edition=3rd |location=Boston |oclc=49977005}}</ref>
[[File:B+-tree-remove-61.png|thumb|B+ tree deletion|none]]