==Implementation==
If each node and edge can be represented in <math>\Theta(1)</math> space, the entire tree can be represented in <math>\Theta(n)</math> space. The total length of all the strings on all of the edges in the tree is <math>O(n^2)</math>, but each edge can be stored as the position and length of a substring of {{mvar|S}}, giving a total space usage of <math>\Theta(n)</math> computer words. The worst-case space usage of a suffix tree is attained by a [[Fibonacci word]], giving the full <math>2n</math> nodes.
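
As an illustrative sketch of the position-and-length encoding described above (the names <code>Edge</code> and <code>edge_label</code> are hypothetical and not taken from any particular implementation), an edge can hold two integers and recover its label from {{mvar|S}} on demand:

```python
from dataclasses import dataclass


@dataclass
class Edge:
    """Hypothetical edge record: two integers instead of a copied label."""
    start: int   # index in S where the edge label begins
    length: int  # number of characters in the edge label


def edge_label(s: str, edge: Edge) -> str:
    """Recover the label by slicing the source string S."""
    return s[edge.start:edge.start + edge.length]


S = "banana$"
edge = Edge(start=2, length=5)
label = edge_label(S, edge)  # the substring of S covered by this edge
```

Each edge then occupies a constant number of machine words regardless of how long its label is, which is what keeps the total at <math>\Theta(n)</math> words.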

An important choice when implementing a suffix tree is how to represent the parent-child relationships between nodes. The most common representation uses [[linked list]]s called '''sibling lists''': each node has a pointer to its first child and to the next node in the child list it belongs to. Other implementations with efficient running time properties use [[hash map]]s, sorted or unsorted [[array data structure|array]]s (with [[dynamic array|array doubling]]), or [[Self-balancing binary search tree|balanced search tree]]s. The relevant costs are:
* The cost of finding the child on a given character.
* The cost of inserting a child.
* The cost of listing all children of a node (divided by the number of children in the table below).
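
A minimal sketch of the sibling-list representation and of the three operations listed above follows. It assumes each node records the first character of its incoming edge; the names <code>Node</code>, <code>insert_child</code>, <code>find_child</code> and <code>children</code> are illustrative only.

```python
from typing import Iterator, Optional


class Node:
    """Hypothetical suffix tree node using a sibling list for its children."""

    def __init__(self, first_char: str = "") -> None:
        self.first_char = first_char                 # first character of the incoming edge
        self.first_child: Optional["Node"] = None    # head of this node's child list
        self.next_sibling: Optional["Node"] = None   # next node in the parent's child list


def insert_child(parent: Node, child: Node) -> None:
    """Insert at the head of the child list: constant time."""
    child.next_sibling = parent.first_child
    parent.first_child = child


def find_child(parent: Node, c: str) -> Optional[Node]:
    """Linear scan of the child list: time proportional to the number of children."""
    node = parent.first_child
    while node is not None:
        if node.first_char == c:
            return node
        node = node.next_sibling
    return None


def children(parent: Node) -> Iterator[Node]:
    """Enumerate all children: constant time per child."""
    node = parent.first_child
    while node is not None:
        yield node
        node = node.next_sibling
```

Because a node has at most one child per alphabet symbol, the scan in <code>find_child</code> visits at most as many nodes as there are symbols in the alphabet.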

Let {{mvar|&sigma;}} be the size of the alphabet. The costs are then as follows:{{citation needed|date=October 2024}}
{| class="wikitable"
!
!Lookup
!Insertion
!Traversal
|-
| style="text-align: right;" |Sibling lists / unsorted arrays
|{{math|''O''(''σ'')}}
|{{math|Θ(1)}}
|{{math|Θ(1)}}
|-
| style="text-align: right;" |Bitwise sibling trees
|{{math|''O''(log ''σ'')}}
|{{math|Θ(1)}}
|{{math|Θ(1)}}
|-
| style="text-align: right;" |Hash maps
|{{math|Θ(1)}}
|{{math|Θ(1)}}
|{{math|''O''(''σ'')}}
|-
| style="text-align: right;" |Balanced search tree
|{{math|''O''(log ''σ'')}}
|{{math|''O''(log ''σ'')}}
|{{math|''O''(1)}}
|-
| style="text-align: right;" |Sorted arrays
|{{math|''O''(log ''σ'')}}
|{{math|''O''(''σ'')}}
|{{math|''O''(1)}}
|-
| style="text-align: right;" |Hash maps + sibling lists
|{{math|''O''(1)}}
|{{math|''O''(1)}}
|{{math|''O''(1)}}
|}

The insertion cost is amortised, and the costs for hashing are given for perfect hashing.
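
One way the combined "hash maps + sibling lists" row could be realised is sketched below, under stated assumptions: a single table keyed by (parent, character) handles lookup, while a sibling list handles traversal. The names <code>HybridNode</code> and <code>ChildIndex</code> are hypothetical, and Python's built-in <code>dict</code> stands in for a perfect hash table, so its bounds are expected rather than worst-case.

```python
class HybridNode:
    """Hypothetical node that keeps sibling-list pointers for traversal."""

    def __init__(self, first_char=""):
        self.first_char = first_char   # first character of the incoming edge
        self.first_child = None        # head of this node's child list
        self.next_sibling = None       # next node in the parent's child list


class ChildIndex:
    """Hash table for lookup combined with sibling lists for enumeration."""

    def __init__(self):
        # Maps (parent node, first character) to the child node.
        # Nodes hash by identity, so distinct parents never collide logically.
        self._table = {}

    def insert(self, parent, child):
        # Assumes the parent has no existing child for this character.
        self._table[(parent, child.first_char)] = child
        child.next_sibling = parent.first_child
        parent.first_child = child

    def find(self, parent, c):
        # Single table probe instead of a scan of the child list.
        return self._table.get((parent, c))

    def children(self, parent):
        # Walk the sibling list, so empty table slots are never visited.
        node = parent.first_child
        while node is not None:
            yield node
            node = node.next_sibling
```

The sibling list is what avoids the traversal cost of a plain hash map, since enumeration never has to scan table slots that hold no child.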

The large amount of information in each edge and node makes the suffix tree very expensive in memory: good implementations consume about 10 to 20 times the size of the source text. The [[suffix array]] reduces this requirement to a factor of 8 (for an array including [[LCP array|LCP]] values, built within a 32-bit address space with 8-bit characters). This factor depends on the character width and address size, and may reach 2 with 4-byte wide characters (needed to contain any symbol in some [[UNIX-like]] systems, see [[wchar_t]]<!--- Don't replace by "wchar t": the type name in the C programming language uses the "_". --->) on 32-bit systems.{{citation needed|date=October 2024}} Researchers have continued to find smaller indexing structures.
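
The factors of 8 and 2 quoted above can be reproduced with a short calculation. The sketch assumes one index-sized integer per text position for the suffix array and one more for the LCP array, and does not count the text itself; the function name <code>suffix_array_overhead</code> is hypothetical.

```python
def suffix_array_overhead(index_bytes: int, char_bytes: int, with_lcp: bool = True) -> float:
    """Index size divided by text size, under the assumptions stated above."""
    # One integer per position for the suffix array, plus one for the LCP array.
    bytes_per_position = index_bytes * (2 if with_lcp else 1)
    return bytes_per_position / char_bytes


# 32-bit indices (4 bytes) with 8-bit characters (1 byte).
narrow = suffix_array_overhead(index_bytes=4, char_bytes=1)

# 32-bit indices (4 bytes) with 4-byte wide characters.
wide = suffix_array_overhead(index_bytes=4, char_bytes=4)
```

Under these assumptions <code>narrow</code> evaluates to 8 and <code>wide</code> to 2, matching the figures in the text.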