Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Suffix tree
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
==History== The concept was first introduced by {{harvtxt|Weiner|1973}}. Rather than the suffix <math>S[i..n]</math>, Weiner stored in his trie<ref>This term is used here to distinguish Weiner's precursor data structures from proper suffix trees as defined [[#Definition|above]] and unconsidered before {{harvtxt|McCreight|1976}}.</ref> the ''prefix identifier'' for each position, that is, the shortest string starting at <math>i</math> and occurring only once in <math>S</math>. His ''Algorithm D'' takes an uncompressed<ref>i.e., with each branch labelled by a single character</ref> trie for <math>S[k+1..n]</math> and extends it into a trie for <math>S[k..n]</math>. This way, starting from the trivial trie for <math>S[n..n]</math>, a trie for <math>S[1..n]</math> can be built by <math>n - 1</math> successive calls to Algorithm D; however, the overall run time is <math>O(n^2)</math>. Weiner's ''Algorithm B'' maintains several auxiliary data structures, to achieve an overall run time linear in the size of the constructed trie. The latter can still be <math>O(n^2)</math> nodes, e.g. for <math>S = a^n b^n a^n b^n \$ .</math> Weiner's ''Algorithm C'' finally uses compressed tries to achieve linear overall storage size and run time.<ref>See [[:File:WeinerB aaaabbbbaaaabbbb.gif]] and [[:File:WeinerC aaaabbbbaaaabbbb.gif]] for an uncompressed example tree and its compressed correspondant.</ref> [[Donald Knuth]] subsequently characterized the latter as "Algorithm of the Year 1973" according to his student [[Vaughan Pratt]].{{OR|reason=Weiner gave several algorithms; Knuth probably referred to Algorithm C.|date=February 2020}}{{sfnp|Giegerich|Kurtz|1997}} The text book {{harvtxt|Aho|Hopcroft|Ullman|1974|loc=Sect.9.5}} reproduced Weiner's results in a simplified and more elegant form, introducing the term ''position tree''. {{harvtxt|McCreight|1976}} was the first to build a (compressed) trie of all suffixes of <math>S</math>. Although the suffix starting at <math>i</math> is usually longer than the prefix identifier, their path representations in a compressed trie do not differ in size. On the other hand, McCreight could dispense with most of Weiner's auxiliary data structures; only suffix links remained. {{harvtxt|Ukkonen|1995}} further simplified the construction.{{sfnp|Giegerich|Kurtz|1997}} He provided the first online-construction of suffix trees, now known as [[Ukkonen's algorithm]], with running time that matched the then fastest algorithms. These algorithms are all linear-time for a constant-size alphabet, and have worst-case running time of <math>O(n\log n)</math> in general. {{harvtxt|Farach|1997}} gave the first suffix tree construction algorithm that is optimal for all alphabets. In particular, this is the first linear-time algorithm for strings drawn from an alphabet of integers in a polynomial range. Farach's algorithm has become the basis for new algorithms for constructing both suffix trees and [[suffix array]]s, for example, in external memory, compressed, succinct, etc.
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)