Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Suffix tree
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
==Definition== The suffix tree for the string <math>S</math> of length <math>n</math> is defined as a tree such that:<ref>{{harvtxt|Gusfield|1999}}, p.90.</ref> # The tree has exactly n leaves numbered from <math>1</math> to <math>n</math>. # Except for the root, every [[Tree (data structure)#Terminology|internal node]] has at least two children. # Each edge is labelled with a non-empty substring of <math>S</math>. # No two edges starting out of a node can have string-labels beginning with the same character. # The string obtained by concatenating all the string-labels found on the path from the root to leaf <math>i</math> spells out suffix <math>S[i..n]</math>, for <math>i</math> from <math>1</math> to <math>n</math>. If a suffix of <math>S</math> is also the prefix of another suffix, such a tree does not exist for the string. For example, in the string ''abcbc'', the suffix ''bc'' is also a prefix of the suffix ''bcbc''. In such a case, the path spelling out ''bc'' will not end in a leaf, violating the fifth rule. To fix this problem, <math>S</math> is padded with a terminal symbol not seen in the string (usually denoted <code>$</code>). This ensures that no suffix is a prefix of another, and that there will be <math>n</math> leaf nodes, one for each of the <math>n</math> suffixes of <math>S</math>.<ref>{{harvtxt|Gusfield|1999}}, p.90-91.</ref> Since all internal non-root nodes are branching, there can be at most <math>n-1</math> such nodes, and <math>n+(n-1)+1=2n</math> nodes in total (<math>n</math> leaves, <math>n-1</math> internal non-root nodes, 1 root). '''Suffix links''' are a key feature for older linear-time construction algorithms, although most newer algorithms, which are based on [[Farach's algorithm]], dispense with suffix links. In a complete suffix tree, all internal non-root nodes have a suffix link to another internal node. If the path from the root to a node spells the string <math>\chi\alpha</math>, where <math>\chi</math> is a single character and <math>\alpha</math> is a string (possibly empty), it has a suffix link to the internal node representing <math>\alpha</math>. See for example the suffix link from the node for <code>ANA</code> to the node for <code>NA</code> in the figure above. Suffix links are also used in some algorithms running on the tree. A [[generalized suffix tree]] is a suffix tree made for a set of strings instead of a single string. It represents all suffixes from this set of strings. Each string must be terminated by a different termination symbol.
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)