==External construction==

Although the memory usage of a suffix tree is linear in the length of the
text, it is significantly higher than the actual size of the sequence
collection.  For a large text, construction may require external-memory
approaches.

There are theoretical results for constructing suffix trees in external
memory.
The algorithm by {{harvtxt|Farach-Colton|Ferragina|Muthukrishnan|2000}}
is theoretically optimal, with an I/O complexity equal to that of sorting.
However, the overall intricacy of this algorithm has so far prevented its
practical implementation.{{sfnp|Smyth|2003}}
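
For reference, the sorting bound mentioned above is usually stated in the standard external-memory (I/O) model. Assuming that model, and writing ''N'' for the input length, ''M'' for the main-memory size, and ''B'' for the disk-block size (notation introduced here for illustration, not taken from the cited sources), sorting ''N'' items costs

<math display="block">\Theta\!\left(\frac{N}{B}\,\log_{M/B}\frac{N}{B}\right)</math>

block transfers, which is the I/O cost that a sorting-optimal construction would match.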

On the other hand, there are practical methods for constructing
disk-based suffix trees
that scale to inputs of a few gigabytes, with construction times on the order of hours.
The state-of-the-art methods are TDD,<ref name="tdd">{{harvtxt|Tata|Hankins|Patel|2003}}.</ref>
TRELLIS,<ref name="trellis">{{harvtxt|Phoophakdee|Zaki|2007}}.</ref>
DiGeST,<ref name="digest">{{harvtxt|Barsky|Stege|Thomo|Upton|2008}}.</ref>
and
B<sup>2</sup>ST.<ref name="b2st">{{harvtxt|Barsky|Stege|Thomo|Upton|2009}}.</ref>

TDD and TRELLIS scale up to the entire human genome, resulting in a disk-based suffix tree of a size in the tens of gigabytes.<ref name="tdd" /><ref name="trellis" /> However, these methods cannot efficiently handle collections of sequences exceeding 3&nbsp;GB.<ref name="digest" /> DiGeST performs significantly better and is able to handle collections of sequences on the order of 6&nbsp;GB in about 6 hours.<ref name="digest" />

All these methods can efficiently build suffix trees when the
tree does not fit in main memory
but the input does.
The most recent method, B<sup>2</sup>ST,<ref name="b2st" /> scales to handle
inputs that do not fit in main memory. ERA is a recent parallel suffix-tree construction method that is significantly faster. ERA can index the entire human genome in 19 minutes on an 8-core desktop computer with 16&nbsp;GB RAM. On a simple Linux cluster with 16 nodes (4&nbsp;GB RAM per node), ERA can index the entire human genome in less than 9 minutes.{{sfnp|Mansour|Allam|Skiadopoulos|Kalnis|2011}}