Editing Burrows–Wheeler transform (section)

==Optimization==
A number of [[Optimization (computer science)|optimizations]] can make these algorithms run more efficiently without changing the output. There is no need to represent the table in either the encoder or decoder. In the encoder, each row of the table can be represented by a single pointer into the strings, and the sort performed using the indices. In the decoder, there is also no need to store the table, and the decoded string can be generated one character at a time from left to right. Comparative sorting can even be avoided in favor of linear sorting, with performance proportional to the alphabet size and string length. A "character" in the algorithm can be a byte, or a bit, or any other convenient size.

One may also make the observation that mathematically, the encoded string can be computed as a simple modification of the [[suffix array]], and suffix arrays can be computed with linear time and memory. The BWT can be defined with regards to the suffix array SA of text T as (1-based indexing):

<math display=block>BWT[i] = \begin{cases} T[SA[i]-1], & \text{if }SA[i] > 0\\ \$, & \text{otherwise}\end{cases}</math><ref>{{Cite journal|last1=Simpson|first1=Jared T.|last2=Durbin|first2=Richard|date=2010-06-15|title=Efficient construction of an assembly string graph using the FM-index|journal=Bioinformatics|language=en|volume=26|issue=12|pages=i367–i373|doi=10.1093/bioinformatics/btq217|issn=1367-4803|pmc=2881401|pmid=20529929}}</ref>

There is no need to have an actual 'EOF' character. Instead, a pointer can be used that remembers where in a string the 'EOF' would be if it existed. In this approach, the output of the BWT must include both the transformed string, and the final value of the pointer. The inverse transform then shrinks it back down to the original size: it is given a string and a pointer, and returns just a string.

A complete description of the algorithms can be found in Burrows and Wheeler's paper, or in a number of online sources.<ref name=Burrows1994/> The algorithms vary somewhat by whether EOF is used, and in which direction the sorting was done. In fact, the original formulation did not use an EOF marker.<ref name=Manzini>{{Cite book |chapter-url=https://people.unipmn.it/manzini/papers/mfcs99x.pdf |archive-url=https://ghostarchive.org/archive/20221009/https://people.unipmn.it/manzini/papers/mfcs99x.pdf |archive-date=2022-10-09 |url-status=live <!-- |url=https://books.google.com/books?id=OcJjpqAi15EC&pg=PA34e --> |title=Mathematical Foundations of Computer Science 1999: 24th International Symposium, MFCS'99 Szklarska Poreba, Poland, September 6-10, 1999 Proceedings |chapter=The Burrows–Wheeler Transform: Theory and Practice |last=Manzini |first=Giovanni |date=1999-08-18 |publisher=Springer Science & Business Media |isbn=9783540664086 |language=en}}</ref>