== Variations ==

Many variations of Huffman coding exist,<ref>{{cite book |first=J. |last=Abrahams |others=Division of Mathematics, Computer & Information Sciences, [[Office of Naval Research]] (ONR) |title=Proceedings. Compression and Complexity of SEQUENCES 1997 (Cat. No.97TB100171) |chapter=Code and parse trees for lossless source encoding |location=Arlington, VA, USA |pages=145–171 |date=1997-06-11 |isbn=0-8186-8132-2 |publication-place=Salerno |doi=10.1109/SEQUEN.1997.666911 |publisher=[[IEEE]] |citeseerx=10.1.1.589.4726 |s2cid=124587565 }}</ref> some of which use a Huffman-like algorithm, and others of which find optimal prefix codes (while, for example, putting different restrictions on the output). Note that, in the latter case, the method need not be Huffman-like, and, indeed, need not even run in [[polynomial time]].

=== ''n''-ary Huffman coding ===

The '''''n''-ary Huffman''' algorithm uses an alphabet of size ''n'', typically {0, 1, ..., ''n'' − 1}, to encode messages and build an ''n''-ary tree. This approach was considered by Huffman in his original paper. The same algorithm applies as for binary (<math alt="''n'' equals 2">n = 2</math>) codes, but instead of combining the two least likely symbols, the ''n'' least likely symbols are grouped together. Note that for ''n'' > 2, not all sets of source words can properly form a complete ''n''-ary tree for Huffman coding. In these cases, additional placeholder symbols with 0 probability may need to be added. This is because the structure of the tree needs to repeatedly join ''n'' branches into one, also known as an "''n'' to 1" combination. For binary coding, this is a "2 to 1" combination, which works with any number of symbols. For ''n''-ary coding, a complete tree is only possible when the total number of symbols (real + placeholders) leaves a remainder of 1 when divided by ''n'' − 1.<ref name=":0" />

=== Adaptive Huffman coding ===

A variation called '''[[adaptive Huffman coding]]''' involves calculating the probabilities dynamically based on recent actual frequencies in the sequence of source symbols, and changing the coding tree structure to match the updated probability estimates. It is used rarely in practice, since the cost of updating the tree makes it slower than optimized [[Arithmetic coding#Adaptive arithmetic coding|adaptive arithmetic coding]], which is more flexible and has better compression.

=== Huffman template algorithm ===

Most often, the weights used in implementations of Huffman coding represent numeric probabilities, but the algorithm given above does not require this; it requires only that the weights form a [[total order|totally ordered]] [[Monoid#Commutative monoid|commutative monoid]], meaning a way to order weights and to add them. The '''Huffman template algorithm''' enables one to use any kind of weights (costs, frequencies, pairs of weights, non-numerical weights) and one of many combining methods (not just addition). Such algorithms can solve other minimization problems, such as minimizing <math>\max_i\left[w_{i}+\mathrm{length}\left(c_{i}\right)\right]</math>, a problem first applied to circuit design.
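For illustration, the following Python sketch implements the ordinary Huffman tree-building loop with the weight-combining rule supplied as a parameter, which is the essence of the template view; the symbol names and weights are invented for this example and are not taken from any cited source.

<syntaxhighlight lang="python">
import heapq
import itertools

def huffman_template(weights, combine=lambda a, b: a + b):
    """Build a Huffman-style prefix code with an arbitrary combining rule.

    weights: dict mapping symbol -> weight (any values that can be ordered
             and merged by `combine`); ordinary addition of frequencies
             gives the classical Huffman code.
    Returns a dict mapping symbol -> codeword (string of '0'/'1').
    """
    tiebreak = itertools.count()   # keeps the heap from comparing symbol groups
    heap = [(w, next(tiebreak), (sym,)) for sym, w in weights.items()]
    heapq.heapify(heap)
    codes = {sym: "" for sym in weights}
    while len(heap) > 1:
        w1, _, group1 = heapq.heappop(heap)   # two least-weight subtrees
        w2, _, group2 = heapq.heappop(heap)
        for sym in group1:                    # first subtree gets a leading '0'
            codes[sym] = "0" + codes[sym]
        for sym in group2:                    # second subtree gets a leading '1'
            codes[sym] = "1" + codes[sym]
        heapq.heappush(heap, (combine(w1, w2), next(tiebreak), group1 + group2))
    return codes

# With addition as the combining rule this is plain Huffman coding;
# the result here is a prefix code with codeword lengths 1, 2, 3, 4, 4.
print(huffman_template({'a': 0.45, 'b': 0.25, 'c': 0.15, 'd': 0.10, 'e': 0.05}))
</syntaxhighlight>

Supplying a different combining rule changes the objective; for example, a rule of the form <code>lambda a, b: max(a, b) + 1</code> has been used in the literature for min–max objectives of the kind mentioned above, although the details are beyond this sketch.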
=== {{anchor|Length-limited Huffman coding|Minimum variance Huffman coding}}Length-limited Huffman coding/minimum variance Huffman coding ===

'''Length-limited Huffman coding''' is a variant where the goal is still to achieve a minimum weighted path length, but there is an additional restriction that the length of each codeword must be less than a given constant. The [[package-merge algorithm]] solves this problem with a simple [[Greedy algorithm|greedy]] approach very similar to that used by Huffman's algorithm. Its time complexity is <math>O(nL)</math>, where <math>L</math> is the maximum length of a codeword. No algorithm is known to solve this problem in [[Big O notation#Orders of common functions|<math>O(n)</math> or <math>O(n\log n)</math>]] time, unlike the presorted and unsorted conventional Huffman problems, respectively.
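For illustration, the following Python sketch shows one straightforward way to carry out the package-merge computation for a small alphabet; the symbols and weights are invented for this example, and the code assumes <math>2^L \ge n</math> so that a code with the required maximum length exists.

<syntaxhighlight lang="python">
def package_merge(weights, L):
    """Compute length-limited codeword lengths (a sketch of package-merge).

    weights: dict mapping symbol -> weight; L: maximum codeword length.
    Returns a dict mapping symbol -> codeword length.
    """
    n = len(weights)
    assert 2 ** L >= n, "L is too small for this alphabet"
    # One entry per symbol, kept as (weight, symbols-contained), sorted by weight.
    items = sorted(((w, (sym,)) for sym, w in weights.items()), key=lambda t: t[0])
    packages = list(items)                      # working list for codeword level L
    for _ in range(L - 1):
        # "Package": pair up adjacent entries, adding weights and pooling symbols.
        paired = [(packages[i][0] + packages[i + 1][0],
                   packages[i][1] + packages[i + 1][1])
                  for i in range(0, len(packages) - 1, 2)]
        # "Merge": combine the new packages with the original items by weight.
        packages = sorted(items + paired, key=lambda t: t[0])
    # Keep the 2n - 2 cheapest entries of the final list; every kept entry
    # lengthens the codeword of each symbol it contains by one bit.
    lengths = dict.fromkeys(weights, 0)
    for _, syms in packages[:2 * n - 2]:
        for sym in syms:
            lengths[sym] += 1
    return lengths

# With L = 3 the limit is not binding, so the lengths match ordinary Huffman coding.
print(package_merge({'a': 1, 'b': 2, 'c': 4, 'd': 8}, L=3))  # {'a': 3, 'b': 3, 'c': 2, 'd': 1}
# Tightening the limit to L = 2 forces every codeword to two bits.
print(package_merge({'a': 1, 'b': 2, 'c': 4, 'd': 8}, L=2))  # {'a': 2, 'b': 2, 'c': 2, 'd': 2}
</syntaxhighlight>

The repeated sorts above are used only for brevity; because both lists being combined are already sorted, a linear-time merge suffices at each level, which is what keeps the overall bound at <math>O(nL)</math>.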
=== Huffman coding with unequal letter costs ===

In the standard Huffman coding problem, it is assumed that each symbol in the set that the code words are constructed from has an equal cost to transmit: a code word whose length is ''N'' digits will always have a cost of ''N'', no matter how many of those digits are 0s, how many are 1s, etc. When working under this assumption, minimizing the total cost of the message and minimizing the total number of digits are the same thing.

''Huffman coding with unequal letter costs'' is the generalization without this assumption: the letters of the encoding alphabet may have non-uniform lengths, due to characteristics of the transmission medium. An example is the encoding alphabet of [[Morse code]], where a 'dash' takes longer to send than a 'dot', and therefore the cost of a dash in transmission time is higher. The goal is still to minimize the weighted average codeword length, but it is no longer sufficient just to minimize the number of symbols used by the message. No algorithm is known to solve this in the same manner or with the same efficiency as conventional Huffman coding, though it has been solved by [[Richard M. Karp]],<ref>{{Cite journal |last=Karp |first=Richard M. |author-link=Richard M. Karp |title=Minimum-redundancy coding for the discrete noiseless channel |url=https://ieeexplore.ieee.org/document/1057615 |journal=IRE Transactions on Information Theory |publisher=IEEE |publication-date=31 January 1961 |volume=7 |issue=1 |pages=27–38 |doi=10.1109/TIT.1961.1057615 |url-access=subscription }}</ref> whose solution has been refined for the case of integer costs by Mordecai J. Golin.<ref>{{Cite journal |last=Golin |first=Mordekai J. |date=January 1998 |publication-date=1 September 1998 |title=A Dynamic Programming Algorithm for Constructing Optimal Prefix-Free Codes with Unequal Letter Costs |url=https://page.mi.fu-berlin.de/rote/Papers/pdf/A+dynamic+programming+algorithm+for+constructing+optimal+prefix-free+codes+for+unequal+letter+costs.pdf |access-date=10 September 2024 |s2cid=2265146 |doi=10.1109/18.705558 |journal=IEEE Transactions on Information Theory |volume=44 |issue=5 |pages=1770–1781}}</ref>

=== {{anchor|Hu-Tucker coding|Alphabetic Huffman coding|Alphabetic Huffman tree}}Optimal alphabetic binary trees (Hu–Tucker coding) ===

In the standard Huffman coding problem, it is assumed that any codeword can correspond to any input symbol. In the alphabetic version, the alphabetic order of inputs and outputs must be identical. Thus, for example, <math>A = \left\{a,b,c\right\}</math> could not be assigned code <math>H\left(A,C\right) = \left\{00,1,01\right\}</math>, but instead should be assigned either <math>H\left(A,C\right) = \left\{00,01,1\right\}</math> or <math>H\left(A,C\right) = \left\{0,10,11\right\}</math>. This is also known as the '''Hu–Tucker''' problem, after [[T. C. Hu]] and [[Alan Tucker]], the authors of the paper presenting the first [[Time complexity#Quasilinear time|<math>O(n\log n)</math>-time]] solution to this optimal binary alphabetic problem,<ref>{{Cite journal | doi = 10.1137/0121057| title = Optimal Computer Search Trees and Variable-Length Alphabetical Codes| journal = SIAM Journal on Applied Mathematics| volume = 21| issue = 4| pages = 514| year = 1971| last1 = Hu | first1 = T. C.|author1-link=T. C. Hu |last2 = Tucker | first2 = A. C. | author2-link = Alan Tucker| jstor = 2099603}}</ref> which has some similarities to the Huffman algorithm but is not a variation of it. A later method, the [[Garsia–Wachs algorithm]] of [[Adriano Garsia]] and [[Michelle L. Wachs]] (1977), uses simpler logic to perform the same comparisons in the same total time bound. These optimal alphabetic binary trees are often used as [[binary search tree]]s.<ref>{{citation | last = Knuth | first = Donald E. | author-link = Donald Knuth | contribution = Algorithm G (Garsia–Wachs algorithm for optimum binary trees) | edition = 2nd | pages = 451–453 | publisher = Addison–Wesley | title = The Art of Computer Programming, Vol. 3: Sorting and Searching | year = 1998}}. See also History and bibliography, pp. 453–454.</ref>

=== The canonical Huffman code ===

{{main|canonical Huffman code|l1=Canonical Huffman code}}

If weights corresponding to the alphabetically ordered inputs are in numerical order, the Huffman code has the same lengths as the optimal alphabetic code, which can be found from calculating these lengths, rendering Hu–Tucker coding unnecessary. The code resulting from numerically (re-)ordered input is sometimes called the ''canonical Huffman code'' and is often the code used in practice, due to ease of encoding/decoding. The technique for finding this code is sometimes called '''Huffman–Shannon–Fano coding''', since it is optimal like Huffman coding, but alphabetic in weight probability, like [[Shannon–Fano coding]]. The Huffman–Shannon–Fano code corresponding to the example is <math>\{000,001,01,10,11\}</math>, which, having the same codeword lengths as the original solution, is also optimal. But in the canonical Huffman code, the result is <math>\{110,111,00,01,10\}</math>.
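For illustration, the following Python sketch assigns canonical codewords from the codeword lengths alone; the symbol labels <code>a</code>–<code>e</code> are placeholders standing for the five symbols of the example, taken in order.

<syntaxhighlight lang="python">
def canonical_code(lengths):
    """Assign canonical Huffman codewords given only the codeword lengths.

    lengths: dict mapping symbol -> codeword length (assumed to satisfy
    the Kraft inequality). Returns a dict mapping symbol -> bit string.
    """
    codes = {}
    code = 0
    prev_len = 0
    # Shorter codewords are assigned first; ties are broken by symbol order.
    for sym, length in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        code <<= length - prev_len          # append zeros whenever the length grows
        codes[sym] = format(code, "0{}b".format(length))
        code += 1                           # the next codeword is the binary successor
        prev_len = length
    return codes

# Codeword lengths from the example above: two codewords of length 3, three of length 2.
print(canonical_code({'a': 3, 'b': 3, 'c': 2, 'd': 2, 'e': 2}))
# {'c': '00', 'd': '01', 'e': '10', 'a': '110', 'b': '111'}
</syntaxhighlight>

Because the whole code is recoverable from the sequence of lengths, only those lengths need to be stored or transmitted, which accounts for the ease of encoding and decoding noted above.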