Editing Elias omega coding

{{Short description|Universal code encoding positive integers}}
{{Use dmy dates|date=May 2019|cs1-dates=y}}
'''Elias ω coding''' or '''Elias omega coding''' is a [[universal code (data compression)|universal code]] encoding the positive integers developed by [[Peter Elias]]. Like [[Elias gamma coding]] and [[Elias delta coding]], it works by prefixing the positive integer with a representation of its [[order of magnitude]] in a universal code. Unlike those other two codes, however, Elias omega recursively encodes that prefix; thus, they are sometimes known as '''recursive Elias codes'''.

Omega coding is used in applications where the largest encoded value is not known ahead of time, or to [[Data compression|compress]] data in which small values are much more frequent than large values.

To encode a [[positive integer]] ''N'':
#Place a "0" at the end of the code.
#If ''N''&nbsp;= 1, stop; encoding is complete.
#Prepend the [[binary numeral system|binary]] representation of ''N'' to the beginning of the code. This will be at least two bits, the first bit of which is a 1.
#Let ''N'' equal the number of bits just prepended, minus one.
#Return to Step 2 to prepend the encoding of the new ''N''.

To decode an Elias omega-encoded positive integer:
#Start with a variable ''N'', set to a value of 1.
#If the next bit is a "0" then stop.  The decoded number is ''N''.
#If the next bit is a "1" then read it plus ''N'' more bits, and use that binary number as the new value of ''N''.  Go back to Step 2.

==Examples==
Omega codes can be thought of as a number of "groups". A group is either a single 0 bit, which terminates the code, or two or more bits beginning with 1, which is followed by another group.

The first few codes are shown below. Included is the so-called ''implied distribution'', describing the distribution of values for which this coding yields a minimum-size code; see [[Universal code (data compression)#Relationship to practical compression|Relationship of universal codes to practical compression]] for details.

{| class="wikitable"
!Value !! Code !! Implied probability
|-
|  1 || 0                || 1/2
|-
|  2 || 10 0             || 1/8
|-			    
|  3 || 11 0             || 1/8
|-			    
|  4 || 10 100 0         || 1/64
|-			    
|  5 || 10 101 0         || 1/64
|-			    
|  6 || 10 110 0         || 1/64
|-			    
|  7 || 10 111 0         || 1/64
|-			    
|  8 || 11 1000 0        || 1/128
|-			    
|  9 || 11 1001 0        || 1/128
|-			    
| 10 || 11 1010 0        || 1/128
|-			    
| 11 || 11 1011 0        || 1/128
|-			    
| 12 || 11 1100 0        || 1/128
|-			    
| 13 || 11 1101 0        || 1/128
|-			    
| 14 || 11 1110 0        || 1/128
|-			    
| 15 || 11 1111 0        || 1/128
|-			    
| 16 || 10 100 10000 0   || 1/2048
|-			    
| 17 || 10 100 10001 0   || 1/2048
|-
|colspan="3"|...
|-
| 100 || 10 110 1100100 0   || 1/8192
|-
| 1000 || 11 1001 1111101000 0   || 1/131,072
|-
| 10,000 || 11 1101 10011100010000 0 || 1/2,097,152
|-
| 100,000 || 10 100 10000 11000011010100000 0 || 1/268,435,456
|-
| 1,000,000 || {{nowrap|10 100 10011 11110100001001000000 0}} || 1/2,147,483,648
|}

The encoding for 1 [[googol]], 10<sup>100</sup>, is 11 1000 101001100 (15 bits of length header) followed by the 333-bit binary representation of 1 googol, which is 10010 01001001 10101101 00100101 10010100 11000011 01111100 11101011 00001011 00100111 10000100 11000100 11001110 00001011 11110011 10001010 11001110 01000000 10001110 00100001 00011010 01111100 10101010 10110010 01000011 00001000 10101000 00101110 10001111 00010000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 and a trailing 0, for a total of 349 bits.

A googol to the hundredth power (10<sup>10000</sup>) is a 33,220-bit binary number. Its omega encoding is 33,243 bits long: 11 1111 1000000111000100 (22 bits), followed by 33,220 bits of the value, and a trailing 0.  Under [[Elias delta coding]], the same number is 33,250 bits long: 000000000000000 1000000111000100 (31 bits) followed by 33,219 bits of the value. The omega and delta coding are, respectively, 0.07% and 0.09% longer than the ordinary 33,220-bit binary representation of the number.

== Code length ==
For the encoding of a positive integer {{math|''N''}}, the number of bits needed, {{math|''B''(''N'')}}, is recursively:<math display="block">
\begin{align}B(0) & = 0\,, \\
B(N) & = 1 + \lfloor \log_2(N) \rfloor + B(\lfloor \log_2(N) \rfloor)\,.
\end{align}</math>That is, the length of the Elias omega code for the integer <math>N</math> is<math display="block">1 + (1 + \lfloor \log_2 N \rfloor{}) + (1 + \lfloor \log_2  \lfloor \log_2 N \rfloor{} \rfloor{}) + \cdots </math>where the number of terms in the sum is bounded above by the [[Iterated logarithm|binary iterated logarithm]].
To be precise, let <math>f(x) = \lfloor \log_2 x \rfloor{} </math>. We have <math>N > f(N) > f(f(N)) > \cdots > f^k(N) = 1 </math> for some <math>k </math>, and the length of the code is <math>(k+1) + f(N) + f^2(N) + \dots + f^k(N) </math>. Since <math>f(x) \leq \log_2 x </math>, we have <math>k \leq \log_2^*(N)  </math>.

Since the iterated logarithm grows slower than all <math>\log_2^n N  </math> for any fixed <math>n</math>, the asymptotic growth rate is <math>\Theta(\log_2 N + \log_2^2 N + \cdots) </math>, where the sum terminates when it drops below one.

=== Asymptotic optimality ===
Elias omega coding is an asymptotically optimal prefix code.<ref>{{Cite journal |last=Elias |first=P. |date=March 1975 |title=Universal codeword sets and representations of the integers |url=https://ieeexplore.ieee.org/document/1055349 |journal=IEEE Transactions on Information Theory |language=en |volume=21 |issue=2 |pages=194–203 |doi=10.1109/TIT.1975.1055349 |issn=0018-9448|url-access=subscription }}</ref>

'''Proof sketch.''' A prefix code must satisfy the [[Kraft–McMillan inequality|Kraft inequality]]. For the Elias omega coding, the Kraft inequality states<math display="block">\sum_{n=1}^\infty \frac{1}{2^{O(1) + \log_2 n + \log_2\log_2 n + \cdots}} \leq 1 </math>Now, the summation is asymptotically the same as an integral, giving us<math display="block">\int_1^\infty \frac{dx}{x \times \ln x \times \ln \ln x \cdots} \leq O(1)  </math>If the denominator terminates at some point <math>\ln^k x</math>, then the integral diverges as <math>\ln^{k+1} \infty</math>. However, if the denominator terminates at some point <math>(\ln^k x)^{1+\epsilon}</math>, then the integral converges as <math>\frac{1}{(\ln^{k} \infty)^\epsilon} </math>. The Elias omega code is on the edge between diverging and converging.

== Example code ==

=== Encoding ===
<syntaxhighlight lang="cpp" line="1">// elias_omega_encode is written as a template function taking in an integer
// type parameter, because Integer can be a big integer class, and Elias omega
// encoding will still encode it extremely efficiently (10^10000 only uses 23
// bits more than the binary representation of 10^10000)

#include <vector>

template <class Integer>
constexpr std::vector<bool> little_endian_binary(Integer num){
    std::vector<bool> bits{};
    while (num != 0){
        bits.push_back(num & 0b1u);
        num >>= 1;
    }
    return bits;
}

constexpr std::vector<bool> concat(std::vector<bool> l, std::vector<bool> r){
    for (bool const i : r){
        l.push_back(i);
    }
    return l;
}

////////////////////////////////////////////////////////////////////////////////
// TEMPLATE PARAMETERS                                                        //
// Integer: Type of num                                                       //
////////////////////////////////////////////////////////////////////////////////
// PARAMETERS                                                                 //
// num: Integer storing a positive integer to convert to Elias omega encoding //
////////////////////////////////////////////////////////////////////////////////
// RETURNS                                                                    //
// std::vector<bool> storing Elias omega encoding of num                      //
////////////////////////////////////////////////////////////////////////////////
// EXCEPTIONS                                                                 //
// std::range_error if num is <= 0, as num must be >= 1                       //
// std::bad_alloc if memory runs out                                          //
// Any exception thrown by Integer bitwise or conversion operators            //
////////////////////////////////////////////////////////////////////////////////
template <class Integer>
constexpr std::vector<bool> elias_omega_encode(Integer num){
    if (num <= 0){
        throw std::range_error("elias_omega_encode expects a value >= 1");
    }
    std::vector<bool> bits = {false};
    while (num != 1){
        std::vector<bool> le_binary = little_endian_binary(std::move(num));
        auto const sizem1 = le_binary.size() - 1;
        bits = concat(std::move(le_binary), std::move(bits));
        num = static_cast<Integer>(sizem1);
    }
    return bits;
}</syntaxhighlight>

=== Decoding ===
<syntaxhighlight lang="cpp" line="1">// elias_omega_decode is written as a template function taking in an integer
// type parameter, because Integer can be a big integer class, and Elias omega
// encoding will still encode it extremely efficiently (10^10000 only uses 23
// bits more than the binary representation of 10^10000)

#include <type_traits>
#include <vector>

////////////////////////////////////////////////////////////////////////////////
// TEMPLATE PARAMETERS                                                        //
// Integer: Type of integer to convert from Elias omega encoding              //
// GetNextBit: Type of get_next_bit                                           //
////////////////////////////////////////////////////////////////////////////////
// PARAMETERS                                                                 //
// get_next_bit: Function yielding a bool from some source if given no        //
// arguments, should throw an exception if this cannot be done                //
////////////////////////////////////////////////////////////////////////////////
// RETURNS                                                                    //
// Integer decoded from the stream of return values provided by get_next_bit  //
////////////////////////////////////////////////////////////////////////////////
// EXCEPTIONS                                                                 //
// std::bad_alloc if memory runs out                                          //
// Any exception thrown by operator() of GetNextBit                           //
// Any exception thrown by Integer bitwise or conversion operators            //
////////////////////////////////////////////////////////////////////////////////
template <class Integer, class GetNextBit>
requires std::is_invocable_r_v<bool, GetNextBit&>
constexpr Integer elias_omega_decode(GetNextBit get_next_bit){
    Integer result = static_cast<Integer>(1);
    while (std::invoke_r<bool>(get_next_bit)){
        Integer new_result = static_cast<Integer>(1);
        while (result != 0){
            new_result <<= 1;
            if (std::invoke_r<bool>(get_next_bit)){
                new_result |= 0b1u;
            }
            --result;
        }
        result = std::move(new_result);
    }
    return result;
}</syntaxhighlight>

== Generalizations ==
{{See also|Variable-length quantity#Zigzag encoding}}
<!-- This section is linked from [[Elias gamma coding]] -->
Elias omega coding does not encode zero or negative integers.
One way to encode all non-negative integers is to add 1 before encoding and then subtract 1 after decoding, or use the very similar [[Levenshtein coding]].
One way to encode all integers is to set up a [[bijection]], mapping all integers (0, 1, -1, 2, -2, 3, -3, ...) to strictly positive integers (1, 2, 3, 4, 5, 6, 7, ...) before encoding.

== See also ==
* [[Elias delta coding|Elias delta (δ) coding]]
* [[Elias gamma coding|Elias gamma (γ) coding]]
* [[Even-Rodeh coding]]

==References==
{{Reflist}}

==Further reading==
* {{cite journal |author-first=Peter |author-last=Elias |author-link=Peter Elias |title=Universal codeword sets and representations of the integers |journal=[[IEEE Transactions on Information Theory]] |volume=21 |issue=2 |pages=194–203 |date=March 1975 |doi=10.1109/tit.1975.1055349}}
* {{cite encyclopedia |author-last=Fenwick |author-first=Peter |chapter=Universal Codes |title=Lossless Compression Handbook |editor-last=Sayood |editor-first=Khalid |isbn=978-0123907547 |doi=10.1016/B978-012620861-0/50004-8 |date=2003 |pages=55–78 |publisher=[[Academic Press]] |location=New York, NY, USA}}

==External links==
* [https://gist.github.com/483003 Implementation in Python]

{{Compression Methods}}

{{DEFAULTSORT:Elias Omega Coding}}
[[Category:Entropy coding]]
[[Category:Numeral systems]]
[[Category:Data compression]]