{{short description|Form of entropy encoding used in data compression}}
{{Use dmy dates|date=December 2019}}
{{more footnotes needed|date=September 2016}}
'''Arithmetic coding''' ('''AC''') is a form of [[entropy encoding]] used in [[lossless data compression]]. Normally, a [[String (computer science)|string of characters]] is represented using a fixed number of [[bit]]s per character, as in the [[American Standard Code for Information Interchange|ASCII]] code. When a string is converted to arithmetic encoding, frequently used characters will be stored with fewer bits and not-so-frequently occurring characters will be stored with more bits, resulting in fewer bits used in total. Arithmetic coding differs from other forms of entropy encoding, such as [[Huffman coding]], in that rather than separating the input into component symbols and replacing each with a code, arithmetic coding encodes the entire message into a single number, an [[arbitrary-precision arithmetic|arbitrary-precision]] fraction ''q'', where {{nowrap|0.0 ≤ ''q'' < 1.0}}. It represents the current information as a range, defined by two numbers.<ref name="LiDrew2014">{{cite book|author1=Ze-Nian Li|author2=Mark S. Drew|author3=Jiangchuan Liu|title=Fundamentals of Multimedia|url=https://books.google.com/books?id=R6vBBAAAQBAJ|date=9 April 2014|publisher=Springer Science & Business Media|isbn=978-3-319-05290-8}}</ref>

A recent family of entropy coders called [[asymmetric numeral systems]] allows for faster implementations thanks to directly operating on a single natural number representing the current information.<ref name=PCS2015>[https://ieeexplore.ieee.org/document/7170048/ J. Duda, K. Tahboub, N. J. Gadil, E. J. Delp, ''The use of asymmetric numeral systems as an accurate replacement for Huffman coding''], Picture Coding Symposium, 2015.</ref>

[[File:Arithmetic coding example.svg|400px|thumb|right|An arithmetic coding example assuming a fixed probability distribution of three symbols "A", "B", and "C". The probability of "A" is 50%, the probability of "B" is 33%, and the probability of "C" is 17%. Furthermore, we assume that the recursion depth is known in each step. In step one we code "B", which lies inside the interval [0.5, 0.83): the binary number "0.10''x''" is the shortest code that represents an interval entirely inside [0.5, 0.83). "''x''" means an arbitrary bit sequence. There are two extreme cases: the smallest ''x'' stands for zero, which represents the left side of the represented interval; then the left side of the interval is dec(0.10) = 0.5. At the other extreme, ''x'' stands for a finite sequence of ones, which has the upper limit dec(0.11) = 0.75. Therefore, "0.10''x''" represents the interval [0.5, 0.75), which is inside [0.5, 0.83). Now we can leave out the "0." part, since all intervals begin with "0.", and we can ignore the "''x''" part, because no matter what bit sequence it represents, we will stay inside [0.5, 0.75).]]
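The interval narrowing described in the figure caption can be sketched in a few lines of Python. This is a simplified illustration under the figure's fixed model (P(A) = 0.50, P(B) = 0.33, P(C) = 0.17), not a production arithmetic coder: the `encode` and `shortest_binary_code` helper names are chosen here for illustration, and real implementations use fixed-precision integer arithmetic with renormalization rather than floats.

```python
# A minimal sketch of arithmetic coding's interval narrowing, assuming the
# fixed three-symbol model from the figure. Illustrative only: real coders
# use integer ranges and bit renormalization, not floating point.
import math

def encode(message, probs):
    """Narrow [0.0, 1.0) once per symbol; the final interval encodes the message."""
    low, high = 0.0, 1.0
    for symbol in message:
        width = high - low
        cum = 0.0
        for s, p in probs.items():  # dict order fixes the sub-interval layout
            if s == symbol:
                low, high = low + cum * width, low + (cum + p) * width
                break
            cum += p
        else:
            raise ValueError(f"symbol {symbol!r} not in model")
    return low, high

def shortest_binary_code(low, high):
    """Shortest bit string b such that the dyadic interval
    [0.b, 0.b + 2**-len(b)) lies entirely inside [low, high)."""
    for k in range(1, 64):
        step = 2.0 ** -k
        candidate = math.ceil(low / step) * step  # smallest k-bit fraction >= low
        if candidate + step <= high:
            return format(int(round(candidate / step)), f"0{k}b")
    raise ValueError("interval too narrow for 63 bits")

probs = {"A": 0.50, "B": 0.33, "C": 0.17}
low, high = encode("B", probs)          # [0.5, 0.83), as in the figure
print(shortest_binary_code(low, high))  # prints "10", i.e. binary 0.10
```

Encoding the single symbol "B" narrows the interval to [0.5, 0.83), and the shortest dyadic interval fitting inside it is [0.5, 0.75) = [0.10, 0.11) in binary, matching the "0.10''x''" code in the figure.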