Editing LZMA (section)

===Range coding of bits===

LZMA data is at the lowest level decoded one bit at a time by the range decoder, at the direction of the LZMA decoder.

Context-based range decoding is invoked by the LZMA algorithm passing it a reference to the "context", which consists of the unsigned 11-bit variable ''prob'' (typically implemented using a 16-bit data type) representing the predicted probability of the bit being 0, which is read and updated by the range decoder (and should be initialized to {{tmath|2^{10} }}, representing 0.5 probability).

Fixed probability range decoding instead assumes a 0.5 probability, but operates slightly differently from context-based range decoding.

The range decoder state consists of two unsigned 32-bit variables, ''range'' (representing the range size), and ''code'' (representing the encoded point within the range).

Initialization of the range decoder consists of setting ''range'' to {{math|2<sup>32</sup> − 1}}, and ''code'' to the 32-bit value starting at the second byte in the stream interpreted as big-endian; the first byte in the stream is completely ignored.

Normalization proceeds in this way:
# Shift both ''range'' and ''code'' left by 8 bits
# Read a byte from the compressed stream
# Set the least significant 8 bits of ''code'' to the byte value read

Context-based range decoding of a bit using the ''prob'' probability variable proceeds in this way:
# If ''range'' is less than {{tmath|2^{24} }}, perform normalization
# Set ''bound'' to {{tmath|\lfloor range / 2^{11} \rfloor \times prob}}
# If ''code'' is less than ''bound'':
## Set ''range'' to ''bound''
## Set ''prob'' to ''prob'' + {{tmath|\lfloor 2^{11} - prob \rfloor / 2^5}}
## Return bit 0
# Otherwise (if ''code'' is greater than or equal to the ''bound''):
## Set ''range'' to ''range'' &minus; ''bound''
## Set ''code'' to ''code'' &minus; ''bound''
## Set ''prob'' to {{tmath|prob - \lfloor prob / 2^5 \rfloor }}
## Return bit 1

Fixed-probability range decoding of a bit proceeds in this way:
# If ''range'' is less than {{tmath|2^{24} }}, perform normalization
# Set ''range'' to {{tmath|\lfloor range / 2 \rfloor}}
# If ''code'' is less than ''range'':
## Return bit 0
# Otherwise (if ''code'' is greater or equal than ''range''):
## Set ''code'' to ''code'' &minus; ''range''
## Return bit 1

The Linux kernel implementation of fixed-probability decoding in <code>rc_direct()</code>, for performance reasons, does not include a conditional branch, but instead subtracts ''range'' from ''code'' unconditionally. The resulting sign bit is used to both decide the bit to return and to generate a mask that is combined with ''code'' and added to ''range''.

Note that:
# The division by {{tmath|2^{11} }} when computing ''bound'' and floor operation is done before the multiplication, not after (apparently to avoid requiring fast hardware support for 32-bit multiplication with a 64-bit result)
# Fixed probability decoding is not strictly equivalent to context-based range decoding with any ''prob'' value, due to the fact that context-based range decoding discards the lower 11 bits of ''range'' before multiplying by ''prob'' as just described, while fixed probability decoding only discards the last bit