Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
LZMA
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
===Range coding of bits=== LZMA data is at the lowest level decoded one bit at a time by the range decoder, at the direction of the LZMA decoder. Context-based range decoding is invoked by the LZMA algorithm passing it a reference to the "context", which consists of the unsigned 11-bit variable ''prob'' (typically implemented using a 16-bit data type) representing the predicted probability of the bit being 0, which is read and updated by the range decoder (and should be initialized to {{tmath|2^{10} }}, representing 0.5 probability). Fixed probability range decoding instead assumes a 0.5 probability, but operates slightly differently from context-based range decoding. The range decoder state consists of two unsigned 32-bit variables, ''range'' (representing the range size), and ''code'' (representing the encoded point within the range). Initialization of the range decoder consists of setting ''range'' to {{math|2<sup>32</sup> β 1}}, and ''code'' to the 32-bit value starting at the second byte in the stream interpreted as big-endian; the first byte in the stream is completely ignored. Normalization proceeds in this way: # Shift both ''range'' and ''code'' left by 8 bits # Read a byte from the compressed stream # Set the least significant 8 bits of ''code'' to the byte value read Context-based range decoding of a bit using the ''prob'' probability variable proceeds in this way: # If ''range'' is less than {{tmath|2^{24} }}, perform normalization # Set ''bound'' to {{tmath|\lfloor range / 2^{11} \rfloor \times prob}} # If ''code'' is less than ''bound'': ## Set ''range'' to ''bound'' ## Set ''prob'' to ''prob'' + {{tmath|\lfloor 2^{11} - prob \rfloor / 2^5}} ## Return bit 0 # Otherwise (if ''code'' is greater than or equal to the ''bound''): ## Set ''range'' to ''range'' − ''bound'' ## Set ''code'' to ''code'' − ''bound'' ## Set ''prob'' to {{tmath|prob - \lfloor prob / 2^5 \rfloor }} ## Return bit 1 Fixed-probability range decoding of a bit proceeds in this way: # If ''range'' is less than {{tmath|2^{24} }}, perform normalization # Set ''range'' to {{tmath|\lfloor range / 2 \rfloor}} # If ''code'' is less than ''range'': ## Return bit 0 # Otherwise (if ''code'' is greater or equal than ''range''): ## Set ''code'' to ''code'' − ''range'' ## Return bit 1 The Linux kernel implementation of fixed-probability decoding in <code>rc_direct()</code>, for performance reasons, does not include a conditional branch, but instead subtracts ''range'' from ''code'' unconditionally. The resulting sign bit is used to both decide the bit to return and to generate a mask that is combined with ''code'' and added to ''range''. Note that: # The division by {{tmath|2^{11} }} when computing ''bound'' and floor operation is done before the multiplication, not after (apparently to avoid requiring fast hardware support for 32-bit multiplication with a 64-bit result) # Fixed probability decoding is not strictly equivalent to context-based range decoding with any ''prob'' value, due to the fact that context-based range decoding discards the lower 11 bits of ''range'' before multiplying by ''prob'' as just described, while fixed probability decoding only discards the last bit
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)