==Compressed format overview==
In LZMA compression, the compressed stream is a stream of bits, encoded using an adaptive binary range coder. The stream is divided into packets, each packet describing either a single byte, or an LZ77 sequence with its length and distance implicitly or explicitly encoded. Each part of each packet is modeled with independent contexts, so the probability predictions for each bit are correlated with the values of that bit (and related bits from the same field) in previous packets of the same type. Both the lzip<ref name=lzip-format>{{cite web |title=Lzip Stream Format |url=https://www.nongnu.org/lzip/manual/lzip_manual.html#Stream-format |website=Lzip Manual |access-date=14 November 2019}}</ref> and the LZMA SDK documentation describe this stream format.<ref name=lzma-sdk-format/>

There are 7 types of packets:<ref name=lzip-format/>
{{table alignment}}
{| class="wikitable col1right"
|-
! Packed code (bit sequence)
! Packet name
! Packet description
|-
| 0 + byteCode
| LIT
| A single byte encoded using an adaptive binary range coder.
|-
| 1+0 + len + dist
| MATCH
| A typical LZ77 sequence describing sequence length and distance.
|-
| 1+1+0+0
| SHORTREP
| A one-byte LZ77 sequence. Distance is equal to the last used LZ77 distance.
|-
| 1+1+0+1 + len
| LONGREP[0]
| An LZ77 sequence. Distance is equal to the last used LZ77 distance.
|-
| 1+1+1+0 + len
| LONGREP[1]
| An LZ77 sequence. Distance is equal to the second last used LZ77 distance.
|-
| 1+1+1+1+0 + len
| LONGREP[2]
| An LZ77 sequence. Distance is equal to the third last used LZ77 distance.
|-
| 1+1+1+1+1 + len
| LONGREP[3]
| An LZ77 sequence. Distance is equal to the fourth last used LZ77 distance.
|}
LONGREP[*] refers to LONGREP[0–3] packets, *REP refers to both LONGREP and SHORTREP, and *MATCH refers to both MATCH and *REP.
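The packet-type prefix in the table above behaves like a small decision tree. The following is a minimal sketch in Python, assuming the bits have already been produced by the range decoder; the function name <code>read_packet_type</code> and the use of a plain iterator of 0/1 values are illustrative assumptions and are not taken from the LZMA SDK or lzip.

```python
from typing import Iterator


def read_packet_type(bits: Iterator[int]) -> str:
    """Return the packet name selected by the leading bits of a packet.

    `bits` is a hypothetical stand-in for the range decoder: each next()
    yields one already-decoded bit (0 or 1). Only the type prefix is
    consumed; len, dist and byteCode fields are left in the iterator.
    """
    if next(bits) == 0:
        return "LIT"            # 0 + byteCode
    if next(bits) == 0:
        return "MATCH"          # 1+0 + len + dist
    if next(bits) == 0:
        # 1+1+0+0 is SHORTREP, 1+1+0+1 is LONGREP[0]
        return "SHORTREP" if next(bits) == 0 else "LONGREP[0]"
    if next(bits) == 0:
        return "LONGREP[1]"     # 1+1+1+0 + len
    # 1+1+1+1+0 is LONGREP[2], 1+1+1+1+1 is LONGREP[3]
    return "LONGREP[2]" if next(bits) == 0 else "LONGREP[3]"
```

In a real decoder each of these bits is decoded with its own adaptive probability context, which this sketch leaves out.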
LONGREP[n] packets remove the distance used from the list of the most recent distances and reinsert it at the front, to avoid useless repeated entries. MATCH just adds the distance to the front even if it is already present in the list, while SHORTREP and LONGREP[0] do not alter the list.

The length is encoded as follows:
{| class="wikitable col1right"
|-
! Length code (bit sequence)
! Description
|-
| 0 + 3 bits
| The length is encoded using 3 bits, giving lengths from 2 to 9.
|-
| 1+0 + 3 bits
| The length is encoded using 3 bits, giving lengths from 10 to 17.
|-
| 1+1 + 8 bits
| The length is encoded using 8 bits, giving lengths from 18 to 273.
|}
As in LZ77, the length is not limited by the distance, because copying from the dictionary is defined as if the copy were performed byte by byte, keeping the distance constant.

Distances are logically 32-bit, and distance 0 points to the most recently added byte in the dictionary.

The distance encoding starts with a 6-bit "distance slot", which determines how many further bits are needed. Distances are decoded as a binary concatenation of, from most to least significant, two bits depending on the distance slot, some bits encoded with fixed 0.5 probability, and some context-encoded bits, according to the following table (distance slots 0–3 directly encode distances 0–3).
{| class="wikitable defaultright"
|+ Distance encoding<ref name=lzma-sdk-format>{{cite web |title=LZMA Specification.7z in LZMA SDK |url=https://www.7-zip.org/sdk.html |website=7-zip.org}}</ref>
|-
! 6-bit distance slot
! Highest 2 bits
! Number of fixed 0.5 probability bits
! Number of context-encoded bits
|-
| 0 || 00 || 0 || 0
|-
| 1 || 01 || 0 || 0
|-
| 2 || 10 || 0 || 0
|-
| 3 || 11 || 0 || 0
|-
| 4 || 10 || 0 || 1
|-
| 5 || 11 || 0 || 1
|-
| 6 || 10 || 0 || 2
|-
| 7 || 11 || 0 || 2
|-
| 8 || 10 || 0 || 3
|-
| 9 || 11 || 0 || 3
|-
| 10 || 10 || 0 || 4
|-
| 11 || 11 || 0 || 4
|-
| 12 || 10 || 0 || 5
|-
| 13 || 11 || 0 || 5
|-
| 14–62 (even) || 10 || slot / 2 − 5 || 4
|-
| 15–63 (odd) || 11 || (slot − 1) / 2 − 5 || 4
|}
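The length and distance tables, together with the byte-by-byte copy rule, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the reference decoder: all function names are hypothetical, the field values are assumed to have been range-decoded and assembled into integers already, and the order in which individual bits are coded is not modeled.

```python
from typing import Tuple


def length_from_fields(selector: int, value: int) -> int:
    """Map a length code to a match length.

    selector: 0 for prefix '0', 1 for '1+0', 2 for '1+1' (assumed encoding
    of the prefix as an integer). value: the 3- or 8-bit field that follows.
    """
    bases = (2, 10, 18)
    widths = (3, 3, 8)
    if not 0 <= value < (1 << widths[selector]):
        raise ValueError("value does not fit the field width")
    return bases[selector] + value


def distance_layout(slot: int) -> Tuple[int, int, int]:
    """Return (highest two bits, fixed-probability bits, context bits)."""
    if not 0 <= slot <= 63:
        raise ValueError("distance slot must be in 0..63")
    if slot < 4:
        return slot, 0, 0                # slots 0-3 are the distance itself
    extra = (slot >> 1) - 1              # total number of bits after the top two
    top = 2 | (slot & 1)                 # binary 10 for even slots, 11 for odd
    context = extra if slot < 14 else 4
    return top, extra - context, context


def distance_from_slot(slot: int, low_bits: int) -> int:
    """Concatenate the top two bits with the already-decoded lower bits."""
    top, fixed, context = distance_layout(slot)
    extra = fixed + context
    if not 0 <= low_bits < (1 << extra):
        raise ValueError("low_bits does not fit the slot")
    return (top << extra) | low_bits


def copy_match(out: bytearray, distance: int, length: int) -> None:
    """Copy byte by byte; distance 0 refers to the most recent byte."""
    for _ in range(length):
        out.append(out[len(out) - 1 - distance])
```

Because <code>copy_match</code> appends one byte at a time, a length greater than distance + 1 simply repeats the most recent bytes, which is why the length is not limited by the distance.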