===Example===
This example shows the construction of a Shannon–Fano code for a small alphabet. There are 5 different source symbols. Suppose 39 total symbols have been observed with the following frequencies, from which we can estimate the symbol probabilities.

:{| class="wikitable" style="text-align: center;"
! Symbol
! A
! B
! C
! D
! E
|-
! Count
| 15
| 7
| 6
| 6
| 5
|-
! Probabilities
| 0.385
| 0.179
| 0.154
| 0.154
| 0.128
|}

This source has [[Entropy (information theory)|entropy]] <math>H(X) = 2.186</math> bits.

For the Shannon–Fano code, we need to calculate the desired word lengths <math>l_i = \lceil -\log_2 p_i \rceil</math>.

:{| class="wikitable" style="text-align: center;"
! Symbol
! A
! B
! C
! D
! E
|-
! Probabilities
| 0.385
| 0.179
| 0.154
| 0.154
| 0.128
|-
! <math>-\log_2 p_i</math>
| 1.379
| 2.480
| 2.700
| 2.700
| 2.963
|-
! Word lengths <math>\lceil -\log_2 p_i \rceil</math>
| 2
| 3
| 3
| 3
| 3
|}

We can pick codewords in order, choosing the lexicographically first word of the correct length that maintains the prefix-free property. Clearly A gets the codeword 00. To maintain the prefix-free property, B's codeword may not start with 00, so the lexicographically first available word of length 3 is 010. Continuing like this, we get the following code:

:{| class="wikitable" style="text-align: center;"
! Symbol
! A
! B
! C
! D
! E
|-
! Probabilities
| 0.385
| 0.179
| 0.154
| 0.154
| 0.128
|-
! Word lengths <math>\lceil -\log_2 p_i \rceil</math>
| 2
| 3
| 3
| 3
| 3
|-
! Codewords
| 00
| 010
| 011
| 100
| 101
|}

Alternatively, we can use the cumulative probability method: each symbol's codeword is the binary expansion of the cumulative probability of all preceding symbols, truncated to the desired word length.

:{| class="wikitable" style="text-align: center;"
! Symbol
! A
! B
! C
! D
! E
|-
! Probabilities
| 0.385
| 0.179
| 0.154
| 0.154
| 0.128
|-
! Cumulative probabilities
| 0.000
| 0.385
| 0.564
| 0.718
| 0.872
|-
! ...in binary
| 0.00000
| 0.01100
| 0.10010
| 0.10110
| 0.11011
|-
! Word lengths <math>\lceil -\log_2 p_i \rceil</math>
| 2
| 3
| 3
| 3
| 3
|-
! Codewords
| 00
| 011
| 100
| 101
| 110
|}

Note that although the codewords under the two methods differ, the word lengths are the same. We have lengths of 2 bits for A, and 3 bits for B, C, D and E, giving an average length of

:<math display="block">\frac{2\,\text{bits}\cdot(15) + 3\,\text{bits} \cdot (7+6+6+5)}{39\, \text{symbols}} \approx 2.62\,\text{bits per symbol,}</math>

which is within one bit of the entropy.
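Both constructions can be sketched in a few lines of Python. This is an illustrative sketch rather than code from any library: the function names <code>word_lengths</code>, <code>lexicographic_code</code> and <code>cumulative_code</code> are invented for the example, and both methods assume the probabilities are already sorted in decreasing order.

<syntaxhighlight lang="python">
from math import ceil, log2

def word_lengths(probabilities):
    """Word lengths l_i = ceil(-log2(p_i)) for each symbol probability."""
    return [ceil(-log2(p)) for p in probabilities]

def lexicographic_code(lengths):
    """First method: for each length (nondecreasing), take the
    lexicographically first word that keeps the code prefix-free."""
    codewords, candidate, prev_length = [], 0, 0
    for length in lengths:
        candidate <<= length - prev_length   # pad with zeros to new length
        codewords.append(format(candidate, f"0{length}b"))
        candidate += 1                       # next unused word of this length
        prev_length = length
    return codewords

def cumulative_code(probabilities, lengths):
    """Second method: truncate the binary expansion of each cumulative
    probability to the desired word length."""
    codewords, cumulative = [], 0.0
    for p, length in zip(probabilities, lengths):
        bits, frac = [], cumulative
        for _ in range(length):              # expand to `length` binary digits
            frac *= 2
            bits.append("1" if frac >= 1 else "0")
            frac -= int(frac)
        codewords.append("".join(bits))
        cumulative += p                      # float precision suffices here
    return codewords

counts = [15, 7, 6, 6, 5]                    # A, B, C, D, E out of 39
probs = [c / sum(counts) for c in counts]
lengths = word_lengths(probs)
print(lengths)                               # [2, 3, 3, 3, 3]
print(lexicographic_code(lengths))           # ['00', '010', '011', '100', '101']
print(cumulative_code(probs, lengths))       # ['00', '011', '100', '101', '110']
</syntaxhighlight>

The lexicographic construction relies on the word lengths being nondecreasing, which always holds when the symbols are sorted by decreasing probability as in the tables above.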