Shannon's source coding theorem
{{short description|Establishes the limits to possible data compression}}
{{Information theory}}
{{about|the theory of source coding in data compression|the term in computer programming|Source code}}

In [[information theory]], '''Shannon's source coding theorem''' (or '''noiseless coding theorem''') establishes the statistical limits to possible [[data compression]] for data whose source is an [[independent identically-distributed random variables|independent identically-distributed random variable]], and the operational meaning of the [[Shannon entropy]].

Named after [[Claude Shannon]], the '''source coding theorem''' shows that, in the limit as the length of a stream of [[independent and identically distributed random variables|independent and identically distributed (i.i.d.)]] data tends to infinity, it is impossible to compress such data so that the code rate (the average number of bits per symbol) is less than the Shannon entropy of the source without it being virtually certain that information will be lost. However, it is possible to get the code rate arbitrarily close to the Shannon entropy with negligible probability of loss.

The '''source coding theorem for symbol codes''' places an upper and a lower bound on the minimal possible expected length of codewords as a function of the [[Entropy (information theory)|entropy]] of the input word (which is viewed as a [[random variable]]) and of the size of the target alphabet.

Note that, for data that exhibit more dependencies (that is, whose source is not an i.i.d. random variable), the [[Kolmogorov complexity]], which quantifies the minimal description length of an object, is more suitable for describing the limits of data compression. Shannon entropy takes into account only frequency regularities, while Kolmogorov complexity takes into account all algorithmic regularities, so in general the latter is smaller. On the other hand, if an object is generated by a random process in such a way that it has only frequency regularities, entropy is close to complexity with high probability (Shen et al. 2017).<ref name="Shen2017"/>
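
The bounds for symbol codes can be checked numerically. The following is a minimal illustrative sketch, not part of Shannon's proof: it assumes a binary target alphabet and uses a [[Huffman coding|Huffman code]] as one example of an optimal symbol code. The function names <code>shannon_entropy</code> and <code>huffman_code_lengths</code> and the two example distributions are hypothetical choices made for this illustration. For a binary alphabet, the expected codeword length ''L'' of an optimal code satisfies ''H'' ≤ ''L'' < ''H'' + 1, where ''H'' is the entropy in bits.

```python
import heapq
import math


def shannon_entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def huffman_code_lengths(probs):
    """Codeword lengths of a binary Huffman code (illustrative helper).

    Each heap entry is (probability, tie_breaker, symbol indices in subtree).
    Merging two subtrees adds one bit to every symbol they contain.
    """
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie_breaker = len(probs)
    while len(heap) > 1:
        p1, _, symbols1 = heapq.heappop(heap)
        p2, _, symbols2 = heapq.heappop(heap)
        for s in symbols1 + symbols2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, tie_breaker, symbols1 + symbols2))
        tie_breaker += 1
    return lengths


def expected_length(probs, lengths):
    """Expected codeword length in bits per source symbol."""
    return sum(p * n for p, n in zip(probs, lengths))


# Assumed example distributions: one dyadic, one not.
for probs in ([0.5, 0.25, 0.125, 0.125], [0.4, 0.3, 0.2, 0.1]):
    H = shannon_entropy(probs)
    L = expected_length(probs, huffman_code_lengths(probs))
    print(f"H = {H:.4f} bits, L = {L:.4f} bits, H <= L < H + 1: {H <= L + 1e-12 < H + 1}")
```

For the dyadic distribution (all probabilities are powers of 1/2) the expected length equals the entropy, 1.75 bits per symbol. For the second distribution the expected length lies strictly between ''H'' and ''H'' + 1.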