Shannon's source coding theorem
== Statements ==
''Source coding'' is a mapping from (a sequence of) symbols from an information [[Information theory#Source theory|source]] to a sequence of alphabet symbols (usually bits) such that the source symbols can be exactly recovered from the binary bits (lossless source coding) or recovered within some distortion (lossy source coding). This is one approach to [[data compression]].

=== Source coding theorem ===
In information theory, the source coding theorem (Shannon 1948)<ref name="Shannon"/> informally states that (MacKay 2003, pg. 81,<ref name="MacKay"/> Cover 2006, Chapter 5<ref name="Cover"/>):

<blockquote>{{mvar|N}} [[Independent and identically distributed random variables|i.i.d.]] random variables each with entropy {{math|''H''(''X'')}} can be compressed into more than {{math|''N H''(''X'')}} [[bit]]s with negligible risk of information loss, as {{math|''N'' → ∞}}; but conversely, if they are compressed into fewer than {{math|''N H''(''X'')}} bits it is virtually certain that information will be lost.</blockquote>

The <math>NH(X)</math>-bit coded sequence represents the compressed message in a one-to-one (biunivocal) way, under the assumption that the decoder knows the source. In practice, this assumption does not always hold. Consequently, when entropy encoding is applied, the transmitted message has length <math>NH(X) + (\text{source information})</math>. Usually, the information that characterizes the source is inserted at the beginning of the transmitted message.

=== Source coding theorem for symbol codes ===
Let {{math|Σ<sub>1</sub>, Σ<sub>2</sub>}} denote two finite alphabets and let {{math|Σ{{su|b=1|p=∗}}}} and {{math|Σ{{su|b=2|p=∗}}}} denote the [[Kleene star|set of all finite words]] from those alphabets (respectively).
Suppose that {{mvar|X}} is a random variable taking values in {{math|Σ<sub>1</sub>}} and let {{math| ''f'' }} be a [[Variable-length code#Uniquely decodable codes|uniquely decodable]] code from {{math|Σ{{su|b=1|p=∗}}}} to {{math|Σ{{su|b=2|p=∗}}}} where {{math|{{!}}Σ<sub>2</sub>{{!}} {{=}} ''a''}}. Let {{mvar|S}} denote the random variable given by the length of codeword {{math| ''f'' (''X'')}}. If {{math| ''f'' }} is optimal in the sense that it has the minimal expected word length for {{mvar|X}}, then (Shannon 1948):
:<math> \frac{H(X)}{\log_2 a} \leq \mathbb{E}[S] < \frac{H(X)}{\log_2 a} +1 </math>
where <math>\mathbb{E}</math> denotes the [[expected value]] operator.
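
The bound for symbol codes can be checked numerically in the binary case ({{math|''a'' {{=}} 2}}). The following is a minimal sketch, not taken from the cited sources: it assumes a binary [[Huffman coding|Huffman code]] as the optimal code, and the function names <code>entropy_bits</code> and <code>huffman_lengths</code> as well as the two example distributions are illustrative choices.

```python
import heapq
import math


def entropy_bits(probs):
    """Shannon entropy H(X) in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def huffman_lengths(probs):
    """Codeword lengths of a binary Huffman code (the case a = 2)."""
    if len(probs) == 1:
        return [1]  # a lone symbol still needs one code symbol
    # Heap entries: (subtree probability, tie-breaker, symbol indices in subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for i in s1 + s2:
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, tie, s1 + s2))
        tie += 1
    return lengths


def expected_length(probs, lengths):
    """Expected codeword length E[S]."""
    return sum(p * n for p, n in zip(probs, lengths))


# Dyadic probabilities: the lower bound is met with equality.
dyadic = [0.5, 0.25, 0.125, 0.125]
dyadic_lengths = huffman_lengths(dyadic)
dyadic_H = entropy_bits(dyadic)
dyadic_ES = expected_length(dyadic, dyadic_lengths)

# Non-dyadic probabilities: E[S] lies strictly between H(X) and H(X) + 1.
skewed = [0.4, 0.3, 0.2, 0.1]
skewed_lengths = huffman_lengths(skewed)
skewed_H = entropy_bits(skewed)
skewed_ES = expected_length(skewed, skewed_lengths)

for name, H, ES in [("dyadic", dyadic_H, dyadic_ES), ("skewed", skewed_H, skewed_ES)]:
    print(f"{name}: H(X) = {H:.4f}, E[S] = {ES:.4f}, bound holds: {H <= ES < H + 1}")
```

For the dyadic distribution the expected length equals the entropy, so the lower bound is tight. For the second distribution the expected length exceeds the entropy but stays below {{math|''H''(''X'') + 1}}.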