==Source coding==
{{main|Data compression}}

The aim of source coding is to represent the source data using as few symbols as possible.

===Definition===
Data can be seen as a [[random variable]] <math>X:\Omega\to\mathcal{X}</math>, where <math>x \in \mathcal{X}</math> appears with probability <math>\mathbb{P}[X=x]</math>.

Data are encoded by strings (words) over an [[Alphabet (computer science)|alphabet]] <math>\Sigma</math>.

A code is a function

:<math>C:\mathcal{X}\to\Sigma^*</math> (or <math>\Sigma^+</math> if the empty string is not allowed as a code word).

<math>C(x)</math> is the code word associated with <math>x</math>.

The length of the code word is written as

:<math>l(C(x)).</math>

The expected length of a code is

:<math>l(C) = \sum_{x\in\mathcal{X}}l(C(x))\mathbb{P}[X=x] .</math>

The code is extended to sequences of source symbols by concatenating code words: <math>C(x_1, \ldots, x_k) = C(x_1)C(x_2) \cdots C(x_k)</math>.

The code word of the empty string is the empty string itself:

:<math>C(\epsilon) = \epsilon</math>
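
The definitions above can be illustrated with a minimal Python sketch. The four-symbol source, its probabilities <code>P</code>, the binary code <code>C</code>, and the helper names <code>expected_length</code> and <code>encode</code> are assumptions chosen for illustration only.

```python
from fractions import Fraction

# Hypothetical source: four symbols with assumed probabilities P[X = x].
P = {"a": Fraction(1, 2), "b": Fraction(1, 4),
     "c": Fraction(1, 8), "d": Fraction(1, 8)}

# Hypothetical code C over the binary alphabet {0, 1}.
C = {"a": "0", "b": "10", "c": "110", "d": "111"}


def expected_length(code, probs):
    """Return l(C): the sum over x of l(C(x)) * P[X = x]."""
    return sum(len(code[x]) * probs[x] for x in probs)


def encode(code, symbols):
    """Extend the code to sequences by concatenating code words."""
    return "".join(code[x] for x in symbols)


print(expected_length(C, P))   # expected length in code symbols per source symbol
print(encode(C, "abad"))       # concatenation C(a)C(b)C(a)C(d)
print(repr(encode(C, "")))     # the empty sequence maps to the empty string
```

Exact fractions are used here only so that the expected length is not affected by floating-point rounding.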

===Properties===
# <math>C:\mathcal{X}\to\Sigma^*</math> is [[Variable-length code#Non-singular codes|non-singular]] if it is [[Injective function|injective]].
# <math>C</math> is [[Uniquely decodable code#Uniquely decodable codes|uniquely decodable]] if its extension <math>C:\mathcal{X}^*\to\Sigma^*</math> is injective.
# <math>C:\mathcal{X}\to\Sigma^*</math> is [[Variable-length code#Prefix codes|instantaneous]] if <math>C(x_1)</math> is not a proper prefix of <math>C(x_2)</math> for all distinct <math>x_1, x_2 \in \mathcal{X}</math>.
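
The three properties can be checked on small examples with the following Python sketch. The function names and the example codes are hypothetical. The search in <code>find_collision</code> is a bounded brute-force check: it can show that a code is not uniquely decodable, but finding no collision up to <code>max_len</code> is not a proof of unique decodability.

```python
from itertools import product


def is_non_singular(code):
    """True if distinct source symbols have distinct code words (C is injective)."""
    words = list(code.values())
    return len(set(words)) == len(words)


def has_prefix_property(code):
    """True if no code word is a proper prefix of another code word."""
    words = list(code.values())
    return not any(v != w and w.startswith(v) for v in words for w in words)


def find_collision(code, max_len):
    """Search sequences of up to max_len symbols for two that encode identically.

    Returns a pair of distinct symbol sequences with the same encoding,
    or None if no such pair exists within the bound.
    """
    seen = {}
    for k in range(max_len + 1):
        for seq in product(sorted(code), repeat=k):
            word = "".join(code[x] for x in seq)
            if word in seen:
                return seen[word], seq
            seen[word] = seq
    return None


# Hypothetical example codes over the binary alphabet.
prefix_code = {"a": "0", "b": "10", "c": "110", "d": "111"}
ambiguous_code = {"a": "0", "b": "1", "c": "01"}

print(has_prefix_property(prefix_code))    # instantaneous
print(is_non_singular(ambiguous_code))     # non-singular ...
print(find_collision(ambiguous_code, 2))   # ... but "01" decodes as c or as ab
```

The second example shows why non-singularity alone is not enough: each symbol has its own code word, yet the extension to sequences is not injective.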

===Principle===
The [[Entropy (information theory)|entropy]] of a source is a measure of its information content. Source codes try to reduce the redundancy present in the source and to represent it with fewer bits, each of which carries more information.

Data compression which explicitly tries to minimize the average length of messages according to a particular assumed probability model is called [[entropy encoding]].

Source coding schemes use various techniques to approach the limit set by the entropy of the source: ''C''(''x'') ≥ ''H''(''x''), where ''H''(''x'') is the entropy of the source (bitrate) and ''C''(''x'') is the bitrate after compression (not the code word defined above). In particular, no lossless source coding scheme can achieve an average bitrate below the entropy of the source.
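
As a rough numerical illustration of this bound, the Python sketch below compares the entropy of an assumed four-symbol source with the expected lengths of two binary codes. The probabilities, both codes, and the function names are hypothetical; the entropy is measured in bits (base-2 logarithm) to match the binary code alphabet.

```python
from math import log2


def entropy(probs):
    """Shannon entropy in bits: -sum of p * log2(p) over symbols with p > 0."""
    return -sum(p * log2(p) for p in probs.values() if p > 0)


def expected_length(code, probs):
    """Expected code word length in bits per source symbol."""
    return sum(len(code[x]) * probs[x] for x in probs)


# Hypothetical source distribution.
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}

# Two hypothetical binary codes for the same source.
fixed_code = {"a": "00", "b": "01", "c": "10", "d": "11"}
variable_code = {"a": "0", "b": "10", "c": "110", "d": "111"}

h = entropy(probs)
for name, code in [("fixed", fixed_code), ("variable", variable_code)]:
    length = expected_length(code, probs)
    print(f"{name}: expected length {length:.3f} bits, entropy {h:.3f} bits")
```

For this particular distribution, whose probabilities are all powers of 1/2, the variable-length code meets the entropy exactly; for general distributions the expected length of a uniquely decodable code stays at or above it.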

===Example===
[[FAX|Facsimile]] transmission uses a simple [[Run-length encoding|run-length code]]. Source coding removes all data superfluous to the needs of the transmitter, decreasing the bandwidth required for transmission.
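
The idea of run-length coding can be sketched in a few lines of Python. This is a generic illustration, not the coding scheme of any fax standard: the function names are hypothetical, a scan line is assumed to be a string of <code>0</code> (white) and <code>1</code> (black) pixels, and the runs are kept as plain (value, length) pairs rather than being further encoded.

```python
from itertools import groupby


def rle_encode(bits):
    """Collapse each run of identical symbols into a (symbol, run length) pair."""
    return [(value, len(list(run))) for value, run in groupby(bits)]


def rle_decode(runs):
    """Rebuild the original string by repeating each symbol by its run length."""
    return "".join(value * count for value, count in runs)


# Hypothetical scan line: long white runs with a short black run.
line = "0000011100"
runs = rle_encode(line)
print(runs)
print(rle_decode(runs) == line)
```

The scheme pays off when the source contains long runs, as in mostly white scanned pages; on data with few repeats the list of runs can be larger than the input.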