==Example==
[[File:Binary entropy plot.svg|thumbnail|right|200px|Entropy {{math|Η(''X'')}} (i.e. the [[expected value|expected]] [[surprisal]]) of a coin flip, measured in bits, graphed versus the bias of the coin {{math|1=Pr(''X'' = 1)}}, where {{math|1=''X'' = 1}} represents a result of heads.<ref name=cover1991/>{{rp|p=14–15}}<br /><br />Here, the entropy is at most 1 bit, and to communicate the outcome of a coin flip (2 possible values) will require an average of at most 1 bit (exactly 1 bit for a fair coin). The result of a fair die (6 possible values) would have entropy log<sub>2</sub>6 bits.]]
{{Main|Binary entropy function|Bernoulli process}}

Consider tossing a coin with known, not necessarily fair, probabilities of coming up heads or tails; this can be modeled as a [[Bernoulli process]].

The entropy of the unknown result of the next toss of the coin is maximized if the coin is fair (that is, if heads and tails both have equal probability 1/2). This is the situation of maximum uncertainty as it is most difficult to predict the outcome of the next toss; the result of each toss of the coin delivers one full bit of information. This is because
<math display="block">\begin{align}
\Eta(X) &= -\sum_{i=1}^n {p(x_i) \log_b p(x_i)} \\
&= -\sum_{i=1}^2 {\frac{1}{2}\log_2{\frac{1}{2}}} \\
&= -\sum_{i=1}^2 {\frac{1}{2} \cdot (-1)} = 1.
\end{align}</math>

However, if we know the coin is not fair, but comes up heads or tails with probabilities {{math|''p''}} and {{math|''q''}}, where {{math|''p'' ≠ ''q''}}, then there is less uncertainty. Every time it is tossed, one side is more likely to come up than the other. The reduced uncertainty is quantified in a lower entropy: on average each toss of the coin delivers less than one full bit of information. For example, if {{math|1=''p'' = 0.7}}, then
<math display="block">\begin{align}
\Eta(X) &= - p \log_2 p - q \log_2 q \\[1ex]
&= - 0.7 \log_2 (0.7) - 0.3 \log_2 (0.3) \\[1ex]
&\approx - 0.7 \cdot (-0.515) - 0.3 \cdot (-1.737) \\[1ex]
&= 0.8816 < 1.
\end{align}</math>

Uniform probability yields maximum uncertainty and therefore maximum entropy. Entropy, then, can only decrease from the value associated with uniform probability. The extreme case is that of a double-headed coin that never comes up tails, or a double-tailed coin that never results in a head. Then there is no uncertainty. The entropy is zero: each toss of the coin delivers no new information as the outcome of each coin toss is always certain.<ref name=cover1991/>{{rp|p=14–15}}
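
The binary entropy values above can be checked numerically. The sketch below is illustrative only and is not part of the cited source; it assumes plain Python with the standard library, and the helper name <code>binary_entropy</code> is chosen just for this example.

<syntaxhighlight lang="python">
# Illustrative sketch: compute H(X) = -p*log2(p) - q*log2(q) for a coin that
# lands heads with probability p and tails with probability q = 1 - p.
# "binary_entropy" is a hypothetical helper name used only for this example.
from math import log2

def binary_entropy(p: float) -> float:
    """Entropy, in bits, of a coin that comes up heads with probability p."""
    if p in (0.0, 1.0):
        return 0.0  # a double-headed or double-tailed coin carries no uncertainty
    q = 1.0 - p
    return -p * log2(p) - q * log2(q)

print(binary_entropy(0.5))  # 1.0 bit for a fair coin (maximum entropy)
print(binary_entropy(0.7))  # about 0.881 bits; the rounded arithmetic above gives 0.8816
print(binary_entropy(1.0))  # 0.0 bits: the outcome is always certain
</syntaxhighlight>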