Typical set
==(Weakly) typical sequences (weak typicality, entropy typicality)==
If a sequence ''x''<sub>1</sub>, ..., ''x''<sub>''n''</sub> is drawn from an [[Independent identically-distributed random variables|independent identically-distributed]] (i.i.d.) source ''X'' defined over a finite alphabet <math>\mathcal{X}</math>, then the typical set <math>A_\varepsilon^{(n)} \subseteq \mathcal{X}^{(n)}</math> is defined as the set of those sequences which satisfy:
:<math> 2^{-n( H(X)+\varepsilon)} \leqslant p(x_1, x_2, \dots , x_n) \leqslant 2^{-n( H(X)-\varepsilon)} </math>
where
:<math> H(X) = - \sum_{x \in \mathcal{X}}p(x)\log_2 p(x) </math>
is the information entropy of ''X''. The probability above need only be within a factor of 2<sup>''nε''</sup> of 2<sup>−''nH''(''X'')</sup>. Taking the logarithm on all sides and dividing by −''n'', this definition can be equivalently stated as
:<math> H(X) - \varepsilon \leq -\frac{1}{n}\log_2 p(x_1, x_2, \ldots, x_n) \leq H(X) + \varepsilon.</math>
For an i.i.d. sequence, since
:<math>p(x_1, x_2, \ldots, x_n) = \prod_{i=1}^n p(x_i),</math>
we further have
:<math> H(X) - \varepsilon \leq -\frac{1}{n} \sum_{i=1}^n \log_2 p(x_i) \leq H(X) + \varepsilon.</math>
The terms −log<sub>2</sub> ''p''(''x''<sub>''i''</sub>) are i.i.d. with mean ''H''(''X''), so by the law of large numbers, as ''n'' grows,
:<math>-\frac{1}{n} \sum_{i=1}^n \log_2 p(x_i) \rightarrow H(X)</math>
in probability.

===Properties===
An essential characteristic of the typical set is that, if one draws a large number ''n'' of independent random samples from the distribution ''X'', the resulting sequence (''x''<sub>1</sub>, ''x''<sub>2</sub>, ..., ''x''<sub>''n''</sub>) is very likely to be a member of the typical set, even though the typical set comprises only a small fraction of all the possible sequences. Formally, given any <math>\varepsilon>0</math>, one can choose ''n'' such that:
#The probability that a sequence drawn from ''X''<sup>(''n'')</sup> belongs to ''A''<sub>''ε''</sub><sup>(''n'')</sup> is at least 1 − ''ε'', i.e. <math>\Pr[x^{(n)} \in A_\varepsilon^{(n)}] \geq 1 - \varepsilon </math>
#<math>\left| {A_\varepsilon}^{(n)} \right| \leqslant 2^{n(H(X)+\varepsilon)}</math>
#<math>\left| {A_\varepsilon}^{(n)} \right| \geqslant (1-\varepsilon)2^{n(H(X)-\varepsilon)}</math>
#If the distribution over <math>\mathcal{X}</math> is not uniform, then the fraction of sequences that are typical is
::<math>\frac{|A_\varepsilon^{(n)}|}{|\mathcal{X}^{(n)}|} \approx \frac{2^{nH(X)}}{2^{n\log_2|\mathcal{X}|}} = 2^{-n(\log_2|\mathcal{X}|-H(X))} \rightarrow 0 </math>
::as ''n'' becomes very large, since <math>H(X) < \log_2|\mathcal{X}|,</math> where <math>|\mathcal{X}|</math> is the [[cardinality]] of <math>\mathcal{X}</math>.

For a general stochastic process {''X''(''t'')} with AEP, the (weakly) typical set can be defined similarly with ''p''(''x''<sub>1</sub>, ''x''<sub>2</sub>, ..., ''x''<sub>''n''</sub>) replaced by ''p''(''x''<sub>0</sub><sup>''τ''</sup>) (i.e. the probability of the sample limited to the time interval [0, ''τ'']), ''n'' being the [[degrees of freedom (physics and chemistry)|degree of freedom]] of the process in the time interval and ''H''(''X'') being the [[entropy rate]]. If the process is continuous valued, [[differential entropy]] is used instead.

===Example===
Counter-intuitively, the most likely sequence is often not a member of the typical set. For example, suppose that ''X'' is an i.i.d. [[Bernoulli_distribution|Bernoulli random variable]] with ''p''(0) = 0.1 and ''p''(1) = 0.9. In ''n'' independent trials, since ''p''(1) > ''p''(0), the most likely sequence of outcomes is the sequence of all 1's, (1,1,...,1). Here the entropy of ''X'' is ''H''(''X'') = 0.469 bits, while
:<math> -\frac{1}{n}\log_2 p\left(x^{(n)}=(1,1,\ldots,1)\right) = -\frac{1}{n}\log_2 (0.9^n) = 0.152</math>
This value does not depend on ''n'' and differs from ''H''(''X'') by 0.317. So for any ''ε'' < 0.317 this sequence is not in the typical set: its average logarithmic probability cannot come arbitrarily close to the entropy of the random variable ''X'' no matter how large we take the value of ''n''.
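The all-ones calculation above can be checked numerically. The following is a minimal sketch, not a reference implementation: the function names (<code>entropy_bits</code>, <code>empirical_rate</code>, <code>is_weakly_typical</code>), the sequence length 50 and the tolerance ''ε'' = 0.1 are illustrative assumptions that do not appear in the text.

```python
import math

def entropy_bits(pmf):
    """Entropy H(X) in bits of a distribution given as {symbol: probability}."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

def empirical_rate(seq, pmf):
    """-(1/n) log2 p(x_1, ..., x_n) for an i.i.d. source with distribution pmf."""
    return -sum(math.log2(pmf[x]) for x in seq) / len(seq)

def is_weakly_typical(seq, pmf, eps):
    """Check H(X) - eps <= -(1/n) log2 p(seq) <= H(X) + eps."""
    return abs(empirical_rate(seq, pmf) - entropy_bits(pmf)) <= eps

# Bernoulli source from the example: p(0) = 0.1, p(1) = 0.9.
pmf = {0: 0.1, 1: 0.9}
all_ones = [1] * 50          # length 50 is an arbitrary choice
one_in_ten = [1] * 45 + [0] * 5  # 10% zeros, matching p(0)

print(round(entropy_bits(pmf), 3))              # 0.469
print(round(empirical_rate(all_ones, pmf), 3))  # 0.152
print(is_weakly_typical(all_ones, pmf, eps=0.1))    # False
print(is_weakly_typical(one_in_ten, pmf, eps=0.1))  # True
```

Changing the length of <code>all_ones</code> leaves its rate at 0.152, which is why no choice of ''n'' brings the sequence into the typical set for small ''ε''.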
For Bernoulli random variables, the typical set consists of the sequences in which the numbers of 0s and 1s are close to their expected values in ''n'' independent trials. This is easily demonstrated: if ''p''(1) = ''p'' and ''p''(0) = 1 − ''p'', then for ''n'' trials with ''m'' 1's, we have
:<math> -\frac{1}{n} \log_2 p(x^{(n)}) = -\frac{1}{n} \log_2 p^m (1-p)^{n-m} = -\frac{m}{n} \log_2 p - \left( \frac{n-m}{n} \right) \log_2 (1-p).</math>
The expected number of 1's in a sequence of ''n'' Bernoulli trials is ''m'' = ''np''. Substituting this value gives
:<math> -\frac{1}{n} \log_2 p(x^{(n)}) = - p \log_2 p - (1-p) \log_2 (1-p) = H(X).</math>
For the example with ''p''(1) = 0.9, if ''n'' = 10 and ''ε'' < 0.317, then the typical set consists of all sequences that have a single 0 in the entire sequence (changing ''m'' by one shifts the left-hand side by (log<sub>2</sub> 9)/10 ≈ 0.317). If ''p''(0) = ''p''(1) = 0.5, then every possible binary sequence belongs to the typical set.
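Because the probability of a Bernoulli sequence depends only on its number of 1's, the typical set can be explored by looping over ''m'' instead of enumerating all 2<sup>''n''</sup> sequences. The sketch below does this under stated assumptions: the helper names, the tolerance ''ε'' = 0.1 and the sample values of ''n'' are illustrative choices, and ''p''(1) is assumed to lie strictly between 0 and 1.

```python
import math

def entropy_bits(p1):
    """Entropy in bits of a Bernoulli source with P(1) = p1."""
    return -sum(q * math.log2(q) for q in (p1, 1.0 - p1) if q > 0)

def typical_counts(n, p1, eps):
    """Values of m (number of 1s) whose length-n sequences are weakly typical."""
    h = entropy_bits(p1)
    counts = []
    for m in range(n + 1):
        # -(1/n) log2 of p1^m (1 - p1)^(n - m)
        rate = -(m * math.log2(p1) + (n - m) * math.log2(1.0 - p1)) / n
        if abs(rate - h) <= eps:
            counts.append(m)
    return counts

def typical_set_size(n, p1, eps):
    """Number of typical sequences: C(n, m) sequences for each typical m."""
    return sum(math.comb(n, m) for m in typical_counts(n, p1, eps))

def typical_set_probability(n, p1, eps):
    """Total probability of the typical set, summed in log space to avoid underflow."""
    total = 0.0
    for m in typical_counts(n, p1, eps):
        log_term = (math.lgamma(n + 1) - math.lgamma(m + 1) - math.lgamma(n - m + 1)
                    + m * math.log(p1) + (n - m) * math.log(1.0 - p1))
        total += math.exp(log_term)
    return total

if __name__ == "__main__":
    p1, eps = 0.9, 0.1  # source from the example; eps is an assumed tolerance
    for n in (10, 100, 1000):
        size = typical_set_size(n, p1, eps)
        fraction = size / 2 ** n
        prob = typical_set_probability(n, p1, eps)
        print(n, typical_counts(n, p1, eps)[:3], fraction, prob)
```

For ''n'' = 10 only ''m'' = 9 qualifies, giving 10 typical sequences with total probability 0.9<sup>9</sup> ≈ 0.387. This is still below 1 − ''ε'', which illustrates that the first property only holds once ''n'' is large enough. As ''n'' grows, the printed probability rises toward 1 while the fraction of typical sequences shrinks.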