== Properties ==
{{Expand section|date=October 2018}}

=== Monotonically decreasing function of probability ===
For a given [[probability space]], the measurement of rarer [[event (probability theory)|event]]s is intuitively more "surprising", and yields more information content, than the measurement of more common values. Thus, self-information is a [[Monotonic function|strictly decreasing monotonic function]] of the probability, sometimes called an "antitonic" function.

While standard probabilities are represented by real numbers in the interval <math>[0, 1]</math>, self-informations are represented by [[extended real number]]s in the interval <math>[0, \infty]</math>. In particular, we have the following, for any choice of logarithmic base:
* If a particular event has a 100% probability of occurring, then its self-information is <math>-\log(1) = 0</math>: its occurrence is "perfectly non-surprising" and yields no information.
* If a particular event has a 0% probability of occurring, then its self-information is <math>-\log(0) = \infty</math>: its occurrence is "infinitely surprising".

From this, we can get a few general properties:
* Intuitively, more information is gained from observing an unexpected event; it is "surprising".
** For example, if there is a [[wikt:one in a million|one-in-a-million]] chance of Alice winning the [[lottery]], her friend Bob will gain significantly more information from learning that she [[Winning the lottery|won]] than from learning that she lost on a given day. (See also ''[[Lottery mathematics]]''.)
* This establishes an implicit relationship between the self-information of a [[random variable]] and its [[variance]].

=== Relationship to log-odds ===
The Shannon information is closely related to the [[log-odds]]. In particular, given some event <math>x</math>, suppose that <math>p(x)</math> is the probability of <math>x</math> occurring, and that <math>p(\lnot x) = 1-p(x)</math> is the probability of <math>x</math> not occurring. Then we have the following definition of the log-odds:
<math display="block">\text{log-odds}(x) = \log\left(\frac{p(x)}{p(\lnot x)}\right)</math>

This can be expressed as a difference of two Shannon informations:
<math display="block">\text{log-odds}(x) = \mathrm{I}(\lnot x) - \mathrm{I}(x)</math>

In other words, the log-odds can be interpreted as the level of surprise when the event ''doesn't'' happen, minus the level of surprise when the event ''does'' happen.
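The following minimal Python sketch (illustrative only, not part of the formal treatment; the helper names <code>self_information</code> and <code>log_odds</code> are ad hoc, and a base-2 logarithm is assumed, so values are in bits) evaluates these quantities numerically:

<syntaxhighlight lang="python">
import math

def self_information(p: float, base: float = 2) -> float:
    """Self-information -log_b(p) of an event with probability p (bits for base 2)."""
    if p == 0:
        return math.inf          # a zero-probability event is "infinitely surprising"
    return -math.log(p, base)

def log_odds(p: float, base: float = 2) -> float:
    """log-odds(x) = I(not x) - I(x) = log_b(p / (1 - p))."""
    return self_information(1 - p, base) - self_information(p, base)

print(self_information(1.0))    # -0.0 -> zero bits: a certain event is perfectly non-surprising
print(self_information(0.0))    # inf: an impossible event would be infinitely surprising
print(self_information(1e-6))   # ~19.93 bits: the one-in-a-million lottery win
print(log_odds(0.5))            # 0.0: even odds, equal surprise either way
</syntaxhighlight>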
=== Additivity of independent events ===
The information content of two [[independent events]] is the sum of each event's information content. This property is known as [[Additive map|additivity]] in mathematics, and [[sigma additivity]] in particular in [[Measure (mathematics)|measure]] and probability theory.

Consider two [[independent random variables]] <math display="inline">X,\, Y</math> with [[probability mass function]]s <math>p_X(x)</math> and <math>p_Y(y)</math> respectively. The [[joint probability mass function]] is
<math display="block"> p_{X, Y}\!\left(x, y\right) = \Pr(X = x,\, Y = y) = p_X\!(x)\,p_Y\!(y) </math>
because <math display="inline">X</math> and <math display="inline">Y</math> are [[Independence (probability theory)|independent]]. The information content of the [[Outcome (probability)|outcome]] <math> (X, Y) = (x, y)</math> is
<math display="block"> \begin{align} \operatorname{I}_{X,Y}(x, y) &= -\log_2\left[p_{X,Y}(x, y)\right] = -\log_2 \left[p_X\!(x)p_Y\!(y)\right] \\[5pt] &= -\log_2 \left[p_X{(x)}\right] -\log_2 \left[p_Y{(y)}\right] \\[5pt] &= \operatorname{I}_X(x) + \operatorname{I}_Y(y) \end{align} </math>
See ''{{Section link||Two independent, identically distributed dice|nopage=y}}'' below for an example.

The corresponding property for [[likelihood]]s is that the [[log-likelihood]] of independent events is the sum of the log-likelihoods of each event. Interpreting log-likelihood as "support" or negative surprisal (the degree to which an event supports a given model: a model is supported by an event to the extent that the event is unsurprising, given the model), this states that independent events add support: the information that the two events together provide for statistical inference is the sum of their independent information.
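As a numerical check of this additivity (again an illustrative Python sketch; the probabilities of 1/6 are hypothetical example values, anticipating the two-dice example below):

<syntaxhighlight lang="python">
import math

def self_information(p: float) -> float:
    """Self-information -log2(p), in bits (shannons)."""
    return -math.log2(p)

# Example marginal probabilities for two independent outcomes, e.g. two fair dice.
p_x = 1 / 6      # Pr(X = x)
p_y = 1 / 6      # Pr(Y = y)

# Independence: the joint probability is the product of the marginals,
# so the surprisal of the joint outcome is the sum of the individual surprisals.
i_joint = self_information(p_x * p_y)
i_sum = self_information(p_x) + self_information(p_y)

print(i_joint, i_sum)                # both ~5.17 bits
print(math.isclose(i_joint, i_sum))  # True
</syntaxhighlight>

Because the base-2 logarithm turns the product of independent probabilities into a sum, the two computations agree up to floating-point rounding.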