==Entropy and Fisher's Information==

===Entropy (Geometric Distribution, Failures Before Success)===
Entropy is a measure of uncertainty in a probability distribution. For the geometric distribution that models the number of failures before the first success, the probability mass function is:

:<math>P(X = k) = (1 - p)^k p, \quad k = 0, 1, 2, \dots</math>

The entropy <math>H(X)</math> for this distribution is defined as:

:<math>\begin{align}
H(X) &= - \sum_{k=0}^{\infty} P(X = k) \ln P(X = k) \\
&= - \sum_{k=0}^{\infty} (1 - p)^k p \ln \left( (1 - p)^k p \right) \\
&= - \sum_{k=0}^{\infty} (1 - p)^k p \left[ k \ln(1 - p) + \ln p \right] \\
&= -\ln p - \frac{1 - p}{p} \ln(1 - p),
\end{align}</math>

where the last step uses <math>\sum_{k=0}^{\infty} k (1 - p)^k p = \operatorname{E}[X] = \frac{1 - p}{p}</math>. The entropy increases as the probability <math>p</math> decreases, reflecting greater uncertainty as success becomes rarer.

===Fisher's Information (Geometric Distribution, Failures Before Success)===
Fisher information measures the amount of information that an observable random variable <math>X</math> carries about an unknown parameter <math>p</math>. For the geometric distribution (failures before the first success), the Fisher information with respect to <math>p</math> is given by:

:<math>I(p) = \frac{1}{p^2(1 - p)}</math>

'''Proof:'''
*The '''likelihood function''' for a geometric random variable <math>X</math> is:
:<math>L(p; X) = (1 - p)^X p</math>
*The '''log-likelihood function''' is:
:<math>\ln L(p; X) = X \ln(1 - p) + \ln p</math>
*The '''score function''' (the first derivative of the log-likelihood with respect to <math>p</math>) is:
:<math>\frac{\partial}{\partial p} \ln L(p; X) = \frac{1}{p} - \frac{X}{1 - p}</math>
*The second derivative of the log-likelihood is:
:<math>\frac{\partial^2}{\partial p^2} \ln L(p; X) = -\frac{1}{p^2} - \frac{X}{(1 - p)^2}</math>
*The '''Fisher information''' is the negative expected value of the second derivative. Substituting <math>\operatorname{E}[X] = \frac{1 - p}{p}</math>:
:<math>\begin{align}
I(p) &= -\operatorname{E}\left[\frac{\partial^2}{\partial p^2} \ln L(p; X)\right] \\
&= - \left(-\frac{1}{p^2} - \frac{1 - p}{p (1 - p)^2} \right) \\
&= \frac{1}{p^2(1 - p)}
\end{align}</math>

As a function of <math>p</math>, the Fisher information is smallest at <math>p = 2/3</math> and grows without bound as <math>p</math> approaches 0 or 1, so observations are most informative about <math>p</math> when successes are very rare or very common.

===Entropy (Geometric Distribution, Trials Until Success)===
For the geometric distribution modeling the number of trials until the first success, the probability mass function is:

:<math>P(X = k) = (1 - p)^{k - 1} p, \quad k = 1, 2, 3, \dots</math>

The entropy <math>H(X)</math> for this distribution is given by:

:<math>\begin{align}
H(X) &= - \sum_{k=1}^{\infty} P(X = k) \ln P(X = k) \\
&= - \sum_{k=1}^{\infty} (1 - p)^{k - 1} p \ln \left( (1 - p)^{k - 1} p \right) \\
&= - \sum_{k=1}^{\infty} (1 - p)^{k - 1} p \left[ (k - 1) \ln(1 - p) + \ln p \right] \\
&= - \ln p - \frac{1 - p}{p} \ln(1 - p),
\end{align}</math>

using <math>\operatorname{E}[X - 1] = \frac{1 - p}{p}</math>. This is the same value as in the failures-before-success case: shifting a distribution by a constant does not change its entropy. Entropy increases as <math>p</math> decreases, reflecting greater uncertainty as the probability of success in each trial becomes smaller.
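The closed form above can be checked numerically. The following is a minimal sketch in Python (standard library only; the truncation length and test values of <math>p</math> are illustrative choices), comparing the closed-form entropy in nats against a truncated version of the defining series:

<syntaxhighlight lang="python">
import math

def entropy_closed_form(p):
    # H(X) = -ln p - ((1 - p) / p) * ln(1 - p), in nats
    return -math.log(p) - (1 - p) / p * math.log(1 - p)

def entropy_by_summation(p, terms=10_000):
    # Truncated sum of -P(X = k) ln P(X = k) over k = 1, 2, ...
    # (trials-until-success parameterization; the shifted support
    # k = 0, 1, 2, ... gives the same value).
    h = 0.0
    for k in range(1, terms + 1):
        pk = (1 - p) ** (k - 1) * p
        if pk == 0.0:  # underflow; the remaining terms are negligible
            break
        h -= pk * math.log(pk)
    return h

for p in (0.1, 0.5, 0.9):
    print(p, entropy_closed_form(p), entropy_by_summation(p))
</syntaxhighlight>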
===Fisher's Information (Geometric Distribution, Trials Until Success)===
Fisher information for the geometric distribution modeling the number of trials until the first success is given by:

:<math>I(p) = \frac{1}{p^2(1 - p)}</math>

'''Proof:'''
*The '''likelihood function''' for a geometric random variable <math>X</math> is:
:<math>L(p; X) = (1 - p)^{X - 1} p</math>
*The '''log-likelihood function''' is:
:<math>\ln L(p; X) = (X - 1) \ln(1 - p) + \ln p</math>
*The '''score function''' (the first derivative of the log-likelihood with respect to <math>p</math>) is:
:<math>\frac{\partial}{\partial p} \ln L(p; X) = \frac{1}{p} - \frac{X - 1}{1 - p}</math>
*The second derivative of the log-likelihood is:
:<math>\frac{\partial^2}{\partial p^2} \ln L(p; X) = -\frac{1}{p^2} - \frac{X - 1}{(1 - p)^2}</math>
*The '''Fisher information''' is the negative expected value of the second derivative. Substituting <math>\operatorname{E}[X - 1] = \frac{1 - p}{p}</math>:
:<math>\begin{align}
I(p) &= -\operatorname{E}\left[\frac{\partial^2}{\partial p^2} \ln L(p; X)\right] \\
&= - \left(-\frac{1}{p^2} - \frac{1 - p}{p (1 - p)^2} \right) \\
&= \frac{1}{p^2(1 - p)}
\end{align}</math>

The two parameterizations therefore carry the same Fisher information about <math>p</math>, as expected, since they differ only by a deterministic shift.
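Because the score has mean zero, <math>I(p)</math> also equals <math>\operatorname{E}\left[\left(\frac{\partial}{\partial p} \ln L(p; X)\right)^2\right]</math>, which can be verified by simulation. A minimal Python sketch (standard library only; the parameter value and sample size are illustrative, and the failures-before-success case is analogous):

<syntaxhighlight lang="python">
import random
import statistics

def sample_geometric(p):
    # Number of Bernoulli(p) trials until the first success (support 1, 2, ...).
    n = 1
    while random.random() >= p:
        n += 1
    return n

def score(p, x):
    # d/dp of ln L(p; x) = (x - 1) ln(1 - p) + ln p
    return 1 / p - (x - 1) / (1 - p)

p = 0.3
mean_sq_score = statistics.fmean(
    score(p, sample_geometric(p)) ** 2 for _ in range(200_000)
)
print(mean_sq_score)  # close to 1 / (p**2 * (1 - p)) = 15.873...
</syntaxhighlight>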
=== General properties ===
* The [[probability-generating function|probability generating function]]s of geometric random variables <math> X </math> and <math> Y </math> defined over <math> \mathbb{N} </math> and <math> \mathbb{N}_0 </math> are, respectively (a numerical sanity check follows this list),<ref name=":0" />{{Rp|pages=114–115}}
::<math>\begin{align}
G_X(s) & = \frac{s\,p}{1-s\,(1-p)}, \\[10pt]
G_Y(s) & = \frac{p}{1-s\,(1-p)}, \quad |s| < (1-p)^{-1}.
\end{align}</math>
* The [[Characteristic function (probability theory)|characteristic function]] <math>\varphi(t)</math> is equal to <math>G(e^{it})</math>, so the geometric distribution's characteristic function, when defined over <math> \mathbb{N} </math> and <math> \mathbb{N}_0 </math> respectively, is<ref name=":9">{{Cite book |url=http://link.springer.com/10.1007/978-3-642-04898-2 |title=International Encyclopedia of Statistical Science |publisher=Springer Berlin Heidelberg |year=2011 |isbn=978-3-642-04897-5 |editor-last=Lovric |editor-first=Miodrag |edition=1st |location=Berlin, Heidelberg |language=en |doi=10.1007/978-3-642-04898-2}}</ref>{{Rp|page=1630}}<math display="block">\begin{align}
\varphi_X(t) &= \frac{pe^{it}}{1-(1-p)e^{it}},\\[10pt]
\varphi_Y(t) &= \frac{p}{1-(1-p)e^{it}}.
\end{align}</math>
* The [[Entropy (information theory)|entropy]] of a geometric distribution with parameter <math>p</math> is<ref name=":7" /><math display="block">-\frac{p \log_2 p + (1-p) \log_2 (1-p)}{p}</math>
* Given a [[mean]], the geometric distribution is the [[maximum entropy probability distribution]] of all discrete probability distributions. The corresponding continuous distribution is the [[exponential distribution]].<ref>{{Cite journal |last1=Lisman |first1=J. H. C. |last2=Zuylen |first2=M. C. A. van |date=March 1972 |title=Note on the generation of most probable frequency distributions |url=https://onlinelibrary.wiley.com/doi/10.1111/j.1467-9574.1972.tb00152.x |journal=[[Statistica Neerlandica]] |language=en |volume=26 |issue=1 |pages=19–23 |doi=10.1111/j.1467-9574.1972.tb00152.x |issn=0039-0402}}</ref>
* The geometric distribution defined on <math> \mathbb{N}_0 </math> is [[infinite divisibility (probability)|infinitely divisible]]: for any positive integer <math>n</math>, there exist <math>n</math> independent identically distributed random variables whose sum is also geometrically distributed. This is because the negative binomial distribution can be derived from a Poisson-stopped sum of [[Logarithmic distribution|logarithmic random variables]].<ref name=":9" />{{Rp|pages=606–607}}
* The decimal digits of the geometrically distributed random variable ''Y'' are a sequence of [[statistical independence|independent]] (and ''not'' identically distributed) random variables.{{citation needed|date=May 2012}} For example, the <!-- "hundreds" is correct; "hundredth" is wrong -->hundreds digit ''D'' has this probability distribution:
::<math>\Pr(D=d) = {q^{100d} \over 1 + q^{100} + q^{200} + \cdots + q^{900}},</math>
:where ''q'' = 1 − ''p''; similar formulas hold for the other digits and, more generally, for [[numeral system]]s with bases other than 10. When the base is 2, this shows that a geometrically distributed random variable can be written as a sum of independent random variables whose probability distributions are [[indecomposable distribution|indecomposable]].
* [[Golomb coding]] is the optimal [[prefix code]]{{clarify|date=May 2012}} for the geometric discrete distribution.<ref name=":7">{{Cite journal |last1=Gallager |first1=R. |last2=van Voorhis |first2=D. |date=March 1975 |title=Optimal source codes for geometrically distributed integer alphabets (Corresp.) |journal=IEEE Transactions on Information Theory |volume=21 |issue=2 |pages=228–230 |doi=10.1109/TIT.1975.1055357 |issn=0018-9448}}</ref>
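As referenced in the first item above, the probability generating function over <math> \mathbb{N} </math> can be checked against a Monte Carlo estimate of <math>\operatorname{E}[s^X]</math>. A minimal Python sketch (standard library only; the values of <math>p</math> and <math>s</math> are illustrative):

<syntaxhighlight lang="python">
import random

def sample_geometric(p):
    # Number of Bernoulli(p) trials until the first success (support 1, 2, ...).
    n = 1
    while random.random() >= p:
        n += 1
    return n

p, s = 0.4, 0.6
closed_form = s * p / (1 - s * (1 - p))  # G_X(s) for X defined over N
n_samples = 100_000
estimate = sum(s ** sample_geometric(p) for _ in range(n_samples)) / n_samples
print(closed_form, estimate)  # both close to 0.375
</syntaxhighlight>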