===Measures of statistical dispersion===

====Variance====
The [[variance]] (the second moment centered on the mean) of a beta distribution [[random variable]] ''X'' with parameters ''α'' and ''β'' is:<ref name=JKB /><ref>{{cite web | url = http://www.itl.nist.gov/div898/handbook/eda/section3/eda366h.htm | title = NIST/SEMATECH e-Handbook of Statistical Methods 1.3.6.6.17. Beta Distribution | website = [[National Institute of Standards and Technology]] Information Technology Laboratory | access-date = May 31, 2016 |date = April 2012 }}</ref>

:<math>\operatorname{var}(X) = \operatorname{E}[(X - \mu)^2] = \frac{\alpha \beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}</math>

Letting ''α'' = ''β'' in the above expression, one obtains

:<math>\operatorname{var}(X) = \frac{1}{4(2\beta + 1)},</math>

showing that for ''α'' = ''β'' the variance decreases monotonically as {{nowrap|1=''α'' = ''β''}} increases. Letting {{nowrap|1=''α'' = ''β'' → 0}} in this expression, one finds the maximum variance var(''X'') = 1/4,<ref name=JKB /> which is attained only in the limit, as {{nowrap|1=''α'' = ''β'' → 0}}.

The beta distribution may also be [[Statistical parameter|parametrized]] in terms of its mean ''μ'' {{nowrap|1=(0 < ''μ'' < 1)}} and sample size {{nowrap|1=''ν'' = ''α'' + ''β''}} ({{nowrap|''ν'' > 0}}) (see subsection [[#Mean and sample size|Mean and sample size]]):

:<math>\begin{align}
\alpha &= \mu \nu, \text{ where } \nu = (\alpha + \beta) > 0\\
\beta &= (1 - \mu) \nu, \text{ where } \nu = (\alpha + \beta) > 0.
\end{align}</math>

Using this [[Statistical parameter|parametrization]], one can express the variance in terms of the mean ''μ'' and the sample size ''ν'' as follows:

:<math>\operatorname{var}(X) = \frac{\mu (1-\mu)}{1 + \nu}</math>

Since {{nowrap|1=''ν'' = ''α'' + ''β'' > 0}}, it follows that {{nowrap|var(''X'') < ''μ''(1 − ''μ'')}}.
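The two equivalent variance expressions above can be checked numerically. The following sketch (not part of the article's sources; it uses SciPy and the arbitrary example values ''α'' = 2, ''β'' = 3) compares the shape-parameter form, the mean/sample-size form, and a library value:

```python
# Numerical check of the two variance formulas for the beta distribution.
# alpha = 2, beta = 3 are arbitrary illustrative values.
from scipy.stats import beta as beta_dist

a, b = 2.0, 3.0

# var(X) = alpha*beta / ((alpha + beta)^2 * (alpha + beta + 1))
var_shape = a * b / ((a + b) ** 2 * (a + b + 1))

# Mean/sample-size parametrization: nu = alpha + beta, mu = alpha / nu,
# giving var(X) = mu*(1 - mu) / (1 + nu)
nu = a + b
mu = a / nu
var_mean_nu = mu * (1 - mu) / (1 + nu)

print(var_shape, var_mean_nu, beta_dist.var(a, b))  # all three agree: 0.04
```

For Beta(2, 3) both forms give 6/(25 · 6) = 0.04, consistent with var(''X'') < ''μ''(1 − ''μ'') = 0.24.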
For a symmetric distribution, the mean is at the middle of the distribution, {{nowrap|1=''μ'' = 1/2}}, and therefore:

:<math>\operatorname{var}(X) = \frac{1}{4 (1 + \nu)} \text{ if } \mu = \tfrac{1}{2}</math>

Also, the following limits (with only the noted variable approaching the limit) can be obtained from the above expressions:

:<math>\begin{align}
&\lim_{\beta\to 0} \operatorname{var}(X) =\lim_{\alpha \to 0} \operatorname{var}(X) =\lim_{\beta\to \infty} \operatorname{var}(X) =\lim_{\alpha \to \infty} \operatorname{var}(X) = \lim_{\nu \to \infty} \operatorname{var}(X) =\lim_{\mu \to 0} \operatorname{var}(X) =\lim_{\mu \to 1} \operatorname{var}(X) = 0\\
&\lim_{\nu \to 0} \operatorname{var}(X) = \mu (1-\mu)
\end{align}</math>

[[File:Variance for Beta Distribution for alpha and beta ranging from 0 to 5 - J. Rodal.jpg|325px]]

====Geometric variance and covariance====
[[File:Beta distribution log geometric variances front view - J. Rodal.png|thumb|log geometric variances vs. ''α'' and ''β'']]
[[File:Beta distribution log geometric variances back view - J. Rodal.png|thumb|log geometric variances vs. ''α'' and ''β'']]

The logarithm of the geometric variance, ln(var<sub>''GX''</sub>), of a distribution with [[random variable]] ''X'' is the second moment of the logarithm of ''X'' centered on the geometric mean of ''X'', ln(''G<sub>X</sub>''):

:<math>\begin{align}
\ln \operatorname{var}_{GX} &= \operatorname{E} \left[(\ln X - \ln G_X)^2 \right] \\
&= \operatorname{E}\left[(\ln X - \operatorname{E}[\ln X])^2 \right] \\
&= \operatorname{E}\left[(\ln X)^2 \right] - (\operatorname{E}[\ln X])^2\\
&= \operatorname{var}[\ln X]
\end{align}</math>

and therefore, the geometric variance is:

:<math>\operatorname{var}_{GX} = e^{\operatorname{var}[\ln X]}</math>

In the [[Fisher information]] matrix, and the curvature of the log [[likelihood function]], the logarithm of the geometric variance of the [[reflection formula|reflected]] variable 1 − ''X'' and the logarithm of the geometric covariance between ''X'' and 1 − ''X'' appear:

:<math>\begin{align}
\ln \operatorname{var}_{G(1-X)} &= \operatorname{E}[(\ln (1-X) - \ln G_{1-X})^2] \\
&= \operatorname{E}[(\ln (1-X) - \operatorname{E}[\ln (1-X)])^2] \\
&= \operatorname{E}[(\ln (1-X))^2] - (\operatorname{E}[\ln (1-X)])^2\\
&= \operatorname{var}[\ln (1-X)] \\
& \\
\operatorname{var}_{G(1-X)} &= e^{\operatorname{var}[\ln (1-X)]} \\
& \\
\ln \operatorname{cov}_{G{X,1-X}} &= \operatorname{E}[(\ln X - \ln G_X)(\ln (1-X) - \ln G_{1-X})] \\
&= \operatorname{E}[(\ln X - \operatorname{E}[\ln X])(\ln (1-X) - \operatorname{E}[\ln (1-X)])] \\
&= \operatorname{E}\left[\ln X \ln(1-X)\right] - \operatorname{E}[\ln X]\operatorname{E}[\ln(1-X)]\\
&= \operatorname{cov}[\ln X, \ln(1-X)] \\
& \\
\operatorname{cov}_{G{X,(1-X)}} &= e^{\operatorname{cov}[\ln X, \ln(1-X)]}
\end{align}</math>

For a beta distribution, higher order logarithmic moments can be derived by using the representation of a beta distribution as a proportion of two gamma distributions and differentiating through the integral.
They can be expressed in terms of higher order poly-gamma functions. See the section {{section link||Moments of logarithmically transformed random variables}}.

The [[variance]] of the logarithmic variables and [[covariance]] of ln ''X'' and ln(1 − ''X'') are:

:<math>\operatorname{var}[\ln X] = \psi_1(\alpha) - \psi_1(\alpha + \beta)</math>
:<math>\operatorname{var}[\ln (1-X)] = \psi_1(\beta) - \psi_1(\alpha + \beta)</math>
:<math>\operatorname{cov}[\ln X, \ln(1-X)] = -\psi_1(\alpha+\beta)</math>

where the '''[[trigamma function]]''', denoted ''ψ''<sub>1</sub>(''α''), is the second of the [[polygamma function]]s, and is defined as the derivative of the [[digamma function]]:

:<math>\psi_1(\alpha) = \frac{d^2\ln\Gamma(\alpha)}{d\alpha^2} = \frac{d \, \psi(\alpha)}{d\alpha}.</math>

Therefore,

:<math>\ln \operatorname{var}_{GX} = \operatorname{var}[\ln X] = \psi_1(\alpha) - \psi_1(\alpha + \beta)</math>
:<math>\ln \operatorname{var}_{G(1-X)} = \operatorname{var}[\ln (1-X)] = \psi_1(\beta) - \psi_1(\alpha + \beta)</math>
:<math>\ln \operatorname{cov}_{GX,1-X} = \operatorname{cov}[\ln X, \ln(1-X)] = -\psi_1(\alpha+\beta)</math>

The accompanying plots show the log geometric variances and log geometric covariance versus the shape parameters ''α'' and ''β''. The plots show that the log geometric variances and log geometric covariance are close to zero for shape parameters ''α'' and ''β'' greater than 2, and that the log geometric variances rapidly rise in value for shape parameter values ''α'' and ''β'' less than unity. The log geometric variances are positive for all values of the shape parameters. The log geometric covariance is negative for all values of the shape parameters, and it reaches large negative values for ''α'' and ''β'' less than unity.
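The trigamma expression for var[ln ''X''] can be cross-checked by direct numerical integration against the beta density. The sketch below (illustrative only; it uses SciPy and the arbitrary example values ''α'' = 2, ''β'' = 3) computes the second moment of ln ''X'' by quadrature and compares it with ''ψ''<sub>1</sub>(''α'') − ''ψ''<sub>1</sub>(''α'' + ''β''):

```python
# Cross-check: var[ln X] = psi_1(alpha) - psi_1(alpha + beta) for Beta(2, 3),
# compared against direct numerical integration of the log-moments.
import numpy as np
from scipy.special import polygamma, digamma
from scipy.stats import beta as beta_dist
from scipy.integrate import quad

a, b = 2.0, 3.0
pdf = beta_dist(a, b).pdf

# E[ln X] = psi(alpha) - psi(alpha + beta)  (digamma)
mean_lnX = digamma(a) - digamma(a + b)

# Closed form from the trigamma function
var_lnX_formula = polygamma(1, a) - polygamma(1, a + b)

# E[(ln X)^2] by quadrature, then center on the mean of ln X
second_moment, _ = quad(lambda x: np.log(x) ** 2 * pdf(x), 0, 1)
var_lnX_numeric = second_moment - mean_lnX ** 2

print(var_lnX_formula, var_lnX_numeric)  # agree to quadrature precision
```

The log geometric covariance, −''ψ''<sub>1</sub>(''α'' + ''β''), is negative for all positive shape parameters, consistent with the plots described above.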
Following are the limits with one parameter finite (non-zero) and the other approaching these limits:

:<math>\begin{align}
&\lim_{\alpha\to 0} \ln \operatorname{var}_{GX} = \lim_{\beta\to 0} \ln \operatorname{var}_{G(1-X)} = \infty \\
&\lim_{\beta \to 0} \ln \operatorname{var}_{GX} = \lim_{\alpha \to \infty} \ln \operatorname{var}_{GX} = \lim_{\alpha \to 0} \ln \operatorname{var}_{G(1-X)} = \lim_{\beta\to \infty} \ln \operatorname{var}_{G(1-X)} = \lim_{\alpha\to \infty} \ln \operatorname{cov}_{GX,(1-X)} = \lim_{\beta\to \infty} \ln \operatorname{cov}_{GX,(1-X)} = 0\\
&\lim_{\beta \to \infty} \ln \operatorname{var}_{GX} = \psi_1(\alpha)\\
&\lim_{\alpha\to \infty} \ln \operatorname{var}_{G(1-X)} = \psi_1(\beta)\\
&\lim_{\alpha\to 0} \ln \operatorname{cov}_{GX,(1-X)} = - \psi_1(\beta)\\
&\lim_{\beta\to 0} \ln \operatorname{cov}_{GX,(1-X)} = - \psi_1(\alpha)
\end{align}</math>

Limits with two parameters varying:

:<math>\begin{align}
&\lim_{\alpha\to \infty}( \lim_{\beta \to \infty} \ln \operatorname{var}_{GX}) = \lim_{\beta \to \infty}( \lim_{\alpha\to \infty} \ln \operatorname{var}_{G(1-X)}) = \lim_{\alpha\to \infty} (\lim_{\beta \to 0} \ln \operatorname{cov}_{GX,(1-X)}) = \lim_{\beta\to \infty}( \lim_{\alpha\to 0} \ln \operatorname{cov}_{GX,(1-X)}) = 0\\
&\lim_{\alpha\to \infty} (\lim_{\beta \to 0} \ln \operatorname{var}_{GX}) = \lim_{\beta\to \infty} (\lim_{\alpha\to 0} \ln \operatorname{var}_{G(1-X)}) = \infty\\
&\lim_{\alpha\to 0} (\lim_{\beta \to 0} \ln \operatorname{cov}_{GX,(1-X)}) = \lim_{\beta\to 0} (\lim_{\alpha\to 0} \ln \operatorname{cov}_{GX,(1-X)}) = - \infty
\end{align}</math>

Although both ln(var<sub>''GX''</sub>) and ln(var<sub>''G''(1 − ''X'')</sub>) are asymmetric, when the shape parameters are equal, ''α'' = ''β'', one has: ln(var<sub>''GX''</sub>) = ln(var<sub>''G''(1 − ''X'')</sub>).
This equality follows from the following symmetry displayed between both log geometric variances:

:<math>\ln \operatorname{var}_{GX}(\Beta(\alpha, \beta)) = \ln \operatorname{var}_{G(1-X)}(\Beta(\beta, \alpha)).</math>

The log geometric covariance is symmetric:

:<math>\ln \operatorname{cov}_{GX,(1-X)}(\Beta(\alpha, \beta)) = \ln \operatorname{cov}_{GX,(1-X)}(\Beta(\beta, \alpha))</math>

====Mean absolute deviation around the mean====
[[File:Ratio of Mean Abs. Dev. to Std.Dev. Beta distribution with alpha and beta from 0 to 5 - J. Rodal.jpg|thumb|Ratio of mean abs. dev. to std. dev. for beta distribution with ''α'' and ''β'' ranging from 0 to 5]]
[[File:Ratio of Mean Abs. Dev. to Std.Dev. Beta distribution vs. nu from 0 to 10 and vs. mean - J. Rodal.jpg|thumb|Ratio of mean abs. dev. to std. dev. for beta distribution with mean 0 ≤ ''μ'' ≤ 1 and sample size 0 < ''ν'' ≤ 10]]

The [[mean absolute deviation]] around the mean for the beta distribution with shape parameters ''α'' and ''β'' is:<ref name="Handbook of Beta Distribution" />

:<math>\operatorname{E}[|X - E[X]|] = \frac{2 \alpha^\alpha \beta^\beta}{\Beta(\alpha,\beta)(\alpha + \beta)^{\alpha + \beta + 1}}</math>

The mean absolute deviation around the mean is a more [[Robust statistics|robust]] [[estimator]] of [[statistical dispersion]] than the standard deviation for beta distributions with tails and inflection points at each side of the mode, Beta(''α'', ''β'') distributions with ''α'', ''β'' > 2, as it depends on the linear (absolute) deviations rather than the square deviations from the mean. Therefore, the effect of very large deviations from the mean is not as heavily weighted.
Using [[Stirling's approximation]] to the [[Gamma function]], [[Norman Lloyd Johnson|N. L. Johnson]] and [[Samuel Kotz|S. Kotz]]<ref name=JKB /> derived the following approximation for values of the shape parameters greater than unity (the relative error for this approximation is only −3.5% for ''α'' = ''β'' = 1, and it decreases to zero as ''α'' → ∞, ''β'' → ∞):

:<math>\begin{align}
\frac{\text{mean abs. dev. from mean}}{\text{standard deviation}} &= \frac{\operatorname{E}[|X - E[X]|]}{\sqrt{\operatorname{var}(X)}}\\
&\approx \sqrt{\frac{2}{\pi}} \left(1+\frac{7}{12 (\alpha+\beta)}-\frac{1}{12 \alpha}-\frac{1}{12 \beta} \right), \text{ if } \alpha, \beta > 1.
\end{align}</math>

In the limit ''α'' → ∞, ''β'' → ∞, the ratio of the mean absolute deviation to the standard deviation (for the beta distribution) becomes equal to the ratio of the same measures for the normal distribution: <math>\sqrt{\frac{2}{\pi}}</math>. For ''α'' = ''β'' = 1 this ratio equals <math>\frac{\sqrt{3}}{2}</math>, so that from ''α'' = ''β'' = 1 to ''α'', ''β'' → ∞ the ratio decreases by 8.5%. For ''α'' = ''β'' = 0 the standard deviation is exactly equal to the mean absolute deviation around the mean. Therefore, this ratio decreases by 15% from ''α'' = ''β'' = 0 to ''α'' = ''β'' = 1, and by 25% from ''α'' = ''β'' = 0 to ''α'', ''β'' → ∞. However, for skewed beta distributions such that ''α'' → 0 or ''β'' → 0, the ratio of the standard deviation to the mean absolute deviation approaches infinity (although each of them, individually, approaches zero) because the mean absolute deviation approaches zero faster than the standard deviation.
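The exact mean absolute deviation and the Johnson and Kotz approximation above can be compared numerically. The sketch below (illustrative only; SciPy-based) evaluates both at ''α'' = ''β'' = 1, the uniform distribution, where the exact ratio is <math>\sqrt{3}/2</math>:

```python
# Exact mean absolute deviation around the mean for Beta(a, b), and the
# Johnson-Kotz ratio approximation, evaluated at alpha = beta = 1 (uniform).
import math
from scipy.special import beta as beta_fn

def mad_beta(a, b):
    """E|X - E[X]| from the closed form 2 a^a b^b / (B(a,b) (a+b)^(a+b+1))."""
    return 2 * a ** a * b ** b / (beta_fn(a, b) * (a + b) ** (a + b + 1))

a, b = 1.0, 1.0
mad = mad_beta(a, b)                                   # uniform: exactly 1/4
sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))   # uniform: 1/sqrt(12)
ratio_exact = mad / sd                                 # sqrt(3)/2

# Johnson-Kotz approximation, stated for alpha, beta > 1; evaluated here at
# the boundary case alpha = beta = 1, where its relative error is a few percent
ratio_approx = math.sqrt(2 / math.pi) * (
    1 + 7 / (12 * (a + b)) - 1 / (12 * a) - 1 / (12 * b)
)
```

For the uniform distribution the exact formula gives E|X − E[X]| = 1/4, and the approximation reproduces the ratio to within roughly 4%.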
Using the [[Statistical parameter|parametrization]] in terms of mean ''μ'' and sample size ''ν'' = ''α'' + ''β'' > 0:

:''α'' = ''μν'', ''β'' = (1 − ''μ'')''ν''

one can express the mean [[absolute deviation]] around the mean in terms of the mean ''μ'' and the sample size ''ν'' as follows:

:<math>\operatorname{E}[| X - E[X]|] = \frac{2 \mu^{\mu\nu} (1-\mu)^{(1-\mu)\nu}}{\nu \Beta(\mu \nu,(1-\mu)\nu)}</math>

For a symmetric distribution, the mean is at the middle of the distribution, ''μ'' = 1/2, and therefore:

:<math>\begin{align}
\operatorname{E}[|X - E[X]|] = \frac{2^{1-\nu}}{\nu \Beta(\tfrac{\nu}{2}, \tfrac{\nu}{2})} &= \frac{2^{1-\nu}\Gamma(\nu)}{\nu (\Gamma(\tfrac{\nu}{2}))^2 } \\
\lim_{\nu \to 0} \left (\lim_{\mu \to \frac{1}{2}} \operatorname{E}[|X - E[X]|] \right ) &= \tfrac{1}{2}\\
\lim_{\nu \to \infty} \left (\lim_{\mu \to \frac{1}{2}} \operatorname{E}[| X - E[X]|] \right ) &= 0
\end{align}</math>

Also, the following limits (with only the noted variable approaching the limit) can be obtained from the above expressions:

:<math>\begin{align}
\lim_{\beta\to 0} \operatorname{E}[|X - E[X]|] &=\lim_{\alpha \to 0} \operatorname{E}[|X - E[X]|]= 0 \\
\lim_{\beta\to \infty} \operatorname{E}[|X - E[X]|] &=\lim_{\alpha \to \infty} \operatorname{E}[|X - E[X]|] = 0\\
\lim_{\mu \to 0} \operatorname{E}[|X - E[X]|] &=\lim_{\mu \to 1} \operatorname{E}[|X - E[X]|] = 0\\
\lim_{\nu \to 0} \operatorname{E}[|X - E[X]|] &= \sqrt{\mu (1-\mu)} \\
\lim_{\nu \to \infty} \operatorname{E}[|X - E[X]|] &= 0
\end{align}</math>

====Mean absolute difference====
The [[mean absolute difference]] for the beta distribution is:

:<math>\mathrm{MD} = \int_0^1 \int_0^1 f(x;\alpha,\beta)\,f(y;\alpha,\beta)\,|x-y|\,dx\,dy = \left(\frac{4}{\alpha+\beta}\right)\frac{B(\alpha+\beta,\alpha+\beta)}{B(\alpha,\alpha)B(\beta,\beta)}</math>

The [[Gini coefficient]] for the beta distribution is half of the relative mean absolute difference:

:<math>\mathrm{G} = \left(\frac{2}{\alpha}\right)\frac{B(\alpha+\beta,\alpha+\beta)}{B(\alpha,\alpha)B(\beta,\beta)}</math>
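The mean absolute difference and Gini formulas can be checked on a case with a known answer. The sketch below (illustrative only; SciPy-based) evaluates both at ''α'' = ''β'' = 1, the uniform distribution on [0, 1], for which MD = 1/3 and the Gini coefficient is 1/3:

```python
# Check the mean absolute difference and Gini coefficient formulas at
# alpha = beta = 1, i.e. the uniform distribution on [0, 1].
from scipy.special import beta as beta_fn

def mean_abs_diff(a, b):
    """MD = (4/(a+b)) * B(a+b, a+b) / (B(a,a) * B(b,b))"""
    return (4 / (a + b)) * beta_fn(a + b, a + b) / (beta_fn(a, a) * beta_fn(b, b))

def gini(a, b):
    """G = (2/a) * B(a+b, a+b) / (B(a,a) * B(b,b)), half the relative MD."""
    return (2 / a) * beta_fn(a + b, a + b) / (beta_fn(a, a) * beta_fn(b, b))

a, b = 1.0, 1.0
md = mean_abs_diff(a, b)   # uniform: B(2,2) = 1/6, so MD = 2 * (1/6) = 1/3
g = gini(a, b)             # G = MD / (2 * mean) = (1/3) / (2 * 1/2) = 1/3
```

Note that G = MD / (2''μ'') with ''μ'' = ''α''/(''α'' + ''β''), which is how the Gini expression follows from the MD expression.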