==Statistical inference==

===Parameter estimation===

====MVUE for ''p''====
Suppose {{mvar|p}} is unknown and an experiment is conducted where it is decided ahead of time that sampling will continue until {{mvar|r}} successes are found. A [[sufficient statistic]] for the experiment is {{mvar|k}}, the number of failures. In estimating {{mvar|p}}, the [[minimum variance unbiased estimator]] is

: <math>\widehat{p}=\frac{r-1}{r+k-1}.</math>

====Maximum likelihood estimation====
When {{mvar|r}} is known, the [[maximum likelihood]] estimate of {{mvar|p}} is

: <math>\widetilde{p}=\frac{r}{r+k},</math>

but this is a [[bias of an estimator|biased estimate]]. Its inverse, {{math|(''r'' + ''k'')/''r''}}, is an unbiased estimate of {{math|1/''p''}}, however.<ref>{{cite journal |first=J. B. S. |last=Haldane |author-link=J. B. S. Haldane |title=On a Method of Estimating Frequencies |journal=[[Biometrika]] |volume=33 |issue=3 |year=1945 |pages=222–225 |jstor=2332299 |doi=10.1093/biomet/33.3.222 |pmid=21006837 |hdl=10338.dmlcz/102575 |hdl-access=free}}</ref>

When {{mvar|r}} is unknown, the maximum likelihood estimator for {{mvar|p}} and {{mvar|r}} together exists only for samples for which the sample variance is larger than the sample mean.<ref name="adamidis1999">{{cite journal |last=Adamidis |first=K. |year=1999 |title=An EM algorithm for estimating negative binomial parameters |journal=[[Australian & New Zealand Journal of Statistics]] |volume=41 |issue=2 |pages=213–221 |doi=10.1111/1467-842X.00075 |s2cid=118758171 |doi-access=free}}</ref> The [[likelihood function]] for {{mvar|N}} [[independent and identically-distributed random variables|iid]] observations {{math|(''k''{{sub|1}}, ..., ''k''{{sub|''N''}})}} is

:<math>L(r,p)=\prod_{i=1}^N f(k_i;r,p),\,\!</math>

from which we calculate the log-likelihood function

:<math>\ell(r,p) = \sum_{i=1}^N \ln(\Gamma(k_i + r)) - \sum_{i=1}^N \ln(k_i !) - N\ln(\Gamma(r)) + \sum_{i=1}^N k_i \ln(1-p) + Nr \ln(p).</math>

To find the maximum we take the partial derivatives with respect to {{mvar|r}} and {{mvar|p}} and set them equal to zero:

:<math>\frac{\partial \ell(r,p)}{\partial p} = -\left[\sum_{i=1}^N k_i \frac{1}{1-p}\right] + Nr \frac{1}{p} = 0</math>

and

:<math>\frac{\partial \ell(r,p)}{\partial r} = \left[\sum_{i=1}^N \psi(k_i + r)\right] - N\psi(r) + N\ln(p) = 0,</math>

where

: <math>\psi(k) = \frac{\Gamma'(k)}{\Gamma(k)} \!</math>

is the [[digamma function]]. Solving the first equation for {{mvar|p}} gives:

:<math>p = \frac{Nr} {Nr + \sum_{i=1}^N k_i}.</math>

Substituting this in the second equation gives:

:<math>\frac{\partial \ell(r,p)}{\partial r} = \left[\sum_{i=1}^N \psi(k_i + r)\right] - N\psi(r) + N\ln\left(\frac{r}{r + \sum_{i=1}^N k_i/N}\right) = 0.</math>

This equation cannot be solved for {{mvar|r}} in [[Closed-form expression|closed form]]. If a numerical solution is desired, an iterative technique such as [[Newton's method]] can be used. Alternatively, the [[expectation–maximization algorithm]] can be used.<ref name="adamidis1999" />
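As a concrete illustration of the numerical approach, the following minimal sketch (assuming [[NumPy]] and [[SciPy]] are available; the function name <code>negbin_mle</code> and the bracket-widening heuristic are choices made here, not anything prescribed by the sources) profiles out {{mvar|p}} using the relation above and solves the remaining score equation in {{mvar|r}} by [[Brent's method]]:

<syntaxhighlight lang="python">
import numpy as np
from scipy.special import digamma
from scipy.optimize import brentq


def negbin_mle(k):
    """Joint MLE of (r, p) for iid negative binomial counts.

    Uses the article's parameterization: k_i counts failures before the
    r-th success and p is the success probability, so after profiling
    out p the score equation in r is
        sum_i psi(k_i + r) - N psi(r) + N ln(r / (r + mean(k))) = 0.
    """
    k = np.asarray(k, dtype=float)
    n, kbar = k.size, k.mean()
    if k.var() <= kbar:
        # The joint MLE exists only when the sample variance
        # exceeds the sample mean (Adamidis 1999).
        raise ValueError("sample variance must exceed the sample mean")

    def score_r(r):
        # Partial derivative of the log-likelihood with respect to r,
        # with p = r / (r + kbar) already substituted.
        return np.sum(digamma(k + r)) - n * digamma(r) + n * np.log(r / (r + kbar))

    # Seed a bracket with the method-of-moments estimate of r,
    # which exists under the same variance condition.
    r0 = kbar ** 2 / (k.var() - kbar)
    lo, hi = r0 / 10.0, r0 * 10.0
    for _ in range(60):  # widen until the root is bracketed
        if score_r(lo) * score_r(hi) < 0.0:
            break
        lo, hi = lo / 10.0, hi * 10.0
    else:
        raise RuntimeError("could not bracket the root of the score equation")

    r_hat = brentq(score_r, lo, hi)  # Brent's method root-finder
    p_hat = r_hat / (r_hat + kbar)   # back out p from the first score equation
    return r_hat, p_hat
</syntaxhighlight>

For example, fitting counts drawn with {{math|1=''r'' = 5}} and {{math|1=''p'' = 0.4}} (<code>scipy.stats.nbinom</code> uses the same failures-before-the-{{mvar|r}}-th-success convention) recovers estimates close to those values. The resulting estimate of {{mvar|r}} need not be an integer, since the likelihood is defined for all real {{math|''r'' > 0}} through the gamma function.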