{{short description|Statistics term}}
In [[statistics]], '''completeness''' is a property of a [[statistic]] computed on a [[sample (statistics)|sample dataset]] in relation to a parametric model of the dataset. It is opposed to the concept of an [[ancillary statistic]]: while an ancillary statistic contains no information about the model parameters, a complete statistic contains only information about the parameters and no ancillary information. It is closely related to the concept of a [[sufficient statistic]], which contains all of the information that the dataset provides about the parameters.<ref name="casellaberger">{{cite book |last1=Casella |first1=George |last2=Berger |first2=Roger L. |title=Statistical Inference |date=2001 |publisher=CRC Press |isbn=978-1-032-59303-6}}</ref>

==Definition==
Consider a [[random variable]] ''X'' whose probability distribution belongs to a [[parametric model]] '''''P'''''<sub>''θ''</sub> parametrized by ''θ''. Say ''T'' is a [[statistic]]; that is, the composition of a [[measurable function]] with a random sample ''X''<sub>1</sub>,...,''X''<sub>''n''</sub>.

The statistic ''T'' is said to be '''complete''' for the distribution of ''X'' if, for every measurable function ''g'',<ref name="casellaberger" />

:<math>\text{if }\operatorname{E}_\theta(g(T))=0\text{ for all }\theta\text{ then }\mathbf{P}_\theta(g(T)=0)=1\text{ for all }\theta.</math>

The statistic ''T'' is said to be '''boundedly complete''' for the distribution of ''X'' if this implication holds for every measurable function ''g'' that is also bounded.

==Examples==

===Bernoulli model===
The Bernoulli model admits a complete statistic.<ref name="casellaberger" /> Let ''X'' be a [[random sample]] of size ''n'' such that each ''X''<sub>''i''</sub> has the same [[Bernoulli distribution]] with parameter ''p''. Let ''T'' be the number of 1s observed in the sample, i.e. <math>\textstyle T = \sum_{i=1}^n X_i</math>. ''T'' is a statistic of ''X'' which has a [[binomial distribution]] with parameters (''n'',''p''). If the parameter space for ''p'' is (0,1), then ''T'' is a complete statistic. To see this, note that

:<math> \operatorname{E}_p(g(T)) = \sum_{t=0}^n {g(t){n \choose t}p^{t}(1-p)^{n-t}} = (1-p)^n \sum_{t=0}^n {g(t){n \choose t}\left(\frac{p}{1-p}\right)^t} .</math>

Observe also that neither ''p'' nor 1 − ''p'' can be 0. Hence <math>\operatorname{E}_p(g(T)) = 0</math> if and only if

:<math>\sum_{t=0}^n g(t){n \choose t}\left(\frac{p}{1-p}\right)^t = 0. </math>

On denoting ''p''/(1 − ''p'') by ''r'', one gets

:<math>\sum_{t=0}^n g(t){n \choose t}r^t = 0 .</math>

First, observe that the range of ''r'' is the [[positive reals]]. Also, E(''g''(''T'')) is a [[polynomial]] in ''r'' and, therefore, can only be identically 0 if all coefficients are 0, that is, ''g''(''t'') = 0 for all ''t''.

Note that the conclusion that all coefficients must be 0 was obtained because of the range of ''r''. Had the parameter space been finite, with at most ''n'' elements, it might be possible to solve the linear equations in ''g''(''t'') obtained by substituting the values of ''r'' and obtain nonzero solutions. For example, if ''n'' = 1 and the parameter space is {0.5}, a single observation and a single parameter value, ''T'' is not complete. With

:<math> g(t) = 2(t-0.5), \, </math>

one has E(''g''(''T'')) = 0 although ''g''(''t'') is not 0 for ''t'' = 0 nor for ''t'' = 1.
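This counterexample can be checked directly. The following Python sketch (an illustration only, not part of the cited treatment; the function name <code>expected_g</code> is arbitrary) evaluates E<sub>''p''</sub>(''g''(''T'')) exactly for a binomial ''T'':

<syntaxhighlight lang="python">
from math import comb

def expected_g(g, n, p):
    """Exact E_p[g(T)] for T ~ Binomial(n, p)."""
    return sum(g(t) * comb(n, t) * p**t * (1 - p)**(n - t) for t in range(n + 1))

g = lambda t: 2 * (t - 0.5)

# With parameter space {0.5}, g(T) has expectation 0 even though
# g(T) is never 0, so T is not complete for that parameter space.
print(expected_g(g, n=1, p=0.5))   # 0.0
print(g(0), g(1))                  # -1.0 1.0, hence P(g(T) = 0) = 0

# Over the full parameter space (0, 1) the expectation is generally nonzero:
print(expected_g(g, n=1, p=0.25))  # -0.5
</syntaxhighlight>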
===Gaussian model with fixed variance===
This example will show that, in a sample ''X''<sub>1</sub>, ''X''<sub>2</sub> of size 2 from a [[normal distribution]] with known variance, the statistic ''X''<sub>1</sub> + ''X''<sub>2</sub> is complete and sufficient. Suppose ''X''<sub>1</sub>, ''X''<sub>2</sub> are [[statistical independence|independent]], identically distributed random variables, [[normal distribution|normally distributed]] with expectation ''θ'' and variance 1. The sum

:<math>s((X_1, X_2)) = X_1 + X_2</math>

is a '''complete statistic''' for ''θ''. To show this, it is sufficient to demonstrate that there is no non-zero function <math>g</math> such that the expectation of

:<math>g(s(X_1, X_2)) = g(X_1+X_2)</math>

remains zero regardless of the value of ''θ''. That fact may be seen as follows. The probability distribution of ''X''<sub>1</sub> + ''X''<sub>2</sub> is normal with expectation 2''θ'' and variance 2. Its probability density function in <math>x</math> is therefore proportional to

:<math>\exp\left(-(x-2\theta)^2/4\right).</math>

The expectation of ''g'' above would therefore be a constant times

:<math>\int_{-\infty}^\infty g(x)\exp\left(-(x-2\theta)^2/4\right)\,dx.</math>

A bit of algebra reduces this to

:<math>k(\theta) \int_{-\infty}^\infty h(x)e^{x\theta}\,dx,</math>

where ''k''(''θ'') is nowhere zero and

:<math>h(x)=g(x)e^{-x^2/4}.</math>

As a function of ''θ'' this is a two-sided [[Laplace transform]] of ''h'', and cannot be identically zero unless ''h'' is zero almost everywhere.<ref name="Lynn 1986 pp. 225–272">{{cite book |last=Lynn |first=Paul A. |title=Electronic Signals and Systems |chapter=The Laplace Transform and the ''z''-transform |publisher=Macmillan Education UK |publication-place=London |year=1986 |isbn=978-0-333-39164-8 |doi=10.1007/978-1-349-18461-3_6 |pages=225–272}}</ref> Since the factor <math>e^{-x^2/4}</math> is nowhere zero, this can only happen if ''g'' is zero almost everywhere.

By contrast, the statistic <math display=inline>(X_1,X_2)</math> is sufficient but not complete. It admits a non-zero unbiased estimator of zero, namely <math display=inline>X_1-X_2</math>.
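A Monte Carlo simulation can make the last point concrete. The following sketch (illustrative only and not from the cited sources; the use of NumPy and the seed are arbitrary choices) shows that ''X''<sub>1</sub> − ''X''<sub>2</sub> has mean approximately 0 under every ''θ'', even though individual draws are almost never exactly 0:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# g(X1, X2) = X1 - X2 has expectation 0 under every theta, yet it is
# not almost surely zero, so the pair (X1, X2) is not complete.
for theta in (-3.0, 0.0, 5.0):
    x1 = rng.normal(theta, 1.0, size=1_000_000)
    x2 = rng.normal(theta, 1.0, size=1_000_000)
    d = x1 - x2
    print(theta, d.mean(), (d == 0).mean())
# The sample mean of d is near 0 for each theta, while the fraction of
# draws with d exactly 0 is essentially 0.
</syntaxhighlight>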
===Sufficiency does not imply completeness===
Most parametric models have a [[sufficient statistic]] which is not complete. This matters because the [[Lehmann–Scheffé theorem]] cannot be applied to such models. Galili and Meilijson (2016)<ref name="galili">{{cite journal |title=An Example of an Improvable Rao–Blackwell Improvement, Inefficient Maximum Likelihood Estimator, and Unbiased Generalized Bayes Estimator |author1=Tal Galili |author2=Isaac Meilijson |date=31 Mar 2016 |journal=The American Statistician |volume=70 |issue=1 |pages=108–113 |doi=10.1080/00031305.2015.1100683 |pmc=4960505 |pmid=27499547}}</ref> propose the following didactic example. Consider <math>n</math> independent samples from the uniform distribution

:<math> X_i \sim U \big( (1-k) \theta , (1+k)\theta \big) \qquad\qquad 0 < k < 1, </math>

where <math>k</math> is a known design parameter. This model is a ''scale family'' (a specific case of [[Location-scale_family|a location-scale family]]): scaling the samples by a multiplier <math>c</math> multiplies the parameter <math>\theta</math> by <math>c</math>. Galili and Meilijson show that the minimum and maximum of the samples together form a sufficient statistic <math>\left(X_{(1)}, X_{(n)}\right)</math> (using the usual notation for [[order_statistic|order statistics]]): indeed, conditional on these two values, the distribution of the rest of the sample is simply uniform on the interval they define, <math>\left[X_{(1)}, X_{(n)}\right]</math>.

However, the ratio <math>X_{(n)}/X_{(1)}</math> has a distribution which does not depend on <math>\theta</math>. This follows from the fact that this is a scale family: any change of scale affects both variables identically. Let <math>m</math> denote the mean of that distribution, a constant independent of <math>\theta</math>. Then

:<math> \operatorname{E}_\theta \left[ \frac {X_{(n)}} {X_{(1)}} - m \right] = 0 \qquad \text{for all } \theta. </math>

We have thus exhibited a function <math>g\left(X_{(1)}, X_{(n)}\right) = X_{(n)}/X_{(1)} - m</math> which is not <math>0</math> everywhere but which has expectation <math>0</math> for every <math>\theta</math>. The pair <math>\left(X_{(1)}, X_{(n)}\right)</math> is therefore not complete.

==Importance of completeness==
The notion of completeness has many applications in statistics, particularly in the following theorems of mathematical statistics.

===Lehmann–Scheffé theorem===
'''Completeness''' occurs in the [[Lehmann–Scheffé theorem]],<ref name="casellaberger" /> which states that if a statistic is unbiased, '''complete''' and [[sufficiency (statistics)|sufficient]] for some parameter ''θ'', then it is the best mean-unbiased estimator of ''θ''. In other words, this statistic has expected loss no larger than that of any other unbiased estimator for every [[convex function|convex]] loss function; in particular, under the commonly used squared-error loss, it has the smallest mean squared error among all estimators with the same [[expected value]].

There are examples in which the minimal sufficient statistic is '''not complete''' and several alternative statistics exist for unbiased estimation of ''θ'', some of them with lower variance than others.<ref name="galili" />

See also [[minimum-variance unbiased estimator]].

===Basu's theorem===
'''Bounded completeness''' occurs in [[Basu's theorem]],<ref name="casellaberger" /> which states that a statistic that is both '''boundedly complete''' and [[Sufficient statistic|sufficient]] is [[statistical independence|independent]] of any [[ancillary statistic]].

===Bahadur's theorem===
'''Bounded completeness''' also occurs in Bahadur's theorem: when at least one [[minimal sufficient]] statistic exists, any statistic which is [[sufficient]] and boundedly complete is necessarily minimal sufficient.<ref>{{Cite journal |last=Bahadur |first=R. R. |date=1957 |title=On Unbiased Estimates of Uniformly Minimum Variance |url=https://www.jstor.org/stable/25048353 |journal=Sankhyā: The Indian Journal of Statistics (1933-1960) |volume=18 |issue=3/4 |pages=211–224 |issn=0036-4452}}</ref>

==Notes==
{{reflist}}

{{Statistics|inference|collapsed}}

{{DEFAULTSORT:Completeness (Statistics)}}
[[Category:Statistical theory]]