==Behavioral properties==

===Consistency===
{{Main|Consistent estimator}}

A '''consistent estimator''' is an estimator whose sequence of estimates [[convergence in probability|converges in probability]] to the quantity being estimated as the index (usually the [[sample size]]) grows without bound. In other words, increasing the sample size increases the probability of the estimator being close to the population parameter. Mathematically, an estimator is a consistent estimator for a [[parameter]] ''θ'' if and only if, for the sequence of estimates {{nowrap|{''t<sub>n</sub>''; ''n'' ≥ 0}}} and for all {{nowrap|''ε'' > 0}}, no matter how small, we have

:<math> \lim_{n\to\infty}\Pr\left\{ \left| t_n-\theta\right|<\varepsilon \right\}=1 .</math>

The consistency defined above may be called weak consistency. The sequence is ''strongly consistent'' if it [[Almost sure convergence|converges almost surely]] to the true value.

An estimator that converges to a ''multiple'' of a parameter can be made into a consistent estimator by multiplying the estimator by a [[scale factor]], namely the true value divided by the asymptotic value of the estimator. This occurs frequently in [[Scale parameter#Estimation|estimation of scale parameters]] by [[Statistical dispersion#Measures of statistical dispersion|measures of statistical dispersion]].

===Fisher consistency===
An estimator is Fisher consistent if it is the same functional of the empirical distribution function as the parameter is of the true distribution function:

:<math>\widehat{\theta} = h(T_n), \qquad \theta = h(T_\theta),</math>

where <math>T_n</math> and <math>T_\theta</math> are the [[empirical distribution function]] and the theoretical distribution function, respectively. Simple examples are the sample mean <math>\widehat{\mu} = \bar{X}</math> as a Fisher-consistent estimator of the mean and <math>\widehat{\sigma}^2 = SSD/n</math>, the mean squared deviation from the sample mean, as a Fisher-consistent estimator of the variance.<ref>{{cite web |last1=Lauritzen |first1=Steffen |title=Properties of Estimators |url=https://www.stats.ox.ac.uk/~steffen/teaching/bs2siMT04/si2c.pdf |publisher=University of Oxford |access-date=9 December 2023}}</ref>

===Asymptotic normality===
{{Main|Asymptotic normality}}

An [[asymptotic distribution#Asymptotic normality|asymptotically normal]] estimator is a consistent estimator whose distribution around the true parameter ''θ'' approaches a [[normal distribution]] with standard deviation shrinking in proportion to <math>1/\sqrt{n}</math> as the sample size ''n'' grows. Using <math>\xrightarrow{D}</math> to denote [[Convergence of random variables#Convergence in distribution|convergence in distribution]], ''t<sub>n</sub>'' is [[Asymptotic normality|asymptotically normal]] if

:<math>\sqrt{n}(t_n - \theta) \xrightarrow{D} N(0,V)</math>

for some ''V''. In this formulation ''V''/''n'' can be called the ''asymptotic variance'' of the estimator; however, some authors also call ''V'' itself the ''asymptotic variance''. Note that convergence will not necessarily have occurred for any finite ''n'', so this value is only an approximation to the true variance of the estimator, while in the limit the asymptotic variance ''V''/''n'' is simply zero. To be more specific, the distribution of the estimator ''t<sub>n</sub>'' converges weakly to a [[Dirac delta function]] centered at <math>\theta</math>.
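Both consistency and asymptotic normality can be checked by simulation. The following is only an illustrative sketch (it assumes Python with NumPy; the exponential distribution, sample size and variable names are chosen purely for demonstration): repeated sampling gives estimates that concentrate around the true mean, and <math>\sqrt{n}(t_n - \theta)</math> behaves approximately like a normal variable whose variance equals the population variance.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
theta, n, reps = 2.0, 1_000, 5_000   # true mean, sample size, number of replications

# Draw `reps` independent samples of size n from an exponential distribution
# with mean theta, and compute the sample mean t_n of each sample.
samples = rng.exponential(scale=theta, size=(reps, n))
t_n = samples.mean(axis=1)

# Consistency: the estimates concentrate around theta; their spread shrinks like 1/sqrt(n).
print("mean of estimates  :", t_n.mean())   # close to 2.0
print("spread of estimates:", t_n.std())    # close to 2/sqrt(1000) ~ 0.063

# Asymptotic normality: sqrt(n)*(t_n - theta) is approximately N(0, V),
# where V is the population variance (theta**2 = 4 for this exponential).
z = np.sqrt(n) * (t_n - theta)
print("empirical variance of sqrt(n)*(t_n - theta):", z.var())  # close to 4
</syntaxhighlight>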
The [[central limit theorem]] implies asymptotic normality of the [[sample mean]] <math>\bar X</math> as an estimator of the true mean. More generally, [[maximum likelihood]] estimators are asymptotically normal under fairly weak regularity conditions (see the [[maximum likelihood#Asymptotics|asymptotics section]] of the maximum likelihood article). However, not all estimators are asymptotically normal; the simplest examples occur when the true value of a parameter lies on the boundary of the allowable parameter region.

===Efficiency===
{{Main|Efficiency (statistics)}}

The efficiency of an estimator measures how well it estimates the quantity of interest in a "minimum error" sense. In general there is no single best estimator, only better and worse ones. Whether one estimator is more efficient than another depends on the choice of a particular [[loss function]], and efficiency is reflected by two naturally desirable properties of estimators: being unbiased, <math>\operatorname{E}(\widehat{\theta}) - \theta=0</math>, and having minimal [[mean squared error]] (MSE), <math>\operatorname{E}[(\widehat{\theta} - \theta )^2]</math>. These cannot in general both be satisfied simultaneously: a biased estimator may have a lower mean squared error than any unbiased estimator (see [[estimator bias]]).

The following equation relates the mean squared error to the estimator bias and variance:<ref name=Dekker2005 />

: <math> \operatorname{E}[(\widehat{\theta} - \theta )^2]=(\operatorname{E}(\widehat{\theta}) - \theta)^2+\operatorname{Var}(\widehat\theta)\ </math>

The left-hand side is the mean squared error; the first term on the right is the square of the estimator bias and the second is the variance of the estimator. The quality of an estimator can therefore be judged by comparing variances, squared biases, or MSEs: a good estimator (high efficiency) has a smaller variance, a smaller absolute bias, and a smaller MSE than a bad estimator (low efficiency). Suppose there are two estimators, a good estimator <math>\widehat\theta_1</math> and a bad estimator <math>\widehat\theta_2</math>. The above relationships can be expressed by the following formulas:

: <math>\operatorname{Var}(\widehat\theta_1)<\operatorname{Var}(\widehat\theta_2)</math>
: <math>|\operatorname{E}(\widehat\theta_1) - \theta|<\left|\operatorname{E}(\widehat\theta_2) - \theta\right|</math>
: <math>\operatorname{MSE}(\widehat\theta_1)<\operatorname{MSE}(\widehat\theta_2)</math>

Besides using these formulas, efficiency can also be assessed graphically. In a plot of frequency against estimated value, an efficient estimator produces a curve with high frequency at the center and low frequency on the two sides:

[[File:Good estimator.jpg|center|thumb]]

An inefficient estimator produces a relatively flatter, more spread-out curve:

[[File:Bad estimator.jpg|center|thumb]]

To put it simply, the good estimator has a narrow curve, while the bad estimator has a wide one. Plotting the two curves on one graph with a shared ''y''-axis makes the difference more obvious; a simulation sketch and a comparison figure follow.
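The narrow-versus-wide contrast can be reproduced numerically. The sketch below (again an illustration only, assuming Python with NumPy; the sample median is used as the less efficient competitor purely for demonstration) compares the sample mean and the sample median as estimators of the center of a normal distribution; for normal data the mean has the smaller variance and MSE.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)
theta, n, reps = 0.0, 100, 20_000   # true center, sample size, number of replications

# For normal data both the sample mean and the sample median are unbiased
# estimators of the center, so their MSEs reduce to their variances.
samples = rng.normal(loc=theta, scale=1.0, size=(reps, n))
mean_est = samples.mean(axis=1)
median_est = np.median(samples, axis=1)

print("Var(sample mean)  :", mean_est.var())    # about 1/n = 0.01
print("Var(sample median):", median_est.var())  # about pi/(2n) ~ 0.0157

# Histograms of the two sets of estimates would show a narrower curve for the
# mean (the more efficient estimator) and a wider one for the median.
</syntaxhighlight>

For a sample of size ''n'' from a normal distribution with variance ''σ''<sup>2</sup>, the variance of the sample mean is ''σ''<sup>2</sup>/''n'' while that of the sample median is approximately ''πσ''<sup>2</sup>/(2''n''), so the two simulated variances should differ by a factor of about ''π''/2 ≈ 1.57.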
[[File:The comparsion between a good and a bad estimator.jpg|center|thumb|Comparison between a good and a bad estimator.]]

Among unbiased estimators, there often exists one with the lowest variance, called the minimum-variance unbiased estimator ([[MVUE]]). In some cases an unbiased [[efficiency (statistics)|efficient estimator]] exists which, in addition to having the lowest variance among unbiased estimators, attains the [[Cramér–Rao bound]], an absolute lower bound on the variance of statistics of a variable. Concerning such "best unbiased estimators", see also the [[Cramér–Rao bound]], the [[Gauss–Markov theorem]], the [[Lehmann–Scheffé theorem]], and the [[Rao–Blackwell theorem]].

===Robustness===
{{main|Robust estimator}}
{{further|Robust regression}}