=== Estimation of parameters ===
{{See also|Maximum likelihood#Continuous distribution, continuous parameter space|Gaussian function#Estimation of parameters}}

It is often the case that we do not know the parameters of the normal distribution, but instead want to [[Estimation theory|estimate]] them. That is, having a sample <math display=inline>(x_1, \ldots, x_n)</math> from a normal <math display=inline>\mathcal{N}(\mu, \sigma^2)</math> population we would like to learn the approximate values of parameters {{tmath|\mu}} and <math display=inline>\sigma^2</math>. The standard approach to this problem is the [[maximum likelihood]] method, which requires maximization of the ''[[log-likelihood function]]'':{{anchor|Log-likelihood}}
<math display=block>
    \ln\mathcal{L}(\mu,\sigma^2)
     = \sum_{i=1}^n \ln f(x_i\mid\mu,\sigma^2)
     = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^n (x_i-\mu)^2.
</math>
Taking derivatives with respect to {{tmath|\mu}} and <math display=inline>\sigma^2</math> and solving the resulting system of first order conditions yields the ''maximum likelihood estimates'':
<math display=block>
    \hat{\mu} = \overline{x} \equiv \frac{1}{n}\sum_{i=1}^n x_i, \qquad
    \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^n (x_i - \overline{x})^2.
</math>
Then <math display=inline>\ln\mathcal{L}(\hat{\mu},\hat{\sigma}^2)</math> is as follows:
<math display=block>\ln\mathcal{L}(\hat{\mu},\hat{\sigma}^2) = (-n/2) [\ln(2 \pi \hat{\sigma}^2)+1]</math>

==== Sample mean ====
{{See also|Standard error of the mean}}

Estimator <math style="vertical-align:-.3em">\textstyle\hat\mu</math> is called the ''[[sample mean]]'', since it is the arithmetic mean of all observations. The statistic <math style="vertical-align:0">\textstyle\overline{x}</math> is [[complete statistic|complete]] and [[sufficient statistic|sufficient]] for {{tmath|\mu}}, and therefore by the [[Lehmann–Scheffé theorem]], <math style="vertical-align:-.3em">\textstyle\hat\mu</math> is the [[uniformly minimum variance unbiased]] (UMVU) estimator.<ref name="Krishnamoorthy">{{harvtxt |Krishnamoorthy |2006 |p=127 }}</ref> In finite samples it is distributed normally:
<math display=block>
    \hat\mu \sim \mathcal{N}(\mu,\sigma^2/n).
</math>
The variance of this estimator is equal to the ''μμ''-element of the inverse [[Fisher information matrix]] <math style="vertical-align:0">\textstyle\mathcal{I}^{-1}</math>. This implies that the estimator is [[efficient estimator|finite-sample efficient]]. Of practical importance is the fact that the [[standard error]] of <math style="vertical-align:-.3em">\textstyle\hat\mu</math> is proportional to <math style="vertical-align:-.3em">\textstyle1/\sqrt{n}</math>, that is, if one wishes to decrease the standard error by a factor of 10, one must increase the number of points in the sample by a factor of 100. This fact is widely used in determining sample sizes for opinion polls and the number of trials in [[Monte Carlo simulation]]s.

From the standpoint of the [[asymptotic theory (statistics)|asymptotic theory]], <math style="vertical-align:-.3em">\textstyle\hat\mu</math> is [[consistent estimator|consistent]], that is, it [[converges in probability]] to {{tmath|\mu}} as <math display=inline>n\rightarrow\infty</math>. The estimator is also [[asymptotic normality|asymptotically normal]], which is a simple corollary of the fact that it is normal in finite samples:
<math display=block>
    \sqrt{n}(\hat\mu-\mu) \,\xrightarrow{d}\, \mathcal{N}(0,\sigma^2).
</math>
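The estimates above, and the <math display=inline>1/\sqrt{n}</math> behaviour of the standard error, are easy to check numerically. The following is a minimal illustrative sketch in Python with NumPy (not drawn from the article's sources); the values of {{tmath|\mu}}, {{tmath|\sigma}}, the sample sizes, and the number of replications are arbitrary choices made only for the illustration.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 5.0, 2.0              # true parameters (arbitrary illustration values)

for n in (100, 10_000):           # increase n by a factor of 100 ...
    # 2,000 replicated samples of size n from N(mu, sigma^2)
    samples = rng.normal(mu, sigma, size=(2_000, n))
    mu_hat = samples.mean(axis=1)                 # MLE of mu for each sample
    sigma2_hat = samples.var(axis=1, ddof=0)      # MLE of sigma^2 (divisor n)
    # empirical standard error of mu_hat versus the theoretical value sigma/sqrt(n)
    print(n, mu_hat.std(), sigma / np.sqrt(n), sigma2_hat.mean())
</syntaxhighlight>

The printed standard error of <math style="vertical-align:-.3em">\textstyle\hat\mu</math> shrinks by roughly a factor of 10 when the sample size grows by a factor of 100, matching <math display=inline>\sigma/\sqrt{n}</math>.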
==== Sample variance ====
{{See also|Standard deviation#Estimation|Variance#Estimation}}

The estimator <math style="vertical-align:0">\textstyle\hat\sigma^2</math> is called the ''[[sample variance]]'', since it is the variance of the sample <math display=inline>(x_1, \ldots, x_n)</math>. In practice, another estimator is often used instead of <math style="vertical-align:0">\textstyle\hat\sigma^2</math>. This other estimator is denoted <math display=inline>s^2</math>, and is also called the ''sample variance'', which represents a certain ambiguity in terminology; its square root {{tmath|s}} is called the ''sample standard deviation''. The estimator <math display=inline>s^2</math> differs from <math style="vertical-align:0">\textstyle\hat\sigma^2</math> by having {{math|(''n'' − 1)}} instead of ''n'' in the denominator (the so-called [[Bessel's correction]]):
<math display=block>
    s^2 = \frac{n}{n-1} \hat\sigma^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \overline{x})^2.
</math>
The difference between <math display=inline>s^2</math> and <math style="vertical-align:0">\textstyle\hat\sigma^2</math> becomes negligibly small for large ''n''. In finite samples, however, the motivation behind the use of <math display=inline>s^2</math> is that it is an [[unbiased estimator]] of the underlying parameter <math display=inline>\sigma^2</math>, whereas <math style="vertical-align:0">\textstyle\hat\sigma^2</math> is biased. Also, by the Lehmann–Scheffé theorem the estimator <math display=inline>s^2</math> is uniformly minimum variance unbiased ([[UMVU]]),<ref name="Krishnamoorthy" /> which makes it the "best" estimator among all unbiased ones. However, it can be shown that the biased estimator <math style="vertical-align:0">\textstyle\hat\sigma^2</math> is better than <math display=inline>s^2</math> in terms of the [[mean squared error]] (MSE) criterion. In finite samples both <math display=inline>s^2</math> and <math style="vertical-align:0">\textstyle\hat\sigma^2</math> have a scaled [[chi-squared distribution]] with {{math|(''n'' − 1)}} degrees of freedom:
<math display=block>
    s^2 \sim \frac{\sigma^2}{n-1} \cdot \chi^2_{n-1}, \qquad
    \hat\sigma^2 \sim \frac{\sigma^2}{n} \cdot \chi^2_{n-1}.
</math>
The first of these expressions shows that the variance of <math display=inline>s^2</math> is equal to <math display=inline>2\sigma^4/(n-1)</math>, which is slightly greater than the ''σσ''-element of the inverse Fisher information matrix <math style="vertical-align:0">\textstyle\mathcal{I}^{-1}</math>, which is <math display=inline>2\sigma^4/n</math>. Thus, <math display=inline>s^2</math> is not an efficient estimator for <math display=inline>\sigma^2</math>, and moreover, since <math display=inline>s^2</math> is UMVU, we can conclude that a finite-sample efficient estimator for <math display=inline>\sigma^2</math> does not exist.

Applying the asymptotic theory, both estimators <math display=inline>s^2</math> and <math style="vertical-align:0">\textstyle\hat\sigma^2</math> are consistent, that is, they converge in probability to <math display=inline>\sigma^2</math> as the sample size <math display=inline>n\rightarrow\infty</math>. The two estimators are also both asymptotically normal:
<math display=block>
    \sqrt{n}(\hat\sigma^2 - \sigma^2) \simeq \sqrt{n}(s^2-\sigma^2) \,\xrightarrow{d}\, \mathcal{N}(0,2\sigma^4).
</math>
In particular, both estimators are asymptotically efficient for <math display=inline>\sigma^2</math>.
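The bias of <math style="vertical-align:0">\textstyle\hat\sigma^2</math>, the unbiasedness of <math display=inline>s^2</math>, and the fact that <math style="vertical-align:0">\textstyle\hat\sigma^2</math> nevertheless has the smaller mean squared error can likewise be checked by simulation. The sketch below follows the same illustrative Python/NumPy setup as above, again with arbitrary parameter values and sample size chosen only for the example.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)
mu, sigma2, n = 0.0, 4.0, 10                 # arbitrary illustration values
samples = rng.normal(mu, np.sqrt(sigma2), size=(200_000, n))

sigma2_hat = samples.var(axis=1, ddof=0)     # MLE: divisor n (biased)
s2 = samples.var(axis=1, ddof=1)             # Bessel-corrected: divisor n - 1 (unbiased)

# Averages over replications: E[s^2] ~ sigma^2, while E[sigma2_hat] ~ (n-1)/n * sigma^2
print(s2.mean(), sigma2_hat.mean(), (n - 1) / n * sigma2)

# Mean squared errors: the biased estimator has the smaller MSE
print(((s2 - sigma2) ** 2).mean(), ((sigma2_hat - sigma2) ** 2).mean())
</syntaxhighlight>

For this choice of ''n'' the empirical MSEs are close to the theoretical values <math display=inline>2\sigma^4/(n-1)</math> for <math display=inline>s^2</math> and <math display=inline>(2n-1)\sigma^4/n^2</math> for <math style="vertical-align:0">\textstyle\hat\sigma^2</math>, with the latter the smaller of the two.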