==Standard error of the sample mean==

=== Exact value ===
Suppose a statistically independent sample of <math>n</math> observations <math> x_1, x_2 , \ldots, x_n </math> is taken from a [[statistical population]] with a [[standard deviation]] of <math>\sigma</math> (the standard deviation of the population). The mean value calculated from the sample, <math>\bar{x}</math>, has an associated ''standard error on the mean'', <math>{\sigma}_\bar{x}</math>, given by:<ref name=":0" />
<math display="block">{\sigma}_\bar{x} = \frac{\sigma}{\sqrt{n}}.</math>
Practically, this tells us that when trying to estimate the value of a population mean, because of the factor <math>1/\sqrt{n}</math>, reducing the error on the estimate by a factor of two requires acquiring four times as many observations in the sample; reducing it by a factor of ten requires a hundred times as many observations.

=== Estimate ===
The standard deviation <math>\sigma</math> of the population being sampled is seldom known. Therefore, the standard error of the mean is usually estimated by replacing <math>\sigma</math> with the [[Standard deviation#Corrected sample standard deviation|sample standard deviation]] <math>\sigma_{x}</math>:
<math display="block">{\sigma}_\bar{x}\ \approx \frac{\sigma_{x}}{\sqrt{n}}.</math>
As this is only an [[estimator]] for the true "standard error", it is common to see other notations here, such as
<math display="block">\widehat{\sigma}_{\bar{x}} := \frac{\sigma_{x}}{\sqrt{n}} \qquad \text{ or } \qquad {s}_\bar{x}\ := \frac{s}{\sqrt{n}}.</math>
A common source of confusion is failing to distinguish clearly between:
* the standard deviation of the ''population'' (<math>\sigma</math>),
* the standard deviation of the ''sample'' (<math>\sigma_{x}</math>),
* the standard deviation of the ''sample mean'' itself (<math>\sigma_{\bar{x}}</math>, which is the standard error), and
* the ''estimator'' of the standard deviation of the sample mean (<math>\widehat{\sigma}_{\bar{x}}</math>, which is the most often calculated quantity and is also often colloquially called the ''standard error'').

==== Accuracy of the estimator ====
When the sample size is small, using the standard deviation of the sample instead of the true standard deviation of the population tends to systematically underestimate the population standard deviation, and therefore also the standard error. With ''n'' = 2, the underestimate is about 25%, but for ''n'' = 6 the underestimate is only 5%. Gurland and Tripathi (1971) provide a correction and equation for this effect.<ref>{{cite journal |last=Gurland |first=J |author2=Tripathi RC |year=1971 |title=A simple approximation for unbiased estimation of the standard deviation |journal=American Statistician |volume=25 |issue=4 |pages=30–32 |doi=10.2307/2682923 |jstor=2682923 }}</ref> Sokal and Rohlf (1981) give an equation of the correction factor for small samples of ''n'' < 20.<ref>{{cite book |last1=Sokal |last2=Rohlf |year=1981 |title=Biometry: Principles and Practice of Statistics in Biological Research |edition=2nd |isbn=978-0-7167-1254-1 |page=[https://archive.org/details/biometryprincipl00soka/page/53 53] |publisher=W. H. Freeman |url-access=registration |url=https://archive.org/details/biometryprincipl00soka/page/53 }}</ref> See [[unbiased estimation of standard deviation]] for further discussion.

=== Derivation ===
The standard error on the mean may be derived from the [[variance]] of a sum of independent random variables,<ref>{{cite book |title=Essentials of Statistical Methods, in 41 pages |last=Hutchinson |first=T. P. |year=1993 |publisher=Rumsby |isbn=978-0-646-12621-0 |location=Adelaide}}</ref> given the [[Variance#Definition|definition]] of variance and some [[Variance#Properties|properties]] thereof.
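The estimated standard error <math>s/\sqrt{n}</math> and the small-sample underestimate described above can be illustrated with a short Python sketch. The `sem` helper and the data values are illustrative, not part of any standard library API:

```python
import math
import random

def sem(sample):
    """Estimated standard error of the mean, s / sqrt(n), using the
    Bessel-corrected sample standard deviation s."""
    n = len(sample)
    m = sum(sample) / n
    s2 = sum((x - m) ** 2 for x in sample) / (n - 1)  # corrected sample variance
    return math.sqrt(s2) / math.sqrt(n)

# Example: eight measurements (illustrative values)
data = [2.1, 1.9, 2.4, 2.0, 2.2, 1.8, 2.3, 2.1]
print(sem(data))  # about 0.071

# Small-sample bias: for samples of size n from a population with
# sigma = 1, the average sample standard deviation falls short of 1,
# and the shortfall shrinks as n grows.
rng = random.Random(0)
trials = 20000
for n in (2, 6):
    total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        m = sum(xs) / n
        total += math.sqrt(sum((x - m) ** 2 for x in xs) / (n - 1))
    print(n, total / trials)  # mean of s: below 1, closer to 1 for n = 6
```

Note that quadrupling the sample size only halves the standard error, matching the <math>1/\sqrt{n}</math> scaling.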
If <math> x_1, x_2 , \ldots, x_n </math> is a sample of <math>n</math> independent observations from a population with mean <math>\mu</math> and standard deviation <math>\sigma</math>, then we can define the total
<math display="block">T = x_1 + x_2 + \cdots + x_n,</math>
which, by the [[Bienaymé's_identity|Bienaymé formula]], has variance
<math display="block">\operatorname{Var}(T) = \operatorname{Var}(x_1) + \operatorname{Var}(x_2) + \cdots + \operatorname{Var}(x_n) = n\sigma^2.</math>
The mean of these measurements (the sample mean) is given by
<math display="block">\bar{x} = T/n.</math>
The variance of the mean is then
<math display="block">\operatorname{Var}(\bar{x}) = \operatorname{Var}\left(\frac{T}{n}\right) = \frac{1}{n^2}\operatorname{Var}(T) = \frac{1}{n^2}n\sigma^2 = \frac{\sigma^2}{n},</math>
where the [[Variance#Addition and multiplication by a constant|scaling property of variance]] is used in the second equality. The standard error is, by definition, the standard deviation of <math>\bar{x}</math>, which is the square root of this variance:
<math display="block">\sigma_{\bar{x}} = \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}} .</math>
In other words, the larger the number of observations <math display="inline">n</math> per sample, the smaller this standard error, and the closer the calculated sample mean <math display="inline">\bar{x}</math> is expected to be to the population mean <math>\mu</math>. For correlated random variables, the sample variance needs to be computed according to the [[Markov chain central limit theorem]].

===Independent and identically distributed random variables with random sample size===
There are cases when a sample is taken without knowing, in advance, how many observations will be acceptable according to some criterion.
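The result <math>\operatorname{Var}(\bar{x}) = \sigma^2/n</math> can be checked numerically. A minimal Monte Carlo sketch, assuming normally distributed observations (the function name and parameter values are illustrative):

```python
import random

def var_of_sample_mean(n, sigma, trials=50000, seed=1):
    """Empirical variance of the sample mean over many repeated samples
    of n i.i.d. normal draws; should come out close to sigma**2 / n."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        means.append(sum(xs) / n)
    m = sum(means) / trials
    # Empirical variance of the collection of sample means
    return sum((v - m) ** 2 for v in means) / trials

print(var_of_sample_mean(10, 2.0))  # close to 2.0**2 / 10 = 0.4
```

The independence of the draws is what lets the variances add; for correlated draws the empirical variance would deviate from <math>\sigma^2/n</math>.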
In such cases, the sample size <math>N</math> is a random variable whose variation adds to the variation of <math>X</math>, so that
<math display="block">\operatorname{Var}(T) = \operatorname{E}(N)\operatorname{Var}(X) + \operatorname{Var}(N)\big(\operatorname{E}(X)\big)^2,</math><ref>{{cite book |last1=Cornell |first1=J R |last2=Benjamin |first2=C A |title=Probability, Statistics, and Decisions for Civil Engineers |publisher=McGraw-Hill |location=NY |year=1970 |isbn=0486796094 |pages=178–179 }}</ref> which follows from the [[law of total variance]]. If <math>N</math> has a ''[[Poisson distribution]]'', then <math>\operatorname{E}(N) = \operatorname{Var}(N)</math>, and the observed count <math>n</math> estimates both. The estimator of <math>\operatorname{Var}(T)</math> then becomes <math>nS^2_X + n\bar{X}^2</math>, leading to the following formula for the standard error:
<math display="block">\operatorname{Standard~Error}(\bar{X}) = \sqrt{\frac{S^2_X + \bar{X}^2}{n}}</math>
(since the standard deviation is the square root of the variance).
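The random-sample-size formula above can be sketched as a small Python helper. The function name and the data values are illustrative; the formula assumes the sample size arose from a Poisson process, as in the derivation:

```python
import math

def se_random_sample_size(sample):
    """Standard error of the mean when the sample size is itself
    Poisson-distributed: sqrt((S_X**2 + mean**2) / n), which adds the
    Var(N) contribution from the law of total variance."""
    n = len(sample)
    mean = sum(sample) / n
    # Bessel-corrected sample variance S_X**2
    s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return math.sqrt((s2 + mean ** 2) / n)

observations = [4.0, 7.0, 5.0, 6.0, 3.0, 5.0]  # illustrative data
print(se_random_sample_size(observations))  # about 2.12
```

For comparison, the fixed-<math>n</math> estimate <math>S_X/\sqrt{n}</math> on the same data would be smaller, since it omits the <math>\bar{X}^2</math> term contributed by the variability of <math>N</math>.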