
===In frequentist statistical inference===
Student's {{mvar|t}}&nbsp;distribution arises in a variety of statistical estimation problems where the goal is to estimate an unknown parameter, such as a mean value, in a setting where the data are observed with additive [[errors and residuals in statistics|errors]]. If (as in nearly all practical statistical work) the population [[standard deviation]] of these errors is unknown and has to be estimated from the data, the {{mvar|t}}&nbsp;distribution is often used to account for the extra uncertainty that results from this estimation. In most such problems, if the standard deviation of the errors were known, a normal distribution would be used instead of the {{mvar|t}}&nbsp;distribution.

[[Confidence interval]]s and [[hypothesis test]]s are two statistical procedures in which the [[quantile]]s of the sampling distribution of a particular statistic (e.g. the [[standard score]]) are required. In any situation where this statistic is a [[linear function]] of the [[data]], divided by the usual estimate of the standard deviation, the resulting quantity can be rescaled and centered to follow Student's {{mvar|t}}&nbsp;distribution. Statistical analyses involving means, weighted means, and regression coefficients all lead to statistics having this form.

Quite often, textbook problems will treat the population standard deviation as if it were known and thereby avoid the need to use the Student's {{mvar|t}}&nbsp;distribution. These problems are generally of two kinds: (1) those in which the sample size is so large that one may treat a data-based estimate of the [[variance]] as if it were certain, and (2) those that illustrate mathematical reasoning, in which the problem of estimating the standard deviation is temporarily ignored because that is not the point that the author or instructor is then explaining.
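Case (1) can be illustrated numerically. The following is a minimal Python sketch, not a definitive implementation: the function names, the grid over [−4, 4] and the chosen degrees of freedom are assumptions made for illustration. It measures how far the density of the {{mvar|t}}&nbsp;distribution lies from the standard normal density as the degrees of freedom grow.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    # Work on the log scale so that large df does not overflow the gamma function.
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def max_density_gap(df, grid_points=801, half_width=4.0):
    """Largest |t density - normal density| on an evenly spaced grid."""
    step = 2 * half_width / (grid_points - 1)
    xs = [-half_width + i * step for i in range(grid_points)]
    return max(abs(t_pdf(x, df) - normal_pdf(x)) for x in xs)

# Illustrative degrees of freedom (df = n - 1 for a single sample mean).
for df in (5, 30, 1000):
    print(df, round(max_density_gap(df), 5))
```

The gap shrinks as the degrees of freedom increase, which is why large-sample problems can use the normal distribution with little loss.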

====Hypothesis testing====
A number of statistics can be shown to have {{mvar|t}}&nbsp;distributions for samples of moderate size under [[null hypothesis|null hypotheses]] that are of interest, so that the {{mvar|t}}&nbsp;distribution forms the basis for significance tests. For example, the distribution of [[Spearman's rank correlation coefficient]] {{mvar|ρ}} in the null case (zero correlation) is well approximated by the {{mvar|t}}&nbsp;distribution for sample sizes above about 20.{{citation needed|date=November 2010}}
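The text above does not give the formula for this approximation. The form usually meant, assumed here, rescales the coefficient to <math>t = \rho \sqrt{(n-2)/(1-\rho^2)}</math> and refers it to a {{mvar|t}}&nbsp;distribution with {{math|''n'' − 2}} degrees of freedom. A minimal Python sketch under that assumption follows; the helper names and the data are hypothetical, and the rank formula used is valid only when there are no ties.

```python
import math

def ranks(values):
    """Ranks 1..n of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x, y):
    """Spearman's rank correlation via 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))

def spearman_t_statistic(rho, n):
    """Rescaled coefficient, compared against a t distribution with n - 2 df."""
    if abs(rho) == 1:
        return math.copysign(math.inf, rho)
    return rho * math.sqrt((n - 2) / (1 - rho * rho))

# Hypothetical data: 22 paired observations, neighbouring y values swapped.
x = list(range(1, 23))
y = [v + 1 if v % 2 else v - 1 for v in x]

rho = spearman_rho(x, y)
t_stat = spearman_t_statistic(rho, len(x))
print(rho, t_stat)
```

The resulting statistic would then be compared with a quantile of the {{mvar|t}}&nbsp;distribution with {{math|''n'' − 2}} degrees of freedom to carry out the significance test.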

====Confidence intervals====
Suppose the number ''A'' is so chosen that

:<math>\ \operatorname{\mathbb P}\left\{\ -A < T < A\ \right\} = 0.9\ ,</math>

when {{mvar|T}} has a {{mvar|t}}&nbsp;distribution with {{nobr|{{math|''n'' − 1}} &thinsp;}} degrees of freedom. By symmetry, this is the same as saying that {{mvar|A}} satisfies

:<math>\ \operatorname{\mathbb P}\left\{\ T < A\ \right\} = 0.95\ ,</math>

so {{mvar|A}} is the 95th percentile of this probability distribution, written <math>\ A = t_{0.05,n-1}\ </math> when the first subscript denotes the upper-tail probability. Then

:<math>\ \operatorname{\mathbb P}\left\{\ -A < \frac{\ \overline{X}_n - \mu\ }{ S_n/\sqrt{n\ } } < A\ \right\} = 0.9\ ,</math>

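where <math>\overline{X}_n</math> is the mean of {{mvar|n}} independent observations from a normal distribution with mean {{mvar|μ}}, and {{nobr|''S''{{sub|''n''}} }} is the sample standard deviation of the observed values; the ratio shown has a {{mvar|t}}&nbsp;distribution with {{math|''n'' − 1}} degrees of freedom. This is equivalent to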

:<math>\ \operatorname{\mathbb P}\left\{\ \overline{X}_n - A \frac{ S_n }{\ \sqrt{n\ }\ } < \mu < \overline{X}_n + A\ \frac{ S_n }{\ \sqrt{n\ }\ }\ \right\} = 0.9.</math>

Therefore, the interval whose endpoints are

:<math>\ \overline{X}_n\ \pm A\ \frac{ S_n }{\ \sqrt{n\ }\ }\ </math>

is a 90% [[confidence interval]] for {{mvar|μ}}. Consequently, given the mean of a set of observations that can reasonably be expected to have a normal distribution, the {{mvar|t}}&nbsp;distribution can be used to examine whether the confidence limits on that mean include some theoretically predicted value, such as the value predicted under a [[null hypothesis]].
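The construction above can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not a reference implementation: the quantile is found by numerically integrating the density and bisecting, as a stand-in for a statistical library routine, and the function names and data are hypothetical.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half = total * h / 3
    return 0.5 + half if x >= 0 else 0.5 - half

def t_quantile(p, df):
    """Value A with P(T <= A) = p, for p > 0.5, found by bisection."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:      # widen the bracket until it contains A
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def mean_confidence_interval(data, level=0.90):
    """Two-sided interval: mean +/- A * S_n / sqrt(n), with n - 1 df."""
    n = len(data)
    mean = statistics.fmean(data)
    s_n = statistics.stdev(data)              # sample standard deviation
    a = t_quantile(1 - (1 - level) / 2, n - 1)
    half_width = a * s_n / math.sqrt(n)
    return mean - half_width, mean + half_width

# Hypothetical measurements, assumed to be roughly normally distributed.
data = [4.9, 5.1, 5.0, 5.3, 4.8, 5.2, 5.0, 4.7, 5.1, 4.9]
low, high = mean_confidence_interval(data, level=0.90)
print(low, high)
```

For a 90% interval the quantile is taken at probability 0.95, matching the symmetry argument above. A theoretically predicted value of the mean can then be checked against the interval.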

It is this result that is used in the [[Student's t-test|Student's {{mvar|t}}&nbsp;test]]s: since the difference between the means of samples from two normal distributions is itself distributed normally, the {{mvar|t}}&nbsp;distribution can be used to examine whether that difference can reasonably be supposed to be zero.

If the data are normally distributed, the one-sided {{nobr|{{math|(1 − ''α'')}} upper}} confidence limit (UCL) of the mean can be calculated using the following equation:

:<math>\mathsf{UCL}_{1-\alpha} = \overline{X}_n + t_{\alpha,n-1}\ \frac{ S_n }{\ \sqrt{n\ }\ } ~.</math>
 
The resulting UCL is the greatest value of the mean that remains plausible at the given confidence level and sample size. In other words, with <math>\overline{X}_n</math> being the mean of the set of observations, the probability that the mean of the distribution is less than {{nobr|UCL{{sub|{{math|1 − ''α''}} }} }} is equal to the confidence {{nobr|level {{math|1 − ''α''}}.}}
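The coverage statement can be checked by simulation. The Python sketch below is illustrative only: the true mean, standard deviation, sample size, trial count and function names are assumptions, and the quantile 1.833 is the tabulated upper 5% point of the {{mvar|t}}&nbsp;distribution with 9 degrees of freedom, hard-coded rather than computed.

```python
import math
import random
import statistics

# Tabulated t_{0.05, 9}: upper 5% point for 9 degrees of freedom (n = 10).
T_UPPER_05_DF9 = 1.833

def upper_confidence_limit(sample, t_value):
    """One-sided limit: sample mean + t * S_n / sqrt(n)."""
    n = len(sample)
    return (statistics.fmean(sample)
            + t_value * statistics.stdev(sample) / math.sqrt(n))

def coverage(true_mean=10.0, true_sd=2.0, n=10, trials=10000, seed=1):
    """Fraction of simulated normal samples whose UCL exceeds the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        if true_mean < upper_confidence_limit(sample, T_UPPER_05_DF9):
            hits += 1
    return hits / trials

print(coverage())
```

With {{math|''α'' {{=}} 0.05}}, the printed fraction should be close to 0.95, up to simulation error.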

====Prediction intervals====
The {{mvar|t}}&nbsp;distribution can be used to construct a [[prediction interval]] for an unobserved sample from a normal distribution with unknown mean and variance.
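
For reference, a standard form of such an interval can be written in the notation of the preceding subsection. Here <math>X_{n+1}</math> is a symbol introduced for illustration: it denotes a further independent observation from the same normal distribution. The interval is

:<math>\ \overline{X}_n\ \pm\ t_{\alpha/2,n-1}\ S_n \sqrt{1 + \frac{1}{n}\ } ~,</math>

which contains <math>X_{n+1}</math> with probability {{math|1 − ''α''}}, because

:<math>\ \frac{ X_{n+1} - \overline{X}_n }{\ S_n \sqrt{1 + 1/n\ }\ }\ </math>

has a {{mvar|t}}&nbsp;distribution with {{math|''n'' − 1}} degrees of freedom. The factor <math>\sqrt{1 + 1/n\ }</math>, in place of the <math>1/\sqrt{n\ }</math> used for the mean, reflects the variability of the new observation itself in addition to the uncertainty in the estimated mean.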