==Calculations==
Explicit expressions that can be used to carry out various ''t''-tests are given below. In each case, the formula for a test statistic that either exactly follows or closely approximates a ''t''-distribution under the null hypothesis is given. Also, the appropriate [[degrees of freedom (statistics)|degrees of freedom]] are given in each case. Each of these statistics can be used to carry out either a [[One-tailed test|one-tailed or two-tailed test]].

Once the ''t'' value and degrees of freedom are determined, a [[p-value|''p''-value]] can be found using a [[Student's t-distribution#Table of selected values|table of values from Student's ''t''-distribution]]. If the calculated ''p''-value is below the threshold chosen for [[statistical significance]] (usually at the 0.10, 0.05, or 0.01 level), then the null hypothesis is rejected in favor of the alternative hypothesis.

===Slope of a regression line===
Suppose one is fitting the model
: <math> Y = \alpha + \beta x + \varepsilon, </math>
where {{math|''x''}} is known, {{math|''α''}} and {{math|''β''}} are unknown, {{math|''ε''}} is a normally distributed random variable with mean 0 and unknown variance {{math|''σ''<sup>2</sup>}}, and {{math|''Y''}} is the outcome of interest. We want to test the null hypothesis that the slope {{math|''β''}} is equal to some specified value {{math|''β''<sub>0</sub>}} (often taken to be 0, in which case the null hypothesis is that {{math|''x''}} and {{math|''y''}} are uncorrelated).

Let
: <math> \begin{align} \hat\alpha, \hat\beta &= \text{least-squares estimators}, \\ SE_{\hat\alpha}, SE_{\hat\beta} &= \text{the standard errors of least-squares estimators}. \end{align} </math>

Then
: <math> t_\text{score} = \frac{\hat\beta - \beta_0}{ SE_{\hat\beta} } \sim \mathcal{T}_{n-2} </math>
has a ''t''-distribution with {{math|''n'' − 2}} degrees of freedom if the null hypothesis is true. The [[Simple linear regression#Normality assumption|standard error of the slope coefficient]]
: <math> SE_{\hat\beta} = \frac{\sqrt{\displaystyle \frac{1}{n - 2}\sum_{i=1}^n (y_i - \hat y_i)^2}}{\sqrt{\displaystyle \sum_{i=1}^n (x_i - \bar{x})^2}} </math>
can be written in terms of the residuals. Let
: <math> \begin{align} \hat\varepsilon_i &= y_i - \hat y_i = y_i - (\hat\alpha + \hat\beta x_i) = \text{residuals} = \text{estimated errors}, \\ \text{SSR} &= \sum_{i=1}^n {\hat\varepsilon_i}^2 = \text{sum of squares of residuals}. \end{align} </math>

Then {{math|''t''}}<sub>score</sub> is given by
: <math> t_\text{score} = \frac{(\hat\beta - \beta_0) \sqrt{n-2}}{\sqrt{\frac{SSR}{\sum_{i=1}^n (x_i - \bar{x})^2}}}. </math>

Another way to determine the {{math|''t''}}<sub>score</sub> is
: <math> t_\text{score} = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}, </math>
where ''r'' is the [[Pearson correlation coefficient]].

The {{math|''t''}}<sub>score, intercept</sub> can be determined from the {{math|''t''}}<sub>score, slope</sub>:
: <math> t_\text{score,intercept} = \frac{\alpha}{\beta} \frac{t_\text{score,slope}}{\sqrt{s_\text{x}^2 + \bar{x}^2}}, </math>
where {{math|''s''<sub>x</sub><sup>2</sup>}} is the sample variance.
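The following sketch is illustrative only (it assumes NumPy and SciPy are available; the data and variable names are hypothetical). It computes the slope ''t''-score, its two-sided ''p''-value, and the equivalent correlation-based form:

<syntaxhighlight lang="python">
import numpy as np
from scipy import stats

# Hypothetical example data; x is known, y is the observed outcome.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 3.7, 4.2, 5.1, 6.3])
n = len(x)

# Least-squares estimates of the intercept (alpha-hat) and slope (beta-hat).
beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
alpha_hat = y.mean() - beta_hat * x.mean()

# Residuals, sum of squared residuals, and standard error of the slope.
residuals = y - (alpha_hat + beta_hat * x)
ssr = np.sum(residuals ** 2)
se_beta = np.sqrt(ssr / (n - 2)) / np.sqrt(np.sum((x - x.mean()) ** 2))

# t-score for H0: beta = beta0 (here beta0 = 0), with n - 2 degrees of freedom.
beta0 = 0.0
t_score = (beta_hat - beta0) / se_beta
p_value = 2 * stats.t.sf(abs(t_score), df=n - 2)

# Equivalent form via the Pearson correlation coefficient r (valid for beta0 = 0).
r = np.corrcoef(x, y)[0, 1]
t_from_r = r * np.sqrt(n - 2) / np.sqrt(1 - r ** 2)
</syntaxhighlight>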
===Independent two-sample ''t''-test===

====Equal sample sizes and variance====
Given two groups (1, 2), this test is only applicable when:
* the two sample sizes are equal,
* it can be assumed that the two distributions have the same variance.
Violations of these assumptions are discussed below.

The {{math|''t''}} statistic to test whether the means are different can be calculated as follows:
: <math> t = \frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt\frac{2}{n}}, </math>
where
: <math> s_p = \sqrt{\frac{s_{X_1}^2 + s_{X_2}^2}{2}}.</math>
Here {{math|''s<sub>p</sub>''}} is the [[pooled standard deviation]] for {{math|1=''n'' = ''n''<sub>1</sub> = ''n''<sub>2</sub>}}, and {{math|''s''{{su|b=''X''<sub>1</sub>|p= 2}}}} and {{math|''s''{{su|b=''X''<sub>2</sub>|p= 2}}}} are the [[unbiased estimator]]s of the population variance. The denominator of {{math|''t''}} is the [[Standard error (statistics)|standard error]] of the difference between the two means. For significance testing, the [[Degrees of freedom (statistics)|degrees of freedom]] for this test is {{math|2''n'' − 2}}, where {{math|''n''}} is the sample size.

====Equal or unequal sample sizes, similar variances ({{sfrac|1|2}} < {{sfrac|''s''<sub>''X''<sub>1</sub></sub>|''s''<sub>''X''<sub>2</sub></sub>}} < 2)====
This test is used only when it can be assumed that the two distributions have the same variance (when this assumption is violated, see below). The previous formulae are a special case of the formulae below; one recovers them when both samples are equal in size: {{math|1=''n'' = ''n''<sub>1</sub> = ''n''<sub>2</sub>}}.

The {{math|''t''}} statistic to test whether the means are different can be calculated as follows:
: <math>t = \frac{\bar{X}_1 - \bar{X}_2}{s_p \cdot \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},</math>
where
: <math> s_p = \sqrt{\frac{(n_1 - 1)s_{X_1}^2 + (n_2 - 1)s_{X_2}^2}{n_1 + n_2-2}}</math>
is the [[pooled standard deviation]] of the two samples: it is defined in this way so that its square is an [[unbiased estimator]] of the common variance, whether or not the population means are the same. In these formulae, {{math|''n<sub>i</sub>'' − 1}} is the number of degrees of freedom for each group, and the total sample size minus two (that is, {{math|''n''<sub>1</sub> + ''n''<sub>2</sub> − 2}}) is the total number of degrees of freedom, which is used in significance testing.

The [[Minimum Detectable Effect|minimum detectable effect]] (MDE) is<ref>[https://webspace.ship.edu/pgmarr/Geo441/Examples/Minimum%20Detectable%20Difference.pdf Minimum Detectable Difference for Two-Sample t-Test for Means. Equation and example adapted from Zar, 1984]</ref>
: <math>\delta \ge \sqrt{\frac{2S_p^2}{n}}(t_{1-\alpha, \nu} + t_{1-\beta, \nu}).</math>

==== Equal or unequal sample sizes, unequal variances (''s''<sub>''X''<sub>1</sub></sub> > 2''s''<sub>''X''<sub>2</sub></sub> or ''s''<sub>''X''<sub>2</sub></sub> > 2''s''<sub>''X''<sub>1</sub></sub>) ====
{{main|Welch's t test{{!}}Welch's ''t''-test}}
This test, also known as Welch's ''t''-test, is used only when the two population variances are not assumed to be equal (the two sample sizes may or may not be equal) and hence must be estimated separately. The {{math|''t''}} statistic to test whether the population means are different is calculated as
: <math>t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar\Delta}},</math>
where
: <math>s_{\bar\Delta} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}.</math>
Here {{math|''s<sub>i</sub>''<sup>2</sup>}} is the [[unbiased estimator]] of the [[variance]] of each of the two samples with {{math|''n<sub>i</sub>''}} = number of participants in group {{math|''i''}} ({{math|''i''}} = 1 or 2). In this case <math>(s_{\bar\Delta})^2</math> is not a pooled variance.

For use in significance testing, the distribution of the test statistic is approximated as an ordinary Student's ''t''-distribution with the degrees of freedom calculated using
: <math> \text{d.f.} = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}. </math>
This is known as the [[Welch–Satterthwaite equation]]. The true distribution of the test statistic actually depends (slightly) on the two unknown population variances (see [[Behrens–Fisher problem]]).
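As a numerical illustration of the pooled and Welch forms of the two-sample test above, the following sketch (assuming NumPy and SciPy are available; the data are hypothetical) computes both statistics, the Welch–Satterthwaite degrees of freedom, and the corresponding two-sided ''p''-values:

<syntaxhighlight lang="python">
import numpy as np
from scipy import stats

# Hypothetical example data for two independent groups.
x1 = np.array([19.8, 20.4, 19.6, 17.8, 18.5, 18.9, 18.3, 18.9, 19.5, 22.0])
x2 = np.array([28.2, 26.6, 20.1, 23.3, 25.2, 22.1, 17.7, 27.6, 20.6, 13.7])
n1, n2 = len(x1), len(x2)
v1, v2 = x1.var(ddof=1), x2.var(ddof=1)   # unbiased sample variances

# Pooled (Student) t-test: assumes equal population variances.
sp = np.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
t_pooled = (x1.mean() - x2.mean()) / (sp * np.sqrt(1 / n1 + 1 / n2))
df_pooled = n1 + n2 - 2
p_pooled = 2 * stats.t.sf(abs(t_pooled), df=df_pooled)

# Welch's t-test: no equal-variance assumption; Welch-Satterthwaite degrees of freedom.
se_delta = np.sqrt(v1 / n1 + v2 / n2)
t_welch = (x1.mean() - x2.mean()) / se_delta
df_welch = (v1 / n1 + v2 / n2) ** 2 / (
    (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
)
p_welch = 2 * stats.t.sf(abs(t_welch), df=df_welch)

# The same results are returned by SciPy's built-in routine:
#   stats.ttest_ind(x1, x2, equal_var=True)   -> pooled test
#   stats.ttest_ind(x1, x2, equal_var=False)  -> Welch's test
</syntaxhighlight>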
===Exact method for unequal variances and sample sizes===
The test<ref>{{cite arXiv | eprint=2210.16473 | last1=Wang | first1=Chang | last2=Jia | first2=Jinzhu | title=Te Test: A New Non-asymptotic T-test for Behrens-Fisher Problems | year=2022 | class=math.ST }}</ref> deals with the famous [[Behrens–Fisher problem]], i.e., comparing the difference between the means of two normally distributed populations when the variances of the two populations are not assumed to be equal, based on two independent samples. The test is developed as an [[exact test]] that allows for '''unequal sample sizes''' and '''unequal variances''' of two populations. The exact property still holds even with '''extremely small''' and '''unbalanced sample sizes''' (e.g. <math>\ m \equiv n_\mathsf{X} = 50\ </math> vs. <math>\ n \equiv n_\mathsf{Y} = 5\ </math>).

The statistic to test whether the means are different can be calculated as follows:

Let <math>\ X = \left[\ X_1, X_2, \ldots, X_m\ \right]^\top\ </math> and <math>\ Y = \left[\ Y_1, Y_2, \ldots, Y_n\ \right]^\top\ </math> be the i.i.d. sample vectors (for <math>\ m \ge n\ </math>) from <math>\ \mathsf{Norm}\left(\ \mu_\mathsf{X},\ \sigma_\mathsf{X}^2\ \right)\ </math> and <math>\ \mathsf{Norm}\left(\ \mu_\mathsf{Y},\ \sigma_\mathsf{Y}^2\ \right)\ </math> respectively.

Let <math>\ (P^\top)_{n\times n}\ </math> be an <math>n\times n</math> orthogonal matrix whose elements of the first row are all <math>\ \tfrac{ 1 }{ \sqrt{ n\ } } ~.</math> Similarly, let <math>\ (Q^\top)_{n\times m}\ </math> be the first <math>\ n\ </math> rows of an <math>\ m\times m\ </math> orthogonal matrix (whose elements of the first row are all <math>\ \tfrac{ 1 }{ \sqrt{ m\ } }\ </math>).
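For concreteness, one way to construct such orthogonal matrices numerically is sketched below (illustrative only; it uses NumPy's QR factorization, and the function name is hypothetical; other constructions, e.g. Helmert matrices, work equally well):

<syntaxhighlight lang="python">
import numpy as np

def orthogonal_with_constant_first_row(k):
    """Return a k-by-k orthogonal matrix whose first row is all 1/sqrt(k).

    Sketch only: QR-factorise a full-rank matrix whose first column is the
    all-ones vector; the transpose of the Q factor then has the required
    constant first row (up to an overall sign, fixed below).
    """
    a = np.eye(k)
    a[:, 0] = 1.0                # first column = all-ones vector
    q, _ = np.linalg.qr(a)       # columns of q are orthonormal
    q = q.T                      # now the rows are orthonormal
    if q[0, 0] < 0:              # make the first row +1/sqrt(k), not -1/sqrt(k)
        q = -q
    return q

# P^T is n-by-n; Q^T is the first n rows of an m-by-m orthogonal matrix (m >= n):
#   P_T = orthogonal_with_constant_first_row(n)
#   Q_T = orthogonal_with_constant_first_row(m)[:n, :]
</syntaxhighlight>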
Then
: <math>\ Z \equiv \frac{\ \left( Q^\top \right)_{n\times m}\ X\ }{ \sqrt{ m\ } }\ -\ \frac{\ \left( P^\top \right)_{n\times n}\ Y\ }{ \sqrt{ n\ } }\ </math>
is an {{mvar|n}}-dimensional normal random vector:
: <math> Z ~\sim~ \mathsf{Norm}\left(\ \left[\ \mu_\mathsf{X} - \mu_\mathsf{Y},\ 0,\ 0,\ \ldots,\ 0\ \right]^\top\ ,\ \left( \frac{\ \sigma_\mathsf{X}^2\ }{ m } + \frac{\ \sigma_\mathsf{Y}^2\ }{ n }\right)\ I_n\ \right) ~.</math>

From the above distribution we see that the first element of the vector {{mvar|Z}} is
: <math> Z_1 = \bar X - \bar Y = \frac{ 1 }{\ m\ } \sum_{i=1}^m\ X_i - \frac{ 1 }{\ n\ } \sum_{j=1}^n\ Y_j\ ,</math>
hence the first element is distributed as
: <math> Z_1 - \left( \mu_\mathsf{X} - \mu_\mathsf{Y} \right) ~\sim~ \mathsf{Norm}\left(\ 0,\ \frac{\ \sigma_\mathsf{X}^2\ }{ m } + \frac{\ \sigma_\mathsf{Y}^2\ }{ n }\ \right)\ ,</math>
and the squares of the remaining elements of {{mvar|Z}} are [[chi-square distribution|chi-squared]] distributed:
: <math> \frac{\ \sum_{i=2}^n Z^2_i\ }{\ n - 1\ } ~\sim~ \frac{\ \chi^2_{n - 1}\ }{\ n - 1\ } \times\left( \frac{\ \sigma_\mathsf{X}^2\ }{ m }+\frac{\ \sigma_\mathsf{Y}^2\ }{ n } \right) ~.</math>
By construction of the orthogonal matrices {{mvar|P}} and {{mvar|Q}} we have
: <math> Z_1 - \left( \mu_\mathsf{X} - \mu_\mathsf{Y} \right) \quad \perp \quad \sum_{i=2}^n Z^2_i\ ,</math>
so {{mvar|Z}}{{sub|1}}, the first element of {{mvar|Z}}, is statistically independent of the remaining elements by orthogonality. Finally, take for the test statistic
: <math> T_\mathsf{e} ~\equiv~ \frac{\ Z_1 - \left( \mu_\mathsf{X} - \mu_\mathsf{Y} \right)\ }{\ \sqrt{ \left( \sum_{i=2}^{n} Z^2_i \right) /\left( n - 1 \right)\ }\ } ~\sim~ t_{n - 1} ~.</math>

===Dependent ''t''-test for paired samples===
This test is used when the samples are dependent; that is, when there is only one sample that has been tested twice (repeated measures) or when there are two samples that have been matched or "paired". This is an example of a [[paired difference test]]. The ''t'' statistic is calculated as
: <math>t = \frac{\bar{X}_D - \mu_0}{s_D/\sqrt n}, </math>
where <math>\bar{X}_D</math> and <math>s_D</math> are the average and standard deviation of the differences between all pairs. The pairs are e.g. either one person's pre-test and post-test scores or between-pairs of persons matched into meaningful groups (for instance, drawn from the same family or age group: see table). The constant {{math|''μ''<sub>0</sub>}} is zero if we want to test whether the average of the difference is significantly different from zero. The number of degrees of freedom used is {{math|''n'' − 1}}, where {{math|''n''}} represents the number of pairs.

:{|
|- style="vertical-align:bottom"
|style="padding-right:2em"|
{| class="wikitable"
|+ Example of matched pairs
|-
! Pair !! Name !! Age !! Test
|-
| 1 || John || 35 || 250
|-
| 1 || Jane || 36 || 340
|-
| 2 || Jimmy || 22 || 460
|-
| 2 || Jessy || 21 || 200
|}
|
{| align="right" class="wikitable"
|+ Example of repeated measures
|-
! Number !! Name !! Test 1 !! Test 2
|-
| 1 || Mike || 35% || 67%
|-
| 2 || Melanie || 50% || 46%
|-
| 3 || Melissa || 90% || 86%
|-
| 4 || Mitchell || 78% || 91%
|}
|}
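Applied to the repeated-measures example in the table above, a minimal sketch of the paired test (assuming NumPy and SciPy are available) is:

<syntaxhighlight lang="python">
import numpy as np
from scipy import stats

# Repeated-measures scores from the table above (Test 1 and Test 2, in percent).
test1 = np.array([35.0, 50.0, 90.0, 78.0])
test2 = np.array([67.0, 46.0, 86.0, 91.0])

# Differences between pairs; mu0 = 0 tests whether the mean difference is zero.
d = test2 - test1
n = len(d)
mu0 = 0.0
t_stat = (d.mean() - mu0) / (d.std(ddof=1) / np.sqrt(n))
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 1)   # n - 1 degrees of freedom

# Equivalently, SciPy's built-in routine: stats.ttest_rel(test2, test1)
</syntaxhighlight>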