==Properties==

===Moments===
For <math>\nu > 1\ ,</math> the [[raw moment]]s of the {{mvar|t}} distribution are
:<math>\operatorname{\mathbb E}\left\{\ T^k\ \right\} = \begin{cases} \quad 0 & k \text{ odd }, \quad 0 < k < \nu\ , \\ {} \\ \frac{1}{\ \sqrt{\pi\ }\ \Gamma\left(\frac{\ \nu\ }{ 2 }\right)}\ \left[\ \Gamma\!\left(\frac{\ k + 1\ }{ 2 }\right)\ \Gamma\!\left(\frac{\ \nu - k\ }{ 2 }\right)\ \nu^{\frac{\ k\ }{ 2 }}\ \right] & k \text{ even }, \quad 0 < k < \nu ~.\\ \end{cases}</math>
Moments of order <math>\ \nu\ </math> or higher do not exist.<ref>{{cite book |vauthors=Casella G, Berger RL |year=1990 |title=Statistical Inference |publisher=Duxbury Resource Center |isbn=9780534119584 |page=56}}</ref>

The term for <math>\ 0 < k < \nu\ ,</math> {{mvar|k}} even, may be simplified using the properties of the [[gamma function]] to
:<math>\operatorname{\mathbb E}\left\{\ T^k\ \right\} = \nu^{ \frac{\ k\ }{ 2 } }\ \prod_{j=1}^{k/2}\ \frac{~ 2j - 1 ~}{ \nu - 2j } \qquad k \text{ even}, \quad 0 < k < \nu ~.</math>

For a {{mvar|t}} distribution with <math>\ \nu\ </math> degrees of freedom, the [[expected value]] is <math>\ 0\ </math> if <math>\ \nu > 1\ ,</math> and its [[variance]] is <math>\ \frac{ \nu }{\ \nu-2\ }\ </math> if <math>\ \nu > 2 ~.</math> The [[skewness]] is 0 if <math>\ \nu > 3\ </math> and the [[excess kurtosis]] is <math>\ \frac{ 6 }{\ \nu - 4\ }\ </math> if <math>\ \nu > 4 ~.</math>

===How the {{mvar|t}} distribution arises (characterization) {{anchor|Characterization}}===

====As the distribution of a test statistic====
Student's ''t''-distribution with <math>\nu</math> degrees of freedom can be defined as the distribution of the [[random variable]] ''T'' with<ref name="JKB">{{Cite book |title=Continuous Univariate Distributions |vauthors=Johnson NL, Kotz S, Balakrishnan N |publisher=Wiley |year=1995 |isbn=9780471584940 |edition=2nd |volume=2 |chapter=Chapter 28}}</ref><ref name="Hogg">{{cite book |title=Introduction to Mathematical Statistics |vauthors=Hogg RV, Craig AT |publisher=Macmillan |year=1978 |edition=4th |location=New York |asin=B010WFO0SA |postscript=. Sections 4.4 and 4.8 |author-link=Robert V. Hogg}}</ref>
:<math> T=\frac{Z}{\sqrt{V/\nu}} = Z \sqrt{\frac{\nu}{V}},</math>
where
* ''Z'' is a standard normal with [[expected value]] 0 and variance 1;
* ''V'' has a [[chi-squared distribution]] ({{nowrap|1=<span style="font-family:serif">''χ''</span><sup>2</sup>-distribution}}) with <math>\nu</math> [[Degrees of freedom (statistics)|degrees of freedom]];
* ''Z'' and ''V'' are [[statistical independence|independent]].

A different distribution is that of the random variable defined, for a given constant ''μ'', by
:<math>(Z+\mu)\sqrt{\frac{\nu}{V}}.</math>
This random variable has a [[noncentral t-distribution|noncentral ''t''-distribution]] with [[noncentrality parameter]] ''μ''. This distribution is important in studies of the [[statistical power|power]] of Student's ''t''-test.
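The defining construction of ''T'' above is easy to check numerically. The following is a minimal simulation sketch, not part of the article's derivation; it assumes NumPy and SciPy are available, and the values of <code>nu</code> and <code>n</code> are arbitrary illustrative choices. It draws ''T'' = ''Z''/√(''V''/''ν'') directly and compares the result against the {{mvar|t}} distribution and the moments given above:

<syntaxhighlight lang="python">
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
nu = 10          # degrees of freedom (illustrative choice)
n = 100_000      # number of simulated draws

# T = Z / sqrt(V / nu), with Z standard normal and V chi-squared(nu), independent
z = rng.standard_normal(n)
v = rng.chisquare(nu, size=n)
t_sim = z / np.sqrt(v / nu)

# Kolmogorov-Smirnov test against Student's t with nu degrees of freedom:
# a large p-value means the simulated draws are consistent with that distribution
print(stats.kstest(t_sim, stats.t(df=nu).cdf))

# Sample variance and excess kurtosis should approach nu/(nu-2) = 1.25
# and 6/(nu-4) = 1 for nu = 10 (both moments exist since nu > 4)
print(t_sim.var(), stats.kurtosis(t_sim))
</syntaxhighlight>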
=====Derivation=====
Suppose ''X''<sub>1</sub>, ..., ''X''<sub>''n''</sub> are [[statistical independence|independent]] realizations of the normally distributed random variable ''X'', which has expected value ''μ'' and [[variance]] ''σ''<sup>2</sup>. Let
:<math>\overline{X}_n = \frac{1}{n}(X_1+\cdots+X_n)</math>
be the sample mean, and
:<math>s^2 = \frac{1}{n-1} \sum_{i=1}^n \left(X_i - \overline{X}_n\right)^2</math>
be an unbiased estimate of the variance from the sample. It can be shown that the random variable
:<math>V = (n-1)\frac{s^2}{\sigma^2}</math>
has a chi-squared distribution with <math>\nu = n - 1</math> degrees of freedom (by [[Cochran's theorem]]).<ref>{{cite journal |authorlink1=William Gemmell Cochran |last1=Cochran |first1=W. G. |date=1934 |title=The distribution of quadratic forms in a normal system, with applications to the analysis of covariance |journal=[[Mathematical Proceedings of the Cambridge Philosophical Society]] |volume=30 |issue=2 |pages=178–191 |bibcode=1934PCPS...30..178C |doi=10.1017/S0305004100016595 |s2cid=122547084}}</ref> It is readily shown that the quantity
:<math>Z = \left(\overline{X}_n - \mu\right) \frac{\sqrt{n}}{\sigma}</math>
is normally distributed with mean 0 and variance 1, since the sample mean <math>\overline{X}_n</math> is normally distributed with mean ''μ'' and variance ''σ''<sup>2</sup>/''n''. Moreover, it is possible to show that these two random variables (the normally distributed one ''Z'' and the chi-squared-distributed one ''V'') are independent. Consequently ''Z'' and ''V'' satisfy exactly the conditions in the definition above, so the [[pivotal quantity]]
:<math display="inline">T \equiv \frac{Z}{\sqrt{V/\nu}} = \left(\overline{X}_n - \mu\right) \frac{\sqrt{n}}{s},</math>
which differs from ''Z'' in that the exact standard deviation ''σ'' is replaced by the sample standard error ''s'', has a Student's ''t''-distribution as defined above. Notice that the unknown population variance ''σ''<sup>2</sup> does not appear in ''T'': it enters both the numerator and the denominator, so it cancels. Gosset intuitively obtained the probability density function stated above, with <math>\nu</math> equal to ''n'' − 1, and Fisher proved it in 1925.<ref name="Fisher 1925 90–104"/>

The distribution of the test statistic ''T'' depends on <math>\nu</math>, but not on ''μ'' or ''σ''; this lack of dependence on ''μ'' and ''σ'' is what makes the ''t''-distribution important in both theory and practice.

====Sampling distribution of t-statistic====
The {{mvar|t}} distribution arises as the sampling distribution of the {{mvar|t}} statistic. Below, the one-sample {{mvar|t}} statistic is discussed; for the corresponding two-sample {{mvar|t}} statistic, see [[Student's t-test]].

=====Unbiased variance estimate=====
Let <math>\ x_1, \ldots, x_n \sim {\mathcal N}(\mu, \sigma^2)\ </math> be independent and identically distributed samples from a normal distribution with mean <math>\mu</math> and variance <math>\ \sigma^2 ~.</math> The sample mean and unbiased [[sample variance]] are given by
:<math>\begin{align}
  \bar{x} &= \frac{\ x_1+\cdots+x_n\ }{ n }\ , \\[5pt]
  s^2 &= \frac{ 1 }{\ n-1\ }\ \sum_{i=1}^n (x_i - \bar{x})^2 ~.
\end{align}</math>
The resulting (one-sample) {{mvar|t}} statistic is given by
:<math> t = \frac{\bar{x} - \mu}{\ s / \sqrt{n\ }\ } \sim t_{n - 1}\ ,</math>
and is distributed according to a Student's {{mvar|t}} distribution with <math>\ n - 1\ </math> degrees of freedom. Thus for inference purposes the {{mvar|t}} statistic is a useful "[[pivotal quantity]]" when the mean and variance <math>(\mu, \sigma^2)</math> are unknown population parameters, in the sense that the {{mvar|t}} statistic then has a probability distribution that depends on neither <math>\mu</math> nor <math>\ \sigma^2 ~.</math>
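As a quick numerical illustration of this pivotality, the sketch below (again assuming NumPy and SciPy; the population parameters <code>mu</code> and <code>sigma</code> are arbitrary choices) simulates many one-sample {{mvar|t}} statistics and compares them with <math>t_{n-1}</math>:

<syntaxhighlight lang="python">
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
mu, sigma, n = 2.0, 3.0, 8       # arbitrary population parameters and sample size
reps = 50_000                    # number of simulated samples

x = rng.normal(mu, sigma, size=(reps, n))
xbar = x.mean(axis=1)
s = x.std(axis=1, ddof=1)        # ddof=1 gives the unbiased variance estimate
t_stat = (xbar - mu) / (s / np.sqrt(n))

# The statistic should follow Student's t with n - 1 degrees of freedom,
# no matter which mu and sigma generated the data
print(stats.kstest(t_stat, stats.t(df=n - 1).cdf))
</syntaxhighlight>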
=====ML variance estimate=====
Instead of the unbiased estimate <math>\ s^2\ </math> we may also use the maximum likelihood estimate
:<math>\ s^2_\mathsf{ML} = \frac{\ 1\ }{ n }\ \sum_{i=1}^n (x_i - \bar{x})^2\ </math>
yielding the statistic
:<math>\ t_\mathsf{ML} = \frac{\bar{x} - \mu}{\sqrt{s^2_\mathsf{ML}/n\ }} = \sqrt{\frac{n}{n-1}\ }\ t ~.</math>
This is distributed according to the location-scale {{mvar|t}} distribution:
:<math> t_\mathsf{ML} \sim \operatorname{\ell st}(0,\ \tau^2=n/(n-1),\ n-1) ~.</math>

====Compound distribution of normal with inverse gamma distribution====
The location-scale {{mvar|t}} distribution results from [[compound distribution|compounding]] a [[Normal distribution|Gaussian distribution]] (normal distribution) with [[mean]] <math>\ \mu\ </math> and unknown [[variance]], with an [[inverse gamma distribution]] placed over the variance with parameters <math>\ a = \frac{\ \nu\ }{ 2 }\ </math> and <math>b = \frac{\ \nu\ \tau^2\ }{ 2 } ~.</math> In other words, the [[random variable]] ''X'' is assumed to have a Gaussian distribution with an unknown variance distributed as inverse gamma, and then the variance is [[marginalized out]] (integrated out).

Equivalently, this distribution results from compounding a Gaussian distribution with a [[scaled-inverse-chi-squared distribution]] with parameters <math>\nu</math> and <math>\ \tau^2 ~.</math> The scaled-inverse-chi-squared distribution is exactly the same distribution as the inverse gamma distribution, but with a different parameterization, i.e. <math>\ \nu = 2\ a, \; {\tau}^2 = \frac{\ b\ }{ a } ~.</math>

This characterization is useful because in [[Bayesian statistics]] the inverse gamma distribution is the [[conjugate prior]] distribution of the variance of a Gaussian distribution. As a result, the location-scale {{mvar|t}} distribution arises naturally in many Bayesian inference problems.<ref>{{Cite book |title=Bayesian Data Analysis |vauthors=Gelman AB, Carlin JS, Rubin DB, Stern HS |publisher=Chapman & Hall |year=1997 |isbn=9780412039911 |edition=2nd |location=Boca Raton, FL |pages=68}}</ref>
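The compounding step can be verified by direct simulation. The sketch below (assuming NumPy and SciPy; the parameter values are arbitrary) draws a variance from the inverse gamma distribution with the parameters given above, then draws ''X'' from the resulting Gaussian, and compares the marginal draws with the location-scale {{mvar|t}} distribution:

<syntaxhighlight lang="python">
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
mu, tau2, nu = 0.0, 2.0, 7.0     # arbitrary location, squared scale, and dof
n = 100_000

# sigma^2 ~ InvGamma(a = nu/2, b = nu*tau2/2), then X | sigma^2 ~ N(mu, sigma^2)
var = stats.invgamma(a=nu / 2, scale=nu * tau2 / 2).rvs(n, random_state=rng)
x = rng.normal(mu, np.sqrt(var))

# Marginally, X should follow the location-scale t distribution lst(mu, tau2, nu)
lst = stats.t(df=nu, loc=mu, scale=np.sqrt(tau2))
print(stats.kstest(x, lst.cdf))
</syntaxhighlight>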
====Maximum entropy distribution====
Student's {{mvar|t}} distribution is the [[maximum entropy probability distribution]] for a random variate ''X'' having a certain value of <math>\ \operatorname{\mathbb E}\left\{\ \ln(\nu+X^2)\ \right\}\ </math>.<ref>{{cite journal |vauthors=Park SY, Bera AK |date=2009 |title=Maximum entropy autoregressive conditional heteroskedasticity model |journal=[[Journal of Econometrics]] |volume=150 |issue=2 |pages=219–230 |doi=10.1016/j.jeconom.2008.12.014}}</ref>{{Clarify|reason=It is not clear what is meant by "fixed" in this context. An older and more to-the-point source ( https://link.springer.com/content/pdf/10.1007/BF02481032.pdf ) demonstrates that the Student's t distribution with {{mvar|ν}} d.o.f. is the maximum entropy solution to a specific problem, for which, in addition to one more constraint, ℰ{ ln( 1 + X²/ν)} equals some constant which is predetermined for every {{mvar|ν}}.|date=December 2020}}{{Better source needed|date=December 2020|reason=The source does not obviously state this, although it touches upon something related.}} This follows immediately from the observation that the pdf can be written in [[exponential family]] form with <math>\ln(\nu+X^2)</math> as sufficient statistic.

===Integral of Student's probability density function and {{mvar|p}}-value===
The function {{nobr|{{math|''A''(''t'' {{!}} ''ν'')}} }} is the integral of Student's probability density function, {{math|''f''(''t'')}}, between {{mvar|-t}} and {{mvar|t}}, for {{nobr|{{math| ''t'' ≥ 0 }} .}} It thus gives the probability that a value of ''t'' less than that calculated from observed data would occur by chance. Therefore, the function {{nobr|{{math|''A''(''t'' {{!}} ''ν'')}} }} can be used when testing whether the difference between the means of two sets of data is statistically significant, by calculating the corresponding value of {{mvar|t}} and the probability of its occurrence if the two sets of data were drawn from the same population. This is used in a variety of situations, particularly in [[t test|{{mvar|t}} tests]]. For the statistic {{mvar|t}}, with {{mvar|ν}} degrees of freedom, {{nobr|{{math|''A''(''t'' {{!}} ''ν'')}} }} is the probability that {{mvar|t}} would be less than the observed value if the two means were the same (provided that the smaller mean is subtracted from the larger, so that {{nobr|{{math| ''t'' ≥ 0}} ).}} It can be easily calculated from the [[cumulative distribution function]] {{math|''F''{{sub|''ν''}}(''t'')}} of the {{mvar|t}} distribution:
:<math> A( t \mid \nu) = F_\nu(t) - F_\nu(-t) = 1 - I_{ \frac{\nu}{\nu +t^2} }\!\left(\frac{\nu}{2},\frac{1}{2}\right),</math>
where {{nobr| {{math| ''I{{sub|x}}''(''a'', ''b'') }} }} is the regularized [[Beta function#Incomplete beta function|incomplete beta function]]. For statistical hypothesis testing this function is used to construct the [[p-value|''p''-value]].
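As a numerical check of the identity above (a minimal sketch assuming SciPy; the values of <code>t</code> and <code>nu</code> are arbitrary), {{nobr|{{math|''A''(''t'' {{!}} ''ν'')}} }} can be computed both from the cumulative distribution function and from the regularized incomplete beta function:

<syntaxhighlight lang="python">
import numpy as np
from scipy import stats, special

def A(t, nu):
    """P(-t < T < t) for T ~ Student's t with nu degrees of freedom."""
    return stats.t(df=nu).cdf(t) - stats.t(df=nu).cdf(-t)

t, nu = 2.0, 10.0
# Equivalent form: 1 - I_x(nu/2, 1/2) with x = nu / (nu + t^2),
# where special.betainc is the regularized incomplete beta function
a_beta = 1.0 - special.betainc(nu / 2, 0.5, nu / (nu + t**2))

print(np.isclose(A(t, nu), a_beta))   # True: the two forms agree
print(1.0 - A(t, nu))                 # two-sided p-value for the observed t
</syntaxhighlight>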