Errors and residuals
==In univariate distributions==
If we assume a normally distributed population with mean μ and [[standard deviation]] σ, and choose individuals independently, then we have
:<math>X_1, \dots, X_n \sim N\left(\mu, \sigma^2\right)\,</math>
and the [[arithmetic mean|sample mean]]
:<math>\overline{X}={X_1 + \cdots + X_n \over n}</math>
is a random variable distributed such that:
:<math>\overline{X} \sim N \left(\mu, \frac {\sigma^2} n \right).</math>

The ''statistical errors'' are then
:<math>e_i = X_i - \mu,\,</math>
with [[Expected Value|expected]] values of zero,<ref>{{Cite book|title=Intermediate statistical methods|last=Wetherill, G. Barrie.|date=1981|publisher=Chapman and Hall|isbn=0-412-16440-X|location=London|oclc=7779780|url-access=registration|url=https://archive.org/details/intermediatestat0000weth}}</ref> whereas the ''residuals'' are
:<math>r_i = X_i - \overline{X}.</math>

The sum of squares of the '''statistical errors''', divided by ''σ''<sup>2</sup>, has a [[chi-squared distribution]] with ''n'' [[Degrees of freedom (statistics)|degrees of freedom]]:
: <math>\frac 1 {\sigma^2}\sum_{i=1}^n e_i^2\sim\chi^2_n.</math>

However, this quantity is not observable as the population mean is unknown. The sum of squares of the '''residuals''', on the other hand, is observable. The quotient of that sum by ''σ''<sup>2</sup> has a chi-squared distribution with only ''n'' − 1 degrees of freedom:
:<math> \frac 1 {\sigma^2} \sum_{i=1}^n r_i^2 \sim \chi^2_{n-1}. </math>

This difference between ''n'' and ''n'' − 1 degrees of freedom results in [[Bessel's correction]], which is applied when the [[sample variance]] is used to estimate the variance of a population with unknown mean and unknown variance. No correction is necessary if the population mean is known.

===Remark===
It is remarkable that the [[Squared deviations|sum of squares of the residuals]] and the sample mean can be shown to be independent of each other, using, for example, [[Basu's theorem]].<!-- Basu's theorem is definitely overkill in this case. It can be proved by far simpler methods. --> That fact, together with the normal and chi-squared distributions given above, forms the basis of calculations involving the t-statistic:
:<math> T = \frac{\overline{X}_n - \mu_0}{S_n/\sqrt{n}}, </math>
where <math>\overline{X}_n - \mu_0</math> represents the error of the sample mean relative to the hypothesized mean <math>\mu_0</math>, <math>S_n</math> is the sample standard deviation for a sample of size ''n'' (used in place of the unknown ''σ''), and the denominator term <math>S_n/\sqrt n</math> estimates the standard deviation of the sample mean according to:<ref name="modernintro">{{Cite book|title=A modern introduction to probability and statistics : understanding why and how|date=2005-06-15|publisher=Springer London|author1=Frederik Michel Dekking|author2=Cornelis Kraaikamp|author3=Hendrik Paul Lopuhaä|author4=Ludolf Erwin Meester|isbn=978-1-85233-896-1|location=London|oclc=262680588}}</ref>
<math display="block">\operatorname{Var}\left(\overline{X}_n\right) = \frac{\sigma^2} n</math>

The [[probability distribution]]s of the numerator and the denominator separately depend on the value of the unobservable population standard deviation ''σ'', but ''σ'' appears in both the numerator and the denominator and cancels. That is fortunate because it means that even though we do not know ''σ'', we know the probability distribution of this quotient: it has a [[Student's t-distribution]] with ''n'' − 1 degrees of freedom. We can therefore use this quotient to find a [[confidence interval]] for ''μ''. This t-statistic can be interpreted as "the number of standard errors away from the regression line."<ref>{{Cite book|title=Practical statistics for data scientists : 50 essential concepts|author1=Peter Bruce|author2=Andrew Bruce|isbn=978-1-4919-5296-2|edition=First|publisher=O'Reilly Media Inc|location=Sebastopol, CA|oclc=987251007|date=2017-05-10}}</ref>
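
The distinction between statistical errors and residuals can be illustrated numerically. The following is a minimal sketch in Python using only the standard library; the function names <code>errors_and_residuals</code> and <code>t_statistic</code>, the seed, and the parameter values (mean 10, standard deviation 2, sample size 25) are illustrative assumptions and do not come from the cited sources. In a simulation the population mean is known, so the errors can be computed; with real data only the residuals are available.

```python
import math
import random
import statistics


def errors_and_residuals(sample, mu):
    """Return (errors, residuals) for a sample.

    errors:    e_i = X_i - mu    (requires the true population mean)
    residuals: r_i = X_i - Xbar  (computable from the sample alone)
    """
    xbar = statistics.fmean(sample)
    errors = [x - mu for x in sample]
    residuals = [x - xbar for x in sample]
    return errors, residuals


def t_statistic(sample, mu0):
    """Return T = (Xbar - mu0) / (S / sqrt(n)) for a hypothesized mean mu0."""
    n = len(sample)
    xbar = statistics.fmean(sample)
    s = statistics.stdev(sample)  # divides by n - 1 (Bessel's correction)
    return (xbar - mu0) / (s / math.sqrt(n))


# Simulated data: here mu and sigma are known only because we chose them.
rng = random.Random(0)
mu, sigma, n = 10.0, 2.0, 25
sample = [rng.gauss(mu, sigma) for _ in range(n)]

errors, residuals = errors_and_residuals(sample, mu)
xbar = statistics.fmean(sample)

sum_sq_errors = sum(e * e for e in errors)
sum_sq_residuals = sum(r * r for r in residuals)

print("sum of residuals:        ", sum(residuals))
print("sum of squared errors:   ", sum_sq_errors)
print("sum of squared residuals:", sum_sq_residuals)
print("sample variance (n - 1): ", sum_sq_residuals / (n - 1))
print("t-statistic at mu0 = mu: ", t_statistic(sample, mu))
```

The residuals sum to zero up to rounding, which is the linear constraint that removes one degree of freedom. The two sums of squares are related by the identity <math>\textstyle\sum e_i^2 = \sum r_i^2 + n(\overline{X} - \mu)^2</math>, so the sum of squared residuals never exceeds the sum of squared errors.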