Regression analysis
==History==
The earliest form of regression appears in [[Isaac Newton]]'s work of 1700 on [[Equinox|equinoxes]]; he has been credited with introducing "an embryonic linear regression analysis" as "Not only did he perform the averaging of a set of data, 50 years before [[Tobias Mayer]], but summing the residuals to zero he ''forced'' the regression line to pass through the average point. He also distinguished between two inhomogeneous sets of data and might have thought of an ''optimal'' solution in terms of bias, though not in terms of effectiveness." He had previously used an averaging method in his 1671 work on Newton's rings, which was unprecedented at the time.<ref>{{cite arXiv |eprint=0810.4948 |class=physics.hist-ph |first1=Ari |last1=Belenkiy |first2=Eduardo Vila |last2=Echague |title=Groping Toward Linear Regression Analysis: Newton's Analysis of Hipparchus' Equinox Observations |date=2008}}</ref><ref>{{Cite book |last1=Buchwald |first1=Jed Z. |title=Newton and the Origin of Civilization |last2=Feingold |first2=Mordechai |date=2013 |publisher=[[Princeton University Press]] |isbn=978-0-691-15478-7 |location= |pages=90–93, 101–103}}</ref>

The [[method of least squares]] was published by [[Adrien-Marie Legendre|Legendre]] in 1805,<ref name="Legendre">[[Adrien-Marie Legendre|A.M. Legendre]]. [https://books.google.com/books?id=FRcOAAAAQAAJ ''Nouvelles méthodes pour la détermination des orbites des comètes''], Firmin Didot, Paris, 1805. "Sur la Méthode des moindres quarrés" appears as an appendix.</ref> and by [[Carl Friedrich Gauss|Gauss]] in 1809.<ref name="Gauss">Chapter 1 of: Angrist, J. D., & Pischke, J. S. (2008). ''Mostly Harmless Econometrics: An Empiricist's Companion''. 
Princeton University Press.</ref> Legendre and Gauss both applied the method to the problem of determining, from astronomical observations, the orbits of bodies about the Sun (mostly comets, but also later the then newly discovered minor planets<!-- Legendre's first example is applied to [[C/1769 P1]] (Messier) -->). Gauss published a further development of the theory of least squares in 1821,<ref name="Gauss2">{{cite book|author-first1=C.F. |author-last1=Gauss|author-link=Carl Friedrich Gauss|url=https://books.google.com/books?id=ZQ8OAAAAQAAJ&q=Theoria+combinationis+observationum+erroribus+minimis+obnoxiae|title=Theoria combinationis observationum erroribus minimis obnoxiae|year=1821–1823|via=Google Books}}</ref> including a version of the [[Gauss–Markov theorem]].

The term "regression" was coined by [[Francis Galton]] in the 19th century to describe a biological phenomenon. The phenomenon was that the heights of descendants of tall ancestors tend to regress down towards a normal average (a phenomenon also known as [[regression toward the mean]]).<ref> {{cite book | last = Mogull | first = Robert G. | title = Second-Semester Applied Statistics | publisher = Kendall/Hunt Publishing Company | year = 2004 | page = 59 | isbn = 978-0-7575-1181-3 }}</ref><ref>{{cite journal | last=Galton | first=Francis | journal=Statistical Science | year=1989 | title=Kinship and Correlation (reprinted 1989) | volume=4 | jstor=2245330 | pages=80–86 | issue=2 | doi=10.1214/ss/1177012581| doi-access=free }}</ref> For Galton, regression had only this biological meaning,<ref>[[Francis Galton]]. "Typical laws of heredity", Nature 15 (1877), 492–495, 512–514, 532–533. ''(Galton uses the term "reversion" in this paper, which discusses the size of peas.)''</ref><ref>Francis Galton. Presidential address, Section H, Anthropology. 
(1885) ''(Galton uses the term "regression" in this paper, which discusses the height of humans.)''</ref> but his work was later extended by [[Udny Yule]] and [[Karl Pearson]] to a more general statistical context.<ref>{{cite journal | doi=10.2307/2979746 | last=Yule | first=G. Udny | author-link=G. Udny Yule | title=On the Theory of Correlation | journal=Journal of the Royal Statistical Society | year= 1897 | pages=812–54 | jstor=2979746 | volume=60 | issue=4 | url=https://zenodo.org/record/1449703 }}</ref><ref>{{cite journal | doi=10.1093/biomet/2.2.211 | author-link=Karl Pearson | last=Pearson | first=Karl |author2=Yule, G.U. |author3=Blanchard, Norman |author4= Lee, Alice | title=The Law of Ancestral Heredity | journal=[[Biometrika]] | year=1903 | jstor=2331683 | pages=211–236 | volume=2 | issue=2 | url=https://zenodo.org/record/1431601 }}</ref> In the work of Yule and Pearson, the [[joint distribution]] of the response and explanatory variables is assumed to be [[Normal distribution|Gaussian]]. This assumption was weakened by [[Ronald A. Fisher|R.A. Fisher]] in his works of 1922 and 1925.<ref>{{cite journal | last=Fisher | first=R.A. | title=The goodness of fit of regression formulae, and the distribution of regression coefficients | journal=Journal of the Royal Statistical Society | volume=85 | pages=597–612 | year=1922 | doi=10.2307/2341124 | pmc=1084801 | jstor=2341124 | issue=4 }}</ref><ref name="FisherR1954Statistical">{{Cite book | author = Ronald A. Fisher | title = Statistical Methods for Research Workers | publisher = Oliver and Boyd | location = [[Edinburgh]] | year = 1970 | edition = Twelfth | url = https://archive.org/details/dli.scoerat.2986statisticalmethodsforresearchworkers/page/n7/mode/2up | isbn = 978-0-05-002170-5 | author-link = Ronald A. 
Fisher | url-access = registration }}</ref><ref>{{cite journal | last=Aldrich | first=John | journal=Statistical Science | year=2005 | title=Fisher and Regression | volume=20 | issue=4 | pages=401–417 | jstor=20061201 | doi=10.1214/088342305000000331| doi-access=free | url=https://eprints.soton.ac.uk/34871/1/088342305000000331.pdf }}</ref> Fisher assumed that the [[conditional distribution]] of the response variable is Gaussian, but the joint distribution need not be. In this respect, Fisher's assumption is closer to Gauss's formulation of 1821.

In the 1950s and 1960s, economists used [[Calculator#Precursors to the electronic calculator|electromechanical desk calculators]] to calculate regressions. Before 1970, it sometimes took up to 24 hours to receive the result from one regression.<ref>Rodney Ramcharan. [http://www.imf.org/external/pubs/ft/fandd/2006/03/basics.htm Regressions: Why Are Economists Obsessed with Them?] March 2006. Accessed 2011-12-03.</ref>

Regression methods continue to be an area of active research. In recent decades, new methods have been developed for [[robust regression]], regression involving correlated responses such as [[time series]] and [[growth curve (statistics)|growth curve]]s, regression in which the predictor (independent variable) or response variables are curves, images, graphs, or other complex data objects, regression methods accommodating various types of missing data, [[nonparametric regression]], [[Bayesian statistics|Bayesian]] methods for regression, regression in which the predictor variables are measured with error, regression with more predictor variables than observations, and [[causal inference]] with regression. Modern regression analysis is typically done with statistical and [[spreadsheet]] software packages on computers as well as on handheld [[scientific calculator|scientific]] and [[graphing calculator]]s.