{{Short description|Set of statistical processes for estimating the relationships among variables}}
[[File:Normdist regression.png|thumb|right|200px|Regression line for 50 random points in a [[Gaussian distribution]] around the line y=1.5x+2.]]
{{Regression bar}}
{{Machine learning|Problems}}
In [[statistical model]]ing, '''regression analysis''' is a set of statistical processes for [[Estimation theory|estimating]] the relationships between a [[dependent variable]] (often called the ''outcome'' or ''response'' variable, or a ''label'' in machine learning parlance) and one or more error-free [[independent variable]]s (often called ''regressors'', ''predictors'', ''covariates'', ''explanatory variables'' or ''features''). The most common form of regression analysis is [[linear regression]], in which one finds the line (or a more complex [[linear combination]]) that most closely fits the data according to a specific mathematical criterion. For example, the method of [[ordinary least squares]] computes the unique line (or [[hyperplane]]) that minimizes the sum of squared differences between the true data and that line (or hyperplane). For specific mathematical reasons (see [[linear regression]]), this allows the researcher to estimate the [[conditional expectation]] (or population [[average value]]) of the dependent variable when the independent variables take on a given set of values. Less common forms of regression use slightly different procedures to estimate alternative [[location parameters]] (e.g., [[quantile regression]] or [[Necessary Condition Analysis]]<ref>[http://www.erim.eur.nl/centres/necessary-condition-analysis/ Necessary Condition Analysis]</ref>) or estimate the conditional expectation across a broader collection of non-linear models (e.g., [[nonparametric regression]]).

Regression analysis is primarily used for two conceptually distinct purposes. First, regression analysis is widely used for [[prediction]] and [[forecasting]], where its use has substantial overlap with the field of [[machine learning]]. Second, in some situations regression analysis can be used to infer [[causality|causal relationships]] between the independent and dependent variables. Importantly, regressions by themselves only reveal relationships between a dependent variable and a collection of independent variables in a fixed dataset. To use regressions for prediction or to infer causal relationships, respectively, a researcher must carefully justify why existing relationships have predictive power for a new context or why a relationship between two variables has a causal interpretation. The latter is especially important when researchers hope to estimate causal relationships using [[observational study|observational data]].<ref name="Freedman2009">{{cite book|author=David A. Freedman|title=Statistical Models: Theory and Practice|url=https://books.google.com/books?id=fW_9BV5Wpf8C&q=%22regression+analysis%22|date=27 April 2009|publisher=Cambridge University Press|isbn=978-1-139-47731-4}}</ref><ref>R. Dennis Cook; Sanford Weisberg, [https://www.jstor.org/stable/270724 "Criticism and Influence Analysis in Regression"], ''Sociological Methodology'', Vol. 13 (1982), pp. 313–361.</ref>
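As a minimal illustrative sketch of the ordinary-least-squares idea described above (the simulated data, variable names, and use of [[NumPy]] are assumptions chosen to mirror the figure caption, not a prescribed method), the line minimizing the sum of squared differences can be computed directly:

<syntaxhighlight lang="python">
import numpy as np

# Simulate 50 points scattered around the line y = 1.5x + 2,
# echoing the example in the figure caption (assumed noise level).
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 1.5 * x + 2 + rng.normal(scale=1.0, size=50)

# Ordinary least squares: choose intercept and slope minimizing
# the sum of squared differences between y and the fitted line.
X = np.column_stack([np.ones_like(x), x])      # design matrix with an intercept column
beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # solves min ||X @ beta - y||^2

print("estimated intercept and slope:", beta)  # should be close to (2, 1.5)
</syntaxhighlight>

Here <code>np.linalg.lstsq</code> solves the minimization numerically; under the usual full-rank assumption, the same estimates follow from the closed-form normal equations <math>\hat\beta = (X^\mathsf{T}X)^{-1}X^\mathsf{T}y</math>.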