==Underlying assumptions==
{{more citations needed section|date=December 2020}}
By itself, a regression is simply a calculation using the data. In order to interpret the output of regression as a meaningful statistical quantity that measures real-world relationships, researchers often rely on a number of classical [[statistical assumption|assumptions]]. These assumptions often include:
*The sample is representative of the population at large.
*The independent variables are measured without error.
*Deviations from the model have an expected value of zero, conditional on covariates: <math>E(e_i | X_i) = 0</math>
*The variance of the residuals <math>e_i</math> is constant across observations ([[homoscedasticity]]).
*The residuals <math>e_i</math> are [[uncorrelated]] with one another. Mathematically, the [[Covariance matrix|variance–covariance matrix]] of the errors is [[Diagonal matrix|diagonal]].
A handful of conditions are sufficient for the least-squares estimator to possess desirable properties: in particular, the [[Gauss–Markov theorem|Gauss–Markov]] assumptions imply that the parameter estimates will be [[bias of an estimator|unbiased]], [[consistent estimator|consistent]], and [[efficient (statistics)|efficient]] in the class of linear unbiased estimators. Because these classical assumptions are unlikely to hold exactly, practitioners have developed a variety of methods to maintain some or all of these desirable properties in real-world settings. For example, modeling [[errors-in-variables model|errors-in-variables]] can lead to reasonable estimates when the independent variables are measured with error. [[Heteroscedasticity-consistent standard errors]] allow the variance of <math>e_i</math> to change across values of <math>X_i</math>.
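The idea behind heteroscedasticity-consistent standard errors can be sketched numerically. The following is an illustrative NumPy example (the data-generating parameters and sample size are arbitrary choices, not from the article): it fits ordinary least squares on simulated data whose error variance grows with <math>X_i</math>, then computes both the classical variance estimate and White's HC0 "sandwich" estimate <math>(X'X)^{-1} X' \operatorname{diag}(e_i^2) X (X'X)^{-1}</math>.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])  # intercept + one regressor

# Heteroscedastic errors: the residual spread grows with |x|,
# violating the constant-variance (homoscedasticity) assumption.
e = rng.normal(size=n) * (1.0 + np.abs(x))
y = 2.0 + 3.0 * x + e  # true coefficients chosen for illustration

# OLS point estimates: beta = (X'X)^{-1} X'y
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
resid = y - X @ beta

# Classical covariance estimate, valid only under homoscedasticity
sigma2 = resid @ resid / (n - X.shape[1])
se_classical = np.sqrt(np.diag(sigma2 * XtX_inv))

# White (HC0) sandwich estimator: replaces sigma2 * I with
# diag(e_i^2), so it stays valid when the variance changes with X
meat = X.T @ (X * resid[:, None] ** 2)
cov_hc0 = XtX_inv @ meat @ XtX_inv
se_robust = np.sqrt(np.diag(cov_hc0))
```

Note that both estimators use the same OLS point estimates; only the standard errors differ. Under heteroscedasticity the robust standard errors are the ones with correct coverage.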
Correlated errors that exist within subsets of the data or follow specific patterns can be handled using ''clustered standard errors, geographically weighted regression'', or [[Newey–West estimator|Newey–West]] standard errors, among other techniques. When rows of data correspond to locations in space, the choice of how to model <math>e_i</math> within geographic units can have important consequences.<ref>{{cite book|title=Geographically weighted regression: the analysis of spatially varying relationships|last1=Fotheringham|first1=A. Stewart|last2=Brunsdon|first2=Chris|last3=Charlton|first3=Martin|publisher=John Wiley|year=2002|isbn=978-0-471-49616-8|edition=Reprint|location=Chichester, England}}</ref><ref>{{cite journal|last=Fotheringham|first=AS|author2=Wong, DWS|date=1 January 1991|title=The modifiable areal unit problem in multivariate statistical analysis|journal=Environment and Planning A|volume=23|issue=7|pages=1025–1044|doi=10.1068/a231025|bibcode=1991EnPlA..23.1025F |s2cid=153979055}}</ref> The subfield of [[econometrics]] is largely focused on developing techniques that allow researchers to draw reasonable conclusions in real-world settings, where classical assumptions do not hold exactly.
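Cluster-robust standard errors can be sketched in the same sandwich form. In this illustrative NumPy example (cluster count, sizes, and coefficients are arbitrary assumptions for the demo), errors share a common component within each cluster, and the "meat" of the sandwich sums the outer products of within-cluster score sums <math>X_g' e_g</math> rather than treating observations as independent:

```python
import numpy as np

rng = np.random.default_rng(1)
G, per = 40, 25            # 40 clusters of 25 observations each
n = G * per
cluster = np.repeat(np.arange(G), per)

x = rng.normal(size=n)
u = rng.normal(size=G)      # cluster-level error component
e = u[cluster] + rng.normal(size=n)  # errors correlated within clusters
y = 1.0 + 2.0 * x + e
X = np.column_stack([np.ones(n), x])

# Ordinary least squares
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
resid = y - X @ beta

# Cluster-robust "meat": sum over clusters of the outer product
# of the within-cluster score sums X_g' e_g
k = X.shape[1]
meat = np.zeros((k, k))
for g in range(G):
    idx = cluster == g
    s = X[idx].T @ resid[idx]
    meat += np.outer(s, s)

cov_cluster = XtX_inv @ meat @ XtX_inv
se_cluster = np.sqrt(np.diag(cov_cluster))
```

Because the summation is over clusters rather than observations, arbitrary correlation of the errors ''within'' each cluster is allowed; only independence ''across'' clusters is assumed.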
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)