Multicollinearity
== Numerical issues ==

Sometimes, the variables <math> X_j </math> are nearly collinear. In this case, the matrix <math>X^{\mathsf{T}}X</math> has an inverse, but it is [[ill-conditioned]]. A computer algorithm may or may not be able to compute an approximate inverse; even if it can, the resulting inverse may have large [[rounding error]]s.

The standard measure of [[Condition number|ill-conditioning]] in a matrix is the condition number. It indicates whether inverting the matrix is numerically unstable in finite-precision arithmetic, that is, how sensitive the computed inverse is to small changes in the original matrix. The condition number is computed by dividing the maximum [[singular value]] of the [[design matrix]] by its minimum singular value.<ref name="Belsley19912">{{cite book |last=Belsley |first=David |url=https://archive.org/details/conditioningdiag0000bels |title=Conditioning Diagnostics: Collinearity and Weak Data in Regression |publisher=Wiley |year=1991 |isbn=978-0-471-52889-0 |location=New York |url-access=registration}}</ref> In the context of collinear variables, the [[variance inflation factor]] is the condition number for a particular coefficient.

=== Solutions ===

Numerical problems in estimation can often be mitigated by applying standard techniques from [[linear algebra]] to compute the estimates more precisely:

# [[Standard score|'''Standardizing''']] '''predictor variables.''' Working with polynomial terms (e.g. <math>x_1</math>, <math>x_1^2</math>) or interaction terms (e.g. <math>x_1 \times x_2</math>) can cause multicollinearity. This is especially true when the variable in question has a limited range. Standardizing predictor variables will eliminate this special kind of multicollinearity for polynomials of up to third order.<ref>{{Cite web |title=12.6 - Reducing Structural Multicollinearity {{!}} STAT 501 |url=https://newonlinecourses.science.psu.edu/stat501/lesson/12/12.6 |access-date=2019-03-16 |website=newonlinecourses.science.psu.edu}}</ref>
#* For higher-order polynomials, an [[Orthogonal polynomials|orthogonal polynomial]] representation will generally fix any collinearity problems.<ref name=":4">{{Cite web |title=Computational Tricks with Turing (Non-Centered Parametrization and QR Decomposition) |url=https://storopoli.io/Bayesian-Julia/pages/12_Turing_tricks/#qr_decomposition |access-date=2023-09-03 |website=storopoli.io}}</ref> However, polynomial regressions are [[Runge's phenomenon|generally unstable]], making them unsuitable for [[nonparametric regression]] and inferior to newer methods based on [[smoothing spline]]s, [[LOESS]], or [[Gaussian process]] regression.<ref>{{Cite journal |last1=Gelman |first1=Andrew |last2=Imbens |first2=Guido |date=2019-07-03 |title=Why High-Order Polynomials Should Not Be Used in Regression Discontinuity Designs |url=https://www.tandfonline.com/doi/full/10.1080/07350015.2017.1366909 |journal=Journal of Business & Economic Statistics |language=en |volume=37 |issue=3 |pages=447–456 |doi=10.1080/07350015.2017.1366909 |issn=0735-0015|url-access=subscription }}</ref>
# '''Use an [[QR decomposition|orthogonal representation]] of the data'''.<ref name=":4" /> Poorly written statistical software will sometimes fail to converge to a correct representation when variables are strongly correlated. However, it is still possible to rewrite the regression to use only uncorrelated variables by performing a [[change of basis]].
#* For polynomial terms in particular, it is possible to rewrite the regression as a function of uncorrelated variables using [[orthogonal polynomials]].
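
As an illustration only, the following minimal sketch (Python with NumPy) computes the condition number of a quadratic design matrix as the ratio of its largest to smallest singular value, then shows how standardizing the predictor or switching to a QR-based orthogonal basis changes it. The data are synthetic, and the names <code>condition_number</code>, <code>X_raw</code>, <code>X_std</code> and the chosen predictor range are assumptions made for this example, not part of any cited source.

```python
import numpy as np


def condition_number(X):
    """Largest singular value of X divided by its smallest singular value."""
    s = np.linalg.svd(X, compute_uv=False)  # singular values, descending
    return s[0] / s[-1]


# Hypothetical predictor with a limited range far from zero.
x = np.linspace(10.0, 11.0, 50)

# Quadratic design matrix on the raw scale: intercept, x, x^2.
X_raw = np.column_stack([np.ones_like(x), x, x**2])

# Same model after standardizing x to zero mean and unit variance.
z = (x - x.mean()) / x.std()
X_std = np.column_stack([np.ones_like(z), z, z**2])

# Orthogonal representation: X_raw = Q R, where Q has orthonormal columns.
Q, R = np.linalg.qr(X_raw)

kappa_raw = condition_number(X_raw)
kappa_std = condition_number(X_std)
kappa_q = condition_number(Q)

# Synthetic response, used only to show the change of basis in a regression.
rng = np.random.default_rng(0)
y = 1.0 + 2.0 * x + 0.5 * x**2 + rng.normal(scale=0.1, size=x.size)

# Regress y on the orthonormal columns of Q, then map back to the
# original coefficients by solving R @ beta = gamma.
gamma = Q.T @ y
fitted_q = Q @ gamma
beta_qr = np.linalg.solve(R, gamma)

print(f"condition number, raw design:          {kappa_raw:.3g}")
print(f"condition number, standardized design: {kappa_std:.3g}")
print(f"condition number, Q factor:            {kappa_q:.3g}")
```

In this sketch the raw design is badly conditioned because <math>x</math> and <math>x^2</math> are almost linearly related over a narrow range, whereas the standardized design and the <math>Q</math> factor are well conditioned. The fitted values are the same in either basis; only the numerical difficulty of obtaining them changes.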