==Decorrelation of ''n'' random variables==
{{main|Decorrelation}}

It is always possible to remove the correlations between all pairs of an arbitrary number of random variables by using a data transformation, even if the relationship between the variables is nonlinear. A presentation of this result for population distributions is given by Cox & Hinkley.<ref>{{cite book |author1=Cox, D.R. |author2=Hinkley, D.V. |year=1974 |title=Theoretical Statistics |publisher=Chapman & Hall |at=Appendix 3 |isbn=0-412-12420-3}}</ref>

A corresponding result exists for reducing the sample correlations to zero. Suppose a vector of ''n'' random variables is observed ''m'' times. Let ''X'' be a matrix where <math>X_{i,j}</math> is the ''j''th variable of observation ''i'', and let <math>Z_{m,m}</math> be an ''m'' by ''m'' square matrix with every element 1. Then ''D'' is the data transformed so that every random variable has zero mean, and ''T'' is the data transformed so that all variables have zero mean and zero correlation with all other variables – the sample [[correlation matrix]] of ''T'' will be the identity matrix. Each column of ''T'' must be further divided by its standard deviation to give unit variance. The transformed variables will be uncorrelated, even though they may not be [[Statistical independence|independent]].

:<math>D = X - \frac{1}{m} Z_{m,m} X</math>
:<math>T = D (D^{\mathsf{T}} D)^{-\frac{1}{2}},</math>

where an exponent of −{{frac|1|2}} represents the [[matrix square root]] of the [[matrix inverse|inverse]] of a matrix.

If a new data observation ''x'' is a row vector of ''n'' elements, then the same transform can be applied to ''x'' to get the transformed vectors ''d'' and ''t'':

:<math>d = x - \frac{1}{m} Z_{1,m} X,</math>
:<math>t = d (D^{\mathsf{T}} D)^{-\frac{1}{2}}.</math>

This decorrelation is related to [[principal components analysis]] for multivariate data.
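The transformation can be illustrated numerically. The following Python sketch (an illustration only, assuming NumPy is available; the data and variable names are hypothetical) builds the matrices ''D'' and ''T'' defined above and checks that the sample correlation matrix of ''T'' is the identity:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: m = 500 observations of n = 3 correlated variables.
m, n = 500, 3
X = rng.normal(size=(m, n)) @ rng.normal(size=(n, n))

# D = X - (1/m) Z X subtracts each column's mean (Z is the all-ones matrix,
# so (1/m) Z X repeats the column means in every row).
D = X - np.ones((m, m)) @ X / m          # equivalently: X - X.mean(axis=0)

# (D^T D)^(-1/2): the matrix square root of the inverse, computed from the
# eigendecomposition of the symmetric positive-definite matrix D^T D.
evals, evecs = np.linalg.eigh(D.T @ D)
inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T

# T = D (D^T D)^(-1/2); its sample correlation matrix is the identity.
T = D @ inv_sqrt
print(np.round(np.corrcoef(T, rowvar=False), 6))

# The same transform applied to a new observation x (a row vector):
x = rng.normal(size=(1, n))
t = (x - X.mean(axis=0)) @ inv_sqrt
</syntaxhighlight>

As noted above, the columns of ''T'' are uncorrelated but do not have unit variance; dividing each column by its sample standard deviation standardizes it.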