Pearson correlation coefficient
===Geometric interpretation===
[[File:Regression lines.png|thumb|upright=1.5|Regression lines for {{math|1=''y'' = ''g''<sub>''X''</sub>(''x'')}} [{{color|red|red}}] and {{math|1=''x'' = ''g''<sub>''Y''</sub>(''y'')}} [{{color|blue|blue}}]]]

For uncentered data, there is a relation between the correlation coefficient and the angle ''φ'' between the two regression lines, {{nowrap|1=''y'' = ''g''<sub>''X''</sub>(''x'')}} and {{nowrap|1=''x'' = ''g''<sub>''Y''</sub>(''y'')}}, obtained by regressing ''y'' on ''x'' and ''x'' on ''y'' respectively. (Here, ''φ'' is measured counterclockwise within the first quadrant formed around the lines' intersection point if {{math|''r'' > 0}}, or counterclockwise from the fourth to the second quadrant if {{nowrap|''r'' < 0}}.) One can show<ref>{{cite journal |last=Schmid |first=John Jr. |title=The relationship between the coefficient of correlation and the angle included between regression lines |journal=The Journal of Educational Research |date=December 1947 |volume=41 |issue=4 |pages=311–313 |jstor=27528906 |doi=10.1080/00220671.1947.10881608}}</ref> that if the standard deviations are equal, then {{nowrap|1=''r'' = sec ''φ'' − tan ''φ''}}, where sec and tan are [[trigonometric functions]].

For centered data (i.e., data which have been shifted by the sample means of their respective variables so as to have an average of zero for each variable), the correlation coefficient can also be viewed as the [[cosine]] of the [[angle]] ''θ'' between the two observed [[Vector (geometry)|vectors]] in ''N''-dimensional space (for ''N'' observations of each variable).<ref>{{cite web |last=Rummel |first=R.J. |title=Understanding Correlation |year=1976 |url=http://www.hawaii.edu/powerkills/UC.HTM |at=ch. 5 (as illustrated for a special case in the next paragraph)}}</ref>

Both the uncentered (non-Pearson-compliant) and centered correlation coefficients can be determined for a dataset.
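The relation {{nowrap|1=''r'' = sec ''φ'' − tan ''φ''}} for equal standard deviations can be checked numerically. The following Python sketch is illustrative only: the data set and the function names are hypothetical and do not come from the cited source, and it assumes {{math|''r'' > 0}}, so that the angle lies in the first quadrant.

```python
import math

def sums_of_squares(x, y):
    """Return (Sxx, Syy, Sxy), the centered sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxx, syy, sxy

# Hypothetical data, chosen so that both variables have the same standard deviation.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

sxx, syy, sxy = sums_of_squares(x, y)
assert math.isclose(sxx, syy)  # equal standard deviations
assert sxy > 0                 # positive correlation assumed in this sketch

r = sxy / math.sqrt(sxx * syy)

# Slopes of both regression lines, expressed as dy/dx in the (x, y) plane.
slope_y_on_x = sxy / sxx       # least-squares line y = a + b*x, slope b
slope_x_on_y = syy / sxy       # least-squares line x = c + d*y has dy/dx = 1/d

# Angle between the two lines (first quadrant, since r > 0).
phi = math.atan(slope_x_on_y) - math.atan(slope_y_on_x)

sec_minus_tan = 1 / math.cos(phi) - math.tan(phi)
print(r, sec_minus_tan)
```

For this data set the two slopes are 0.8 and 1.25, and both printed values agree with {{math|1=''r'' = 0.8}} up to floating-point rounding.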
As an example, suppose five countries are found to have gross national products of 1, 2, 3, 5, and 8 billion dollars, respectively. Suppose these same five countries (in the same order) are found to have 11%, 12%, 13%, 15%, and 18% poverty. Then let '''x''' and '''y''' be ordered 5-element vectors containing the above data: {{nowrap|1='''x''' = (1, 2, 3, 5, 8)}} and {{nowrap|1='''y''' = (0.11, 0.12, 0.13, 0.15, 0.18)}}.

By the usual procedure for finding the angle ''θ'' between two vectors (see [[dot product]]), the ''uncentered'' correlation coefficient is
:<math> \cos \theta = \frac { \mathbf{x} \cdot \mathbf{y} } { \left\| \mathbf{x} \right\| \left\| \mathbf{y} \right\|} = \frac {2.93} { \sqrt{103} \sqrt{0.0983} } = 0.920814711. </math>

This uncentered correlation coefficient is identical with the [[cosine similarity]]. The above data were deliberately chosen to be perfectly correlated: {{math|1=''y'' = 0.10 + 0.01 ''x''}}. The Pearson correlation coefficient must therefore be exactly one. Centering the data (shifting '''x''' by {{math|1=ℰ('''x''') = 3.8}} and '''y''' by {{math|1=ℰ('''y''') = 0.138}}) yields {{math|1='''x''' = (−2.8, −1.8, −0.8, 1.2, 4.2)}} and {{math|1='''y''' = (−0.028, −0.018, −0.008, 0.012, 0.042)}}, from which
:<math> \cos \theta = \frac{\mathbf{x} \cdot \mathbf{y}} {\left\| \mathbf{x} \right\| \left\| \mathbf{y} \right\|} = \frac {0.308}{\sqrt{30.8}\sqrt{0.00308}} = 1 = \rho_{xy}, </math>
as expected.
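The two cosines in this example can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the helper names <code>cosine</code> and <code>center</code> are illustrative choices, not part of any particular package.

```python
import math

def cosine(u, v):
    """Cosine of the angle between vectors u and v (dot product over norms)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def center(u):
    """Shift a vector by its mean so that its entries average to zero."""
    mean = sum(u) / len(u)
    return [a - mean for a in u]

# Data from the example: gross national product and poverty rate.
x = [1, 2, 3, 5, 8]
y = [0.11, 0.12, 0.13, 0.15, 0.18]

uncentered = cosine(x, y)                  # cosine similarity of the raw vectors
centered = cosine(center(x), center(y))    # Pearson correlation coefficient

print(uncentered)
print(centered)
```

The first value matches the uncentered coefficient of about 0.9208, and the second equals 1 up to floating-point rounding, as the exact linear relation between '''x''' and '''y''' requires.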