==Variants==
{{See also|Correlation and dependence#Other measures of dependence among random variables}}
Variations of the correlation coefficient can be calculated for different purposes. Here are some examples.

===Adjusted correlation coefficient===
The sample correlation coefficient {{mvar|r}} is not an unbiased estimate of {{mvar|ρ}}. For data that follows a [[bivariate normal distribution]], the expectation {{math|E[''r'']}} for the sample correlation coefficient {{mvar|r}} of a normal bivariate is<ref>{{Cite journal | first = H. | last = Hotelling | year = 1953 | title = New Light on the Correlation Coefficient and its Transforms | journal = Journal of the Royal Statistical Society. Series B (Methodological) | volume = 15 | issue = 2 | pages = 193–232 | jstor = 2983768| doi = 10.1111/j.2517-6161.1953.tb00135.x }}</ref>
:<math>\operatorname\mathbb{E}\left[r\right] = \rho - \frac{\rho \left(1 - \rho^2\right)}{2n} + \cdots, \quad</math>
therefore {{mvar|r}} is a biased estimator of <math>\rho.</math> The unique minimum variance unbiased estimator {{math|''r''<sub>adj</sub>}} is given by<ref>{{Cite journal | first=Ingram | last=Olkin |author2=Pratt, John W. |date=March 1958 | title=Unbiased Estimation of Certain Correlation Coefficients | journal=The Annals of Mathematical Statistics | volume=29| issue=1 | pages=201–211 | jstor=2237306 | doi=10.1214/aoms/1177706717| doi-access=free }}.</ref>
{{NumBlk|:|<math> r_\text{adj} = r \, \mathbf{_2F_1}\left(\frac{1}{2}, \frac{1}{2}; \frac{n - 1}{2}; 1 - r^2\right),</math>|{{EquationRef|1}}}}
where:
*<math>r, n</math> are defined as above,
*<math>\mathbf{_2 F_1}(a, b; c; z)</math> is the [[hypergeometric function|Gaussian hypergeometric function]].

An approximately unbiased estimator {{math|''r''<sub>adj</sub>}} can be obtained{{citation needed|date=April 2012}} by truncating {{math|E[''r'']}} and solving this truncated equation:
{{NumBlk|:|<math> r = \operatorname\mathbb{E}[r] \approx r_\text{adj} - \frac{r_\text{adj} \left(1 - r_\text{adj}^2\right)}{2n}.</math>|{{EquationRef|2}}}}
An approximate solution{{citation needed|date=April 2012}} to equation ({{EquationNote|2}}) is
{{NumBlk|:|<math> r_\text{adj} \approx r \left[1 + \frac{1 - r^2}{2n}\right],</math>|{{EquationRef|3}}}}
where in ({{EquationNote|3}})
*<math>r, n</math> are defined as above,
*{{math|''r''<sub>adj</sub>}} is a suboptimal estimator,{{citation needed|date=April 2012}}{{clarify|date=February 2015| reason=suboptimal in what sense?}}
*{{math|''r''<sub>adj</sub>}} can also be obtained by maximizing log(''f''(''r'')),
*{{math|''r''<sub>adj</sub>}} has minimum variance for large values of {{mvar|n}},
*{{math|''r''<sub>adj</sub>}} has a bias of order {{math|{{frac|1|(''n'' − 1)}}}}.

Another proposed<ref name="RealCorBasic"/> adjusted correlation coefficient is{{citation needed|date=February 2015|reason=is this in a published article?}}
:<math>r_\text{adj}=\sqrt{1-\frac{(1-r^2)(n-1)}{(n-2)}}.</math>
{{math|''r''<sub>adj</sub> ≈ ''r''}} for large values of {{mvar|n}}.
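The Olkin–Pratt estimator in equation ({{EquationNote|1}}) and the approximation in equation ({{EquationNote|3}}) can be evaluated numerically. The following is a minimal Python sketch using SciPy's <code>hyp2f1</code> for the Gaussian hypergeometric function; the function names and sample values are illustrative only, not taken from the cited sources.

<syntaxhighlight lang="python">
from scipy.special import hyp2f1  # Gaussian hypergeometric function 2F1(a, b; c; z)

def r_adj_exact(r, n):
    """Olkin–Pratt estimator, equation (1): r * 2F1(1/2, 1/2; (n-1)/2; 1 - r^2)."""
    return r * hyp2f1(0.5, 0.5, (n - 1) / 2.0, 1.0 - r**2)

def r_adj_approx(r, n):
    """First-order approximation, equation (3): r * [1 + (1 - r^2) / (2n)]."""
    return r * (1.0 + (1.0 - r**2) / (2.0 * n))

# Illustrative values: the two estimates agree closely for moderate sample sizes.
print(r_adj_exact(0.6, 30), r_adj_approx(0.6, 30))
</syntaxhighlight>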
===Weighted correlation coefficient===
Suppose observations to be correlated have differing degrees of importance that can be expressed with a weight vector ''w''. To calculate the correlation between vectors ''x'' and ''y'' with the weight vector ''w'' (all of length ''n''),<ref>{{cite web|url=http://sci.tech-archive.net/Archive/sci.stat.math/2006-02/msg00171.html|title=Re: Compute a weighted correlation|website=sci.tech-archive.net}}</ref><ref>{{cite web|url=http://www.mathworks.com/matlabcentral/fileexchange/20846-weighted-correlation-matrix|title=Weighted Correlation Matrix – File Exchange – MATLAB Central}}</ref>
* Weighted mean: <math display="block">\operatorname{m}(x; w) = \frac{\sum_i w_i x_i}{\sum_i w_i}.</math>
* Weighted covariance: <math display="block">\operatorname{cov}(x,y;w) = \frac{\sum_i w_i \cdot (x_i - \operatorname{m}(x; w)) (y_i - \operatorname{m}(y; w))}{\sum_i w_i }.</math>
* Weighted correlation: <math display="block">\operatorname{corr}(x,y;w) = \frac{\operatorname{cov}(x,y;w)}{\sqrt{\operatorname{cov}(x,x;w) \operatorname{cov}(y,y;w)}}.</math>

===Reflective correlation coefficient===
The reflective correlation is a variant of Pearson's correlation in which the data are not centered around their mean values.{{citation needed|date=January 2011}} The population reflective correlation is
:<math>\operatorname{corr}_r(X,Y) = \frac{\operatorname\mathbb{E}[\,X\,Y\,]}{\sqrt{\operatorname\mathbb{E}[\,X^2\,]\cdot \operatorname\mathbb{E}[\,Y^2\,]}}.</math>
The reflective correlation is symmetric, but it is not invariant under translation:
:<math>\operatorname{corr}_r(X, Y) = \operatorname{corr}_r(Y, X) = \operatorname{corr}_r(X, bY) \neq \operatorname{corr}_r(X, a + b Y), \quad a \neq 0, b > 0.</math>
The sample reflective correlation is equivalent to [[cosine similarity]]:
:<math>rr_{xy} = \frac{\sum x_i y_i}{\sqrt{(\sum x_i^2)(\sum y_i^2)}}.</math>
The weighted version of the sample reflective correlation is
:<math>rr_{xy, w} = \frac{\sum w_i x_i y_i}{\sqrt{(\sum w_i x_i^2)(\sum w_i y_i^2)}}.</math>

===Scaled correlation coefficient===
{{Main|Scaled correlation}}
Scaled correlation is a variant of Pearson's correlation in which the range of the data is restricted intentionally and in a controlled manner to reveal correlations between fast components in [[time series]].<ref name="Nikolicetal">{{cite journal | last1 = Nikolić | first1 = D | last2 = Muresan | first2 = RC | last3 = Feng | first3 = W | last4 = Singer | first4 = W | year = 2012 | title = Scaled correlation analysis: a better way to compute a cross-correlogram | url = http://www.danko-nikolic.com/wp-content/uploads/2012/03/Scaled-correlation-analysis.pdf | journal = European Journal of Neuroscience | volume = 35| issue = 5| pages = 1–21 | doi = 10.1111/j.1460-9568.2011.07987.x | pmid = 22324876 | s2cid = 4694570 }}</ref> Scaled correlation is defined as the average correlation across short segments of data. Let <math>K</math> be the number of segments that can fit into the total length of the signal <math>T</math> for a given scale <math>s</math>:
:<math>K = \operatorname{round}\left(\frac{T}{s}\right).</math>
The scaled correlation across the entire signals <math>\bar{r}_s</math> is then computed as
:<math>\bar{r}_s = \frac{1}{K} \sum\limits_{k=1}^K r_k,</math>
where <math>r_k</math> is Pearson's coefficient of correlation for segment <math>k</math>. By choosing the parameter <math>s</math>, the range of values is reduced and the correlations on long time scales are filtered out, so that only the correlations on short time scales are revealed. Thus, the contributions of slow components are removed and those of fast components are retained.
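A minimal Python sketch of this segment-averaging procedure is given below; it assumes two equally long one-dimensional signals and uses NumPy's <code>corrcoef</code> for the per-segment Pearson correlation (the function name and the handling of a short trailing segment are illustrative choices, not prescribed by the cited paper).

<syntaxhighlight lang="python">
import numpy as np

def scaled_correlation(x, y, s):
    """Average Pearson correlation over consecutive segments of length s (the scale)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    K = int(round(len(x) / s))            # number of segments, K = round(T / s)
    r_segments = []
    for k in range(K):
        xs = x[k * s:(k + 1) * s]
        ys = y[k * s:(k + 1) * s]
        if len(xs) < 2:
            continue                      # skip a trailing fragment that is too short
        r_segments.append(np.corrcoef(xs, ys)[0, 1])
    return np.mean(r_segments)            # scaled correlation: the mean of the segment r_k
</syntaxhighlight>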
===Pearson's distance===
A distance metric for two variables ''X'' and ''Y'' known as ''Pearson's distance'' can be defined from their correlation coefficient as<ref>Fulekar (Ed.), M.H. (2009) ''Bioinformatics: Applications in Life and Environmental Sciences'', Springer (pp. 110) {{isbn|1-4020-8879-5}}</ref>
:<math>d_{X,Y}=1-\rho_{X,Y}.</math>
Considering that the Pearson correlation coefficient falls between [−1, +1], the Pearson distance lies in [0, 2]. The Pearson distance has been used in [[cluster analysis]] and data detection for communications and storage with unknown gain and offset.<ref>{{cite journal |author1=Immink, K. Schouhamer |author2=Weber, J. |title=Minimum Pearson distance detection for multilevel channels with gain and / or offset mismatch |date=October 2014 |journal=IEEE Transactions on Information Theory |volume=60 |issue=10 |pages=5966–5974 |doi=10.1109/tit.2014.2342744 |citeseerx=10.1.1.642.9971 |s2cid=1027502 |url=https://www.researchgate.net/publication/265604603 |access-date=11 February 2018}}</ref>

The Pearson "distance" defined this way assigns a distance greater than 1 to negative correlations. In reality, both strong positive and strong negative correlations are meaningful, so care must be taken when the Pearson "distance" is used in a nearest-neighbor algorithm, as such an algorithm will only include neighbors with positive correlation and exclude neighbors with negative correlation. Alternatively, an absolute-valued distance, <math>d_{X,Y}=1-|\rho_{X,Y}|</math>, can be applied, which takes both positive and negative correlations into consideration; the information on positive and negative association can then be extracted separately.

===Circular correlation coefficient{{anchor|Circular}}===
{{further|Circular statistics}}
For variables ''X'' = {''x''<sub>1</sub>,...,''x''<sub>''n''</sub>} and ''Y'' = {''y''<sub>1</sub>,...,''y''<sub>''n''</sub>} that are defined on the unit circle {{Not a typo|{{closed-open|0, 2π}}}}, it is possible to define a circular analog of Pearson's coefficient.<ref name="SRJ">{{cite book |title=Topics in circular statistics |last1=Jammalamadaka |first1=S. Rao |last2=SenGupta |first2=A. |year=2001 |publisher=World Scientific |location=New Jersey |isbn=978-981-02-3778-3 |page=176 |url=https://books.google.com/books?id=sKqWMGqQXQkC&q=Jammalamadaka+Topics+in+circular |access-date=21 September 2016}}</ref> This is done by transforming data points in ''X'' and ''Y'' with a [[sine]] function such that the correlation coefficient is given as:
:<math>r_\text{circular} = \frac{\sum ^n _{i=1}\sin(x_i - \bar{x}) \sin(y_i - \bar{y})}{\sqrt{\sum^n_{i=1} \sin(x_i - \bar{x})^2} \sqrt{\sum ^n_{i=1} \sin(y_i - \bar{y})^2}}</math>
where <math>\bar{x}</math> and <math>\bar{y}</math> are the [[Mean of circular quantities|circular means]] of ''X'' and ''Y''. This measure can be useful in fields like meteorology where the angular direction of data is important.

===Partial correlation===
{{Main|Partial correlation}}
If a population or data-set is characterized by more than two variables, a [[partial correlation]] coefficient measures the strength of dependence between a pair of variables that is not accounted for by the way in which they both change in response to variations in a selected subset of the other variables.
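For the common case of a single controlling variable, one standard way to obtain the partial correlation is to correlate the residuals of the two variables after regressing each on the controlling variable. The following Python sketch illustrates this residual-based approach (the function name is illustrative):

<syntaxhighlight lang="python">
import numpy as np

def partial_correlation(x, y, z):
    """Pearson correlation of x and y after removing the linear effect of z."""
    x, y, z = (np.asarray(a, dtype=float) for a in (x, y, z))
    Z = np.column_stack([np.ones_like(z), z])        # design matrix with intercept
    # Residuals of x and y after least-squares regression on z
    res_x = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    res_y = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return np.corrcoef(res_x, res_y)[0, 1]
</syntaxhighlight>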
===Pearson correlation coefficient in quantum systems===
For two observables, <math>X</math> and <math>Y</math>, in a bipartite quantum system, the Pearson correlation coefficient is defined as<ref>{{cite journal |first=M. D. |last=Reid |date=1 July 1989 |title=Demonstration of the Einstein-Podolsky-Rosen paradox using nondegenerate parametric amplification |journal=Physical Review A |volume=40 |issue=2 |pages=913–923 |doi=10.1103/PhysRevA.40.913 |url=https://journals.aps.org/pra/abstract/10.1103/PhysRevA.40.913}}</ref><ref>{{cite journal |author1=Maccone, L. |author2=Dagmar, B. |author3=Macchiavello, C. |date=1 April 2015 |title=Complementarity and Correlations |journal=Physical Review Letters |volume=114 |issue=13 |pages=130401 |doi=10.1103/PhysRevLett.114.130401 |url=https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.114.130401 |arxiv=1408.6851}}</ref>
:<math>\mathbb{Cor}(X,Y) = \frac{\mathbb{E}[X \otimes Y] - \mathbb{E}[X] \cdot \mathbb{E}[Y]}{\sqrt{\mathbb{V}[X] \cdot \mathbb{V}[Y]}} \,,</math>
where
*<math> \mathbb{E}[X] </math> is the expectation value of the observable <math> X </math>,
*<math> \mathbb{E}[Y] </math> is the expectation value of the observable <math> Y </math>,
*<math> \mathbb{E}[X \otimes Y] </math> is the expectation value of the observable <math> X \otimes Y </math>,
*<math> \mathbb{V}[X] </math> is the variance of the observable <math> X </math>, and
*<math> \mathbb{V}[Y] </math> is the variance of the observable <math> Y </math>.
<math>\mathbb{Cor}(X,Y)</math> is symmetric, i.e., <math>\mathbb{Cor}(X,Y)= \mathbb{Cor}(Y, X)</math>, and its absolute value is invariant under affine transformations.
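As a numerical illustration, the expectation values above can be computed for a two-qubit state <math>\rho</math> as <math>\mathbb{E}[X] = \operatorname{Tr}[\rho\,(X \otimes I)]</math>, <math>\mathbb{E}[Y] = \operatorname{Tr}[\rho\,(I \otimes Y)]</math> and <math>\mathbb{E}[X \otimes Y] = \operatorname{Tr}[\rho\,(X \otimes Y)]</math>. The Python sketch below does this for the singlet state with Pauli-<math>Z</math> observables; the choice of state and observables is an arbitrary example, not taken from the cited references.

<syntaxhighlight lang="python">
import numpy as np

I2 = np.eye(2)
Z = np.array([[1.0, 0.0], [0.0, -1.0]])               # Pauli-Z, an example observable

def expect(rho, A):
    """Expectation value Tr(rho A) of an observable A in the state rho."""
    return np.real(np.trace(rho @ A))

def quantum_pearson(rho, X, Y):
    """Pearson correlation of observables X (first qubit) and Y (second qubit) in state rho."""
    ex  = expect(rho, np.kron(X, I2))
    ey  = expect(rho, np.kron(I2, Y))
    exy = expect(rho, np.kron(X, Y))
    vx  = expect(rho, np.kron(X @ X, I2)) - ex**2      # variance of X
    vy  = expect(rho, np.kron(I2, Y @ Y)) - ey**2      # variance of Y
    return (exy - ex * ey) / np.sqrt(vx * vy)

# Singlet state (|01> - |10>)/sqrt(2): perfectly anticorrelated Z measurements.
psi = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2)
rho = np.outer(psi, psi)
print(quantum_pearson(rho, Z, Z))                      # prints -1.0 (up to rounding)
</syntaxhighlight>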