== Properties ==

=== Covariance with itself ===
The [[variance]] is a special case of the covariance in which the two variables are identical:<ref name=KunIlPark/>{{rp|p=121}}
<math display="block">\operatorname{cov}(X, X) = \operatorname{var}(X)\equiv\sigma^2(X)\equiv\sigma_X^2.</math>

=== Covariance of linear combinations ===
If <math>X</math>, <math>Y</math>, <math>W</math>, and <math>V</math> are real-valued random variables and <math>a,b,c,d</math> are real-valued constants, then the following facts are consequences of the definition of covariance:
<math display="block">
\begin{align}
\operatorname{cov}(X, a) &= 0 \\
\operatorname{cov}(X, X) &= \operatorname{var}(X) \\
\operatorname{cov}(X, Y) &= \operatorname{cov}(Y, X) \\
\operatorname{cov}(aX, bY) &= ab\, \operatorname{cov}(X, Y) \\
\operatorname{cov}(X+a, Y+b) &= \operatorname{cov}(X, Y) \\
\operatorname{cov}(aX+bY, cW+dV) &= ac\,\operatorname{cov}(X,W)+ad\,\operatorname{cov}(X,V)+bc\,\operatorname{cov}(Y,W)+bd\,\operatorname{cov}(Y,V)
\end{align}
</math>
For a sequence <math>X_1,\ldots,X_n</math> of real-valued random variables and constants <math>a_1,\ldots,a_n</math>, we have
<math display="block">\operatorname{var}\left(\sum_{i=1}^n a_iX_i \right) = \sum_{i=1}^n a_i^2\sigma^2(X_i) + 2\sum_{i,j\,:\,i<j} a_ia_j\operatorname{cov}(X_i,X_j) = \sum_{i,j} {a_ia_j\operatorname{cov}(X_i,X_j)} </math>

=== Hoeffding's covariance identity ===
A useful identity for computing the covariance between two random variables <math>X, Y</math> is Hoeffding's covariance identity:<ref>{{cite book| last1=Papoulis| title=Probability, Random Variables and Stochastic Processes| date=1991| publisher=McGraw-Hill}}</ref>
<math display="block">\operatorname{cov}(X, Y) = \int_\mathbb R \int_\mathbb R \left(F_{(X, Y)}(x, y) - F_X(x)F_Y(y)\right) \,dx \,dy</math>
where <math> F_{(X,Y)}(x,y) </math> is the joint cumulative distribution function of the random vector <math> (X, Y) </math> and <math> F_X(x), F_Y(y) </math> are the
[[Marginal distribution|marginals]].

=== Uncorrelatedness and independence ===
{{main|Correlation and dependence}}
Random variables whose covariance is zero are called [[uncorrelated]].<ref name=KunIlPark/>{{rp|p=121}} Similarly, the components of random vectors whose [[covariance matrix]] is zero in every entry outside the main diagonal are also called uncorrelated.

If <math>X</math> and <math>Y</math> are [[statistical independence|independent random variables]], then their covariance is zero.<ref name=KunIlPark/>{{rp|p=123}}<ref>{{Cite web | url=http://www.randomservices.org/random/expect/Covariance.html| title=Covariance and Correlation | last=Siegrist|first=Kyle| publisher=University of Alabama in Huntsville |access-date=Oct 3, 2022}}</ref> This follows because under independence,
<math display="block">\operatorname{E}[XY] = \operatorname{E}[X] \cdot \operatorname{E}[Y]. </math>

The converse, however, is not generally true. For example, let <math>X</math> be uniformly distributed in <math>[-1,1]</math> and let <math>Y = X^2</math>. Clearly, <math>X</math> and <math>Y</math> are not independent, but
<math display="block">\begin{align}
\operatorname{cov}(X, Y) &= \operatorname{cov}\left(X, X^2\right) \\
&= \operatorname{E}\left[X \cdot X^2\right] - \operatorname{E}[X] \cdot \operatorname{E}\left[X^2\right] \\
&= \operatorname{E}\left[X^3\right] - \operatorname{E}[X]\operatorname{E}\left[X^2\right] \\
&= 0 - 0 \cdot \operatorname{E}[X^2] \\
&= 0.
\end{align}</math>
In this case, the relationship between <math>Y</math> and <math>X</math> is non-linear, while correlation and covariance are measures of linear dependence between two random variables. This example shows that if two random variables are uncorrelated, that does not in general imply that they are independent.
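The counterexample above can be checked by simulation. The following is an illustrative sketch (not drawn from the article's sources), assuming NumPy; the theoretical covariance of <math>X</math> and <math>X^2</math> is exactly zero, so the sample covariance should be near zero even though <math>Y</math> is a deterministic function of <math>X</math>:

```python
# Simulate the counterexample: X uniform on [-1, 1], Y = X^2.
# Y is a deterministic function of X, yet their covariance is zero.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=200_000)
Y = X**2

# Sample covariance (1/n normalization).
sample_cov = np.mean((X - X.mean()) * (Y - Y.mean()))
assert abs(sample_cov) < 0.01  # theoretical value is exactly 0

# Dependence is nonetheless visible: Y concentrates near 1 when |X| is large.
assert Y[np.abs(X) > 0.9].mean() > 2 * Y.mean()
```

The second assertion makes the dependence concrete: conditioning on <math>|X| > 0.9</math> sharply shifts the distribution of <math>Y</math>, which could not happen if <math>X</math> and <math>Y</math> were independent.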
However, if two variables are [[Multivariate normal distribution|jointly normally distributed]] (but not if they are merely [[Normally distributed and uncorrelated does not imply independent|individually normally distributed]]), uncorrelatedness ''does'' imply independence.<ref>{{Cite book |title=A modern introduction to probability and statistics: understanding why and how |date=2005 |publisher=Springer |isbn=978-1-85233-896-1 |editor-last=Dekking |editor-first=Michel |series=Springer texts in statistics |location=London [Heidelberg]}}</ref>

Random variables <math>X</math> and <math>Y</math> whose covariance is positive are called positively correlated: if <math>X>E[X]</math>, then <math>Y>E[Y]</math> is likely. Conversely, <math>X</math> and <math>Y</math> with negative covariance are negatively correlated: if <math>X>E[X]</math>, then <math>Y<E[Y]</math> is likely.

=== Relationship to inner products ===
Many of the properties of covariance follow elegantly from the observation that it satisfies properties similar to those of an [[inner product]]:
# [[bilinear operator|bilinear]]: for constants <math>a</math> and <math>b</math> and random variables <math>X,Y,Z,</math> <math> \operatorname{cov}(aX+bY,Z) = a \operatorname{cov}(X,Z) + b \operatorname{cov}(Y,Z)</math>
# symmetric: <math>\operatorname{cov}(X,Y) = \operatorname{cov}(Y,X)</math>
# [[definite bilinear form|positive semi-definite]]: <math>\sigma^2(X) = \operatorname{cov}(X,X) \ge 0</math> for all random variables <math>X</math>, and <math>\operatorname{cov}(X,X) = 0</math> implies that <math>X</math> is constant [[almost surely]].

In fact these properties imply that the covariance defines an inner product over the [[quotient space (linear algebra)|quotient vector space]] obtained by taking the subspace of random variables with finite second moment and identifying any two that differ by a constant. (This identification turns the positive semi-definiteness above into positive definiteness.)
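The inner-product-like properties can also be verified numerically, since the ''sample'' covariance is itself exactly bilinear, symmetric, and positive semi-definite. A minimal sketch (illustrative only, assuming NumPy; constants and seed are arbitrary):

```python
# Check the inner-product-like properties of covariance on samples.
# Sample covariance is exactly bilinear and symmetric, up to
# floating-point rounding.
import numpy as np

rng = np.random.default_rng(0)
X, Y, Z = rng.standard_normal((3, 10_000))
a, b = 2.0, -3.0

def cov(u, v):
    """Sample covariance (1/n normalization) of two equal-length samples."""
    return np.mean((u - u.mean()) * (v - v.mean()))

# 1. Bilinearity in the first argument:
assert np.isclose(cov(a * X + b * Y, Z), a * cov(X, Z) + b * cov(Y, Z))

# 2. Symmetry:
assert np.isclose(cov(X, Y), cov(Y, X))

# 3. Positive semi-definiteness: cov(X, X) = var(X) >= 0.
assert cov(X, X) >= 0.0
assert np.isclose(cov(X, X), X.var())
```

The same `cov` helper also reproduces the linear-combination identities from the earlier subsection, e.g. shift invariance `cov(X + a, Y + b) == cov(X, Y)`, since those are consequences of bilinearity and symmetry alone.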
That quotient vector space is isomorphic to the subspace of random variables with finite second moment and mean zero; on that subspace, the covariance is exactly the [[Lp space|L<sup>2</sup>]] inner product of real-valued functions on the sample space.

As a result, for random variables with finite variance, the inequality
<math display="block">\left|\operatorname{cov}(X, Y)\right| \le \sqrt{\sigma^2(X) \sigma^2(Y)} </math>
holds via the [[Cauchy–Schwarz inequality]].

Proof: If <math>\sigma^2(Y) = 0</math>, then the inequality holds trivially. Otherwise, let the random variable
<math display="block"> Z = X - \frac{\operatorname{cov}(X, Y)}{\sigma^2(Y)} Y.</math>
Then we have
<math display="block">\begin{align}
0 \le \sigma^2(Z) &= \operatorname{cov}\left( X - \frac{\operatorname{cov}(X, Y)}{\sigma^2(Y)} Y,\; X - \frac{\operatorname{cov}(X, Y)}{\sigma^2(Y)} Y \right) \\[12pt]
&= \sigma^2(X) - \frac{(\operatorname{cov}(X, Y))^2}{\sigma^2(Y)} \\
\implies (\operatorname{cov}(X, Y))^2 &\le \sigma^2(X)\sigma^2(Y) \\
\left|\operatorname{cov}(X, Y)\right| &\le \sqrt{\sigma^2(X)\sigma^2(Y)}
\end{align}</math>
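The bound can be illustrated numerically. A sketch (assuming NumPy; the coefficients are arbitrary): because sample covariance is also an inner product on centered samples, the Cauchy–Schwarz bound holds for sample quantities as well, with equality exactly when one sample is an affine function of the other.

```python
# Illustrate |cov(X, Y)| <= sqrt(var(X) var(Y)), with equality
# when Y is an affine function of X.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal(100_000)
Y = 0.6 * X + rng.standard_normal(100_000)  # noisily correlated with X

def cov(u, v):
    """Sample covariance (1/n normalization)."""
    return np.mean((u - u.mean()) * (v - v.mean()))

# Strict inequality for the noisy relationship:
assert abs(cov(X, Y)) <= np.sqrt(X.var() * Y.var())

# Equality (up to floating point) for an affine relationship Z = -3X + 5:
Z = -3.0 * X + 5.0
assert np.isclose(abs(cov(X, Z)), np.sqrt(X.var() * Z.var()))
```

Here `cov(X, Z)` equals <math>-3\,\operatorname{var}(X)</math> and the bound equals <math>3\,\operatorname{var}(X)</math>, matching the equality case of Cauchy–Schwarz.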