
===Weighted sample covariance===
In a weighted sample, each row vector <math> \mathbf{x}_{i}</math> (each set of single observations on each of the ''K'' random variables) is assigned a weight <math>w_i \geq0</math>.

Then the [[weighted mean]] vector <math> \mathbf{\mu^*}</math> is given by

:<math> \mathbf{\mu^*}=\frac{\sum_{i=1}^N w_i \mathbf{x}_i}{\sum_{i=1}^N w_i}.</math>

And the weighted covariance matrix is given by:<ref name="PRICE-1972">{{cite journal |last1=Price |first1=George R. |title=Extension of covariance selection mathematics |journal=Annals of Human Genetics |date=April 1972 |volume=35 |issue=4 |pages=485–490 |doi=10.1111/j.1469-1809.1957.tb01874.x|pmid=5073694 |s2cid=37828617 |url=http://www.dynamics.org/Altenberg/LIBRARY/REPRINTS/Price_extension_AnnHumGenetLond.1972.pdf}}</ref>

:<math>\mathbf{C} = \frac {\sum_{i=1}^N w_i \left(\mathbf{x}_i - \mu^*\right)^T \left(\mathbf{x}_i - \mu^*\right)} {V_1},</math>

where <math>V_1 = \sum_{i=1}^N w_i</math> is the sum of the weights.
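The two formulas above can be sketched in plain Python as follows. This is only an illustrative sketch: the function names <code>weighted_mean</code> and <code>weighted_cov</code> and the sample data are assumptions made for this example, not part of any cited source.

```python
def weighted_mean(rows, weights):
    """Weighted mean vector of N observations (rows) on K variables."""
    v1 = sum(weights)  # V_1, the sum of the weights
    k = len(rows[0])
    return [sum(w * x[j] for w, x in zip(weights, rows)) / v1 for j in range(k)]


def weighted_cov(rows, weights):
    """K x K weighted covariance matrix with denominator V_1."""
    v1 = sum(weights)
    mu = weighted_mean(rows, weights)
    k = len(mu)
    return [
        [
            sum(w * (x[a] - mu[a]) * (x[b] - mu[b]) for w, x in zip(weights, rows)) / v1
            for b in range(k)
        ]
        for a in range(k)
    ]


# Hypothetical data: N = 3 observations on K = 2 variables.
rows = [[1.0, 2.0], [2.0, 4.0], [4.0, 3.0]]
weights = [1, 2, 1]

mu = weighted_mean(rows, weights)  # [2.25, 3.25]
C = weighted_cov(rows, weights)    # [[1.1875, 0.1875], [0.1875, 0.6875]]
```

The result is symmetric by construction, since entry (''a'', ''b'') and entry (''b'', ''a'') sum the same products.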

The estimator above is biased. As with the weighted sample variance, there are two different unbiased estimators, depending on the type of the weights.

====Frequency weights====
If the weights are ''frequency weights'', the ''unbiased'' weighted estimate of the covariance matrix <math>\textstyle \mathbf{C}</math>, with Bessel's correction, is given by:<ref name="PRICE-1972"/>

:<math>\mathbf{C} = \frac {\sum_{i=1}^N w_i \left(\mathbf{x}_i - \mu^*\right)^T \left(\mathbf{x}_i - \mu^*\right)} {V_1 - 1}.</math>

This estimator is unbiased only if the weights are neither [[Standard score|standardized]] nor [[Normalization (statistics)|normalized]]: these processes change the data's mean and variance and thus lead to a [[Base rate fallacy|loss of the base rate]] (the population count, which Bessel's correction requires).
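One way to see why the denominator <math>V_1 - 1</math> is appropriate for frequency weights is to compare the weighted estimate with the ordinary sample covariance of the data set in which each row is repeated <math>w_i</math> times. The sketch below does this in plain Python; the function name <code>cov_frequency</code> and the data are assumptions made for illustration.

```python
def cov_frequency(rows, weights):
    """Covariance estimate for frequency weights (denominator V_1 - 1)."""
    v1 = sum(weights)  # total number of underlying observations
    k = len(rows[0])
    mu = [sum(w * x[j] for w, x in zip(weights, rows)) / v1 for j in range(k)]
    return [
        [
            sum(w * (x[a] - mu[a]) * (x[b] - mu[b]) for w, x in zip(weights, rows))
            / (v1 - 1)
            for b in range(k)
        ]
        for a in range(k)
    ]


# Hypothetical data: the second row was observed twice.
rows = [[1.0, 2.0], [2.0, 4.0], [4.0, 3.0]]
counts = [1, 2, 1]

# Repeat each row according to its count, then use unit weights,
# which gives the usual unweighted sample covariance with N - 1.
expanded = [row for row, c in zip(rows, counts) for _ in range(c)]

c_weighted = cov_frequency(rows, counts)
c_expanded = cov_frequency(expanded, [1] * len(expanded))
```

Both computations should agree up to floating-point rounding. If the counts were first normalized to sum to 1, <math>V_1 - 1</math> would be zero and the estimate would be undefined, which illustrates why the raw counts are needed here.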

==== Reliability weights ====
In the case of ''reliability weights'', the weights are [[Normalizing constant|normalized]]:
: <math> V_1 = \sum_{i=1}^N w_i = 1. </math>

If they are not, divide the weights by their sum to normalize prior to calculating <math>V_1</math>:

: <math> w_i' = \frac{w_i}{\sum_{i=1}^N w_i} </math>

Then the [[weighted mean]] vector <math> \mathbf{\mu^*}</math> can be simplified to

:<math> \mathbf{\mu^*}=\sum_{i=1}^N w_i \mathbf{x}_i,</math>

and the ''unbiased'' weighted estimate of the covariance matrix <math> \mathbf{C}</math> is:<ref name="Galassi-2007-GSL">Mark Galassi, Jim Davies, James Theiler, Brian Gough, Gerard Jungman, Michael Booth, and Fabrice Rossi. [https://www.gnu.org/software/gsl/manual GNU Scientific Library - Reference manual, Version 1.15], 2011.
[https://www.gnu.org/software/gsl/manual/html_node/Weighted-Samples.html Sec. 21.7 Weighted Samples]</ref>

:<math>
\begin{align}
\mathbf{C} &= \frac{\sum_{i=1}^N w_i}{\left(\sum_{i=1}^N w_i\right)^2-\sum_{i=1}^N w_i^2} \sum_{i=1}^N w_i \left(\mathbf{x}_i - \mu^*\right)^T \left(\mathbf{x}_i - \mu^*\right) \\
&= \frac {\sum_{i=1}^N w_i \left(\mathbf{x}_i - \mu^*\right)^T \left(\mathbf{x}_i - \mu^*\right)} {V_1 - (V_2 / V_1)}.
\end{align}
</math>

Here <math>V_2 = \sum_{i=1}^N w_i^2</math> is the sum of the squared weights. The reasoning here is the same as in the previous section.

Since the weights are assumed to be normalized, <math>V_1 = 1</math> and this reduces to:

: <math>\mathbf{C}=\frac{\sum_{i=1}^N w_i \left(\mathbf{x}_i - \mu^*\right)^T \left(\mathbf{x}_i - \mu^*\right)}{1-V_2}.</math>
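The reliability-weight estimator can be sketched in plain Python as below. The function name <code>cov_reliability</code> and the data are assumptions made for illustration; the function normalizes the weights itself, so the result should not depend on their overall scale.

```python
def cov_reliability(rows, weights):
    """Covariance estimate for reliability weights (denominator 1 - V_2)."""
    total = sum(weights)
    w = [wi / total for wi in weights]  # normalize so that V_1 = 1
    v2 = sum(wi * wi for wi in w)       # V_2, the sum of squared weights
    k = len(rows[0])
    mu = [sum(wi * x[j] for wi, x in zip(w, rows)) for j in range(k)]
    return [
        [
            sum(wi * (x[a] - mu[a]) * (x[b] - mu[b]) for wi, x in zip(w, rows))
            / (1.0 - v2)
            for b in range(k)
        ]
        for a in range(k)
    ]


# Hypothetical data: N = 3 observations on K = 2 variables.
rows = [[1.0, 2.0], [2.0, 4.0], [4.0, 3.0]]

c1 = cov_reliability(rows, [1.0, 2.0, 1.0])
c2 = cov_reliability(rows, [2.0, 4.0, 2.0])  # same weights, rescaled
```

For this data the normalized weights are 0.25, 0.5 and 0.25, so <math>V_2 = 0.375</math> and the denominator is 0.625. Note that the estimate is undefined when a single observation carries all of the weight, because then <math>V_2 = 1</math>.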

If all weights are the same, i.e. <math> w_{i} / V_1=1/N</math>, then the weighted mean and covariance reduce to the unweighted sample mean and covariance above.
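As a quick check of the equal-weights case for reliability weights, take normalized weights <math>w_i = 1/N</math>, so that <math>V_2 = N \cdot (1/N)^2 = 1/N</math>. Writing <math>\bar{\mathbf{x}} = \tfrac{1}{N}\sum_{i=1}^N \mathbf{x}_i</math> for the unweighted mean (a symbol introduced here only for illustration), the estimator becomes

:<math>\mathbf{C} = \frac{\frac{1}{N}\sum_{i=1}^N \left(\mathbf{x}_i - \bar{\mathbf{x}}\right)^T \left(\mathbf{x}_i - \bar{\mathbf{x}}\right)}{1 - \frac{1}{N}} = \frac{1}{N-1}\sum_{i=1}^N \left(\mathbf{x}_i - \bar{\mathbf{x}}\right)^T \left(\mathbf{x}_i - \bar{\mathbf{x}}\right),</math>

which is the usual sample covariance with Bessel's correction.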