Weighted arithmetic mean
===Weighted sample covariance===
In a weighted sample, each row vector <math>\mathbf{x}_{i}</math> (each set of single observations on each of the ''K'' random variables) is assigned a weight <math>w_i \geq 0</math>.

Then the [[weighted mean]] vector <math>\mathbf{\mu^*}</math> is given by
:<math>\mathbf{\mu^*}=\frac{\sum_{i=1}^N w_i \mathbf{x}_i}{\sum_{i=1}^N w_i}.</math>

The weighted covariance matrix is given by:<ref name="PRICE-1972">{{cite journal |last1=Price |first1=George R. |title=Extension of covariance selection mathematics |journal=Annals of Human Genetics |date=April 1972 |volume=35 |issue=4 |pages=485–490 |doi=10.1111/j.1469-1809.1957.tb01874.x|pmid=5073694 |s2cid=37828617 |url=http://www.dynamics.org/Altenberg/LIBRARY/REPRINTS/Price_extension_AnnHumGenetLond.1972.pdf}}</ref>
:<math>\mathbf{C} = \frac {\sum_{i=1}^N w_i \left(\mathbf{x}_i - \mathbf{\mu^*}\right)^T \left(\mathbf{x}_i - \mathbf{\mu^*}\right)} {V_1},</math>
where <math>V_1 = \sum_{i=1}^N w_i</math> is the sum of the weights.

As with the weighted sample variance, there are two different unbiased estimators, depending on the type of the weights.

====Frequency weights====
If the weights are ''frequency weights'', the ''unbiased'' weighted estimate of the covariance matrix <math>\textstyle \mathbf{C}</math>, with Bessel's correction, is given by:<ref name="PRICE-1972"/>
:<math>\mathbf{C} = \frac {\sum_{i=1}^N w_i \left(\mathbf{x}_i - \mathbf{\mu^*}\right)^T \left(\mathbf{x}_i - \mathbf{\mu^*}\right)} {V_1 - 1}.</math>

This estimator can be unbiased only if the weights are neither [[Standard score|standardized]] nor [[Normalization (statistics)|normalized]], since these processes change the data's mean and variance and thus lead to a [[Base rate fallacy|loss of the base rate]] (the population count, which is a requirement for Bessel's correction).

====Reliability weights====
In the case of ''reliability weights'', the weights are [[Normalizing constant|normalized]]:
:<math>V_1 = \sum_{i=1}^N w_i = 1.</math>

If they are not, divide the weights by their sum to normalize them before calculating <math>V_1</math>:
:<math>w_i' = \frac{w_i}{\sum_{i=1}^N w_i}.</math>

Then the [[weighted mean]] vector <math>\mathbf{\mu^*}</math> can be simplified to
:<math>\mathbf{\mu^*}=\sum_{i=1}^N w_i \mathbf{x}_i,</math>
and the ''unbiased'' weighted estimate of the covariance matrix <math>\mathbf{C}</math> is:<ref name="Galassi-2007-GSL">Mark Galassi, Jim Davies, James Theiler, Brian Gough, Gerard Jungman, Michael Booth, and Fabrice Rossi. [https://www.gnu.org/software/gsl/manual GNU Scientific Library - Reference manual, Version 1.15], 2011. [https://www.gnu.org/software/gsl/manual/html_node/Weighted-Samples.html Sec. 21.7 Weighted Samples]</ref>
:<math>
\begin{align}
\mathbf{C} &= \frac{\sum_{i=1}^N w_i}{\left(\sum_{i=1}^N w_i\right)^2-\sum_{i=1}^N w_i^2} \sum_{i=1}^N w_i \left(\mathbf{x}_i - \mathbf{\mu^*}\right)^T \left(\mathbf{x}_i - \mathbf{\mu^*}\right) \\
&= \frac {\sum_{i=1}^N w_i \left(\mathbf{x}_i - \mathbf{\mu^*}\right)^T \left(\mathbf{x}_i - \mathbf{\mu^*}\right)} {V_1 - (V_2 / V_1)},
\end{align}
</math>
where <math>V_2 = \sum_{i=1}^N w_i^2</math> is the sum of the squared weights.

The reasoning here is the same as in the previous section. Since the weights are assumed to be normalized, <math>V_1 = 1</math> and this reduces to:
:<math>\mathbf{C}=\frac{\sum_{i=1}^N w_i \left(\mathbf{x}_i - \mathbf{\mu^*}\right)^T \left(\mathbf{x}_i - \mathbf{\mu^*}\right)}{1-V_2}.</math>

If all weights are the same, i.e. <math>w_{i} / V_1=1/N</math>, then the weighted mean and covariance reduce to the unweighted sample mean and covariance above.
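
The three estimators in this section differ only in their denominator, which the following minimal Python sketch tries to make concrete. It is an illustration rather than a reference implementation: the function names (<code>weighted_mean</code>, <code>weighted_cov</code>) and the <code>kind</code> parameter are hypothetical and do not come from the cited sources, and the code assumes that the observations are given as rows of a list of lists and that the weights are non-negative with a positive sum.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def weighted_mean(x: Sequence[Sequence[float]], w: Sequence[float]) -> List[float]:
    """Weighted mean vector: sum_i w_i x_i / sum_i w_i."""
    v1 = sum(w)
    k = len(x[0])
    return [sum(wi * xi[j] for wi, xi in zip(w, x)) / v1 for j in range(k)]


def weighted_cov(
    x: Sequence[Sequence[float]],
    w: Sequence[float],
    kind: str = "biased",  # hypothetical switch: "biased", "frequency" or "reliability"
) -> Matrix:
    """Weighted covariance matrix with the denominator chosen by `kind`."""
    v1 = sum(w)                      # V_1: sum of the weights
    v2 = sum(wi * wi for wi in w)    # V_2: sum of the squared weights
    mu = weighted_mean(x, w)
    k = len(mu)

    # Weighted scatter matrix: sum_i w_i (x_i - mu)^T (x_i - mu)
    scatter = [[0.0] * k for _ in range(k)]
    for wi, xi in zip(w, x):
        d = [xi[j] - mu[j] for j in range(k)]
        for a in range(k):
            for b in range(k):
                scatter[a][b] += wi * d[a] * d[b]

    if kind == "biased":
        denom = v1
    elif kind == "frequency":
        denom = v1 - 1.0             # Bessel's correction; weights must be raw counts
    elif kind == "reliability":
        denom = v1 - v2 / v1         # equals 1 - V_2 when the weights sum to 1
    else:
        raise ValueError(f"unknown kind: {kind!r}")

    return [[value / denom for value in row] for row in scatter]


if __name__ == "__main__":
    data = [[1.0, 2.0], [3.0, 6.0], [5.0, 4.0]]
    print(weighted_cov(data, [2, 1, 1], kind="frequency"))
    print(weighted_cov(data, [0.2, 0.3, 0.5], kind="reliability"))
```

Under these assumptions, frequency weights of <code>[2, 1, 1]</code> give the same result as repeating the first observation twice and using unit weights, and the reliability estimate is unchanged when all weights are multiplied by a common factor, because the general denominator <math>V_1 - (V_2 / V_1)</math> scales together with the numerator.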