====Sum of correlated variables====

=====Sum of correlated variables with fixed sample size=====
{{main article|Bienaymé's identity}}
In general, the variance of the sum of {{math|n}} variables is the sum of their [[covariance]]s:
<math display="block">\operatorname{Var}\left(\sum_{i=1}^n X_i\right) = \sum_{i=1}^n \sum_{j=1}^n \operatorname{Cov}\left(X_i, X_j\right) = \sum_{i=1}^n \operatorname{Var}\left(X_i\right) + 2 \sum_{1 \leq i < j\leq n} \operatorname{Cov}\left(X_i, X_j\right).</math>
(Note: The second equality comes from the fact that {{math|1=Cov(''X''<sub>''i''</sub>,''X''<sub>''i''</sub>) = Var(''X''<sub>''i''</sub>)}}.)

Here, <math>\operatorname{Cov}(\cdot,\cdot)</math> is the [[covariance]], which is zero for independent random variables (if it exists). The first expression states that the variance of a sum equals the sum of all elements of the covariance matrix of the components. The second expression states, equivalently, that the variance of the sum is the sum of the diagonal of the covariance matrix plus twice the sum of its upper triangular elements (or its lower triangular elements), which reflects the symmetry of the covariance matrix. This formula is used in the theory of [[Cronbach's alpha]] in [[classical test theory]].

So, if the variables have equal variance ''σ''<sup>2</sup> and the average [[correlation]] of distinct variables is ''ρ'', then the variance of their mean is
<math display="block">\operatorname{Var}\left(\overline{X}\right) = \frac{\sigma^2}{n} + \frac{n - 1}{n}\rho\sigma^2.</math>
This implies that the variance of the mean increases with the average of the correlations. In other words, additional correlated observations are not as effective as additional independent observations at reducing the [[standard error|uncertainty of the mean]].
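The identities above can be checked numerically. The following is a minimal sketch, not a reference implementation: it assumes a hypothetical equicorrelated covariance matrix (common variance <code>sigma2</code>, common correlation <code>rho</code>), and all function names and numeric values are chosen only for illustration.

```python
import math
from itertools import combinations


def var_of_sum(cov):
    """Variance of X_1 + ... + X_n as the sum of all covariance-matrix entries."""
    return sum(sum(row) for row in cov)


def var_of_sum_split(cov):
    """Same quantity: diagonal sum plus twice the upper-triangular sum."""
    n = len(cov)
    diagonal = sum(cov[i][i] for i in range(n))
    upper = sum(cov[i][j] for i, j in combinations(range(n), 2))
    return diagonal + 2 * upper


def equicorrelated_cov(n, sigma2, rho):
    """Hypothetical covariance matrix: variance sigma2, correlation rho for all pairs."""
    return [[sigma2 if i == j else rho * sigma2 for j in range(n)]
            for i in range(n)]


def var_of_mean_formula(n, sigma2, rho):
    """Closed form: sigma^2 / n + (n - 1) / n * rho * sigma^2."""
    return sigma2 / n + (n - 1) / n * rho * sigma2


# Illustrative values only (assumptions, not taken from the article).
n, sigma2, rho = 5, 4.0, 0.3
cov = equicorrelated_cov(n, sigma2, rho)

# Var(mean) = Var(sum) / n^2, computed directly from the covariance matrix.
var_mean_from_matrix = var_of_sum(cov) / n**2

print(math.isclose(var_of_sum(cov), var_of_sum_split(cov)))
print(math.isclose(var_mean_from_matrix, var_of_mean_formula(n, sigma2, rho)))
```

Both checks compare a direct computation over the covariance matrix with the corresponding closed form. Setting <code>rho</code> to zero recovers the familiar <code>sigma2 / n</code> for uncorrelated variables.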
Moreover, if the variables have unit variance, for example if they are standardized, then this simplifies to
<math display="block">\operatorname{Var}\left(\overline{X}\right) = \frac{1}{n} + \frac{n - 1}{n}\rho.</math>
This formula is used in the [[Spearman–Brown prediction formula]] of classical test theory. It converges to ''ρ'' as ''n'' goes to infinity, provided that the average correlation remains constant or converges too. So for the variance of the mean of standardized variables with equal correlations or converging average correlation we have
<math display="block">\lim_{n \to \infty} \operatorname{Var}\left(\overline{X}\right) = \rho.</math>
Therefore, the variance of the mean of a large number of standardized variables is approximately equal to their average correlation. This makes clear that the sample mean of correlated variables does not generally converge to the population mean, even though the [[law of large numbers]] states that the sample mean will converge for independent variables.

=====Sum of uncorrelated variables with random sample size=====
There are cases when a sample is taken without knowing, in advance, how many observations will be acceptable according to some criterion. In such cases, the sample size {{math|N}} is a random variable whose variation adds to the variation of {{math|X}}, such that,<ref>Cornell, J R, and Benjamin, C A, ''Probability, Statistics, and Decisions for Civil Engineers,'' McGraw-Hill, NY, 1970, pp.178-9.</ref>
<math display="block">\operatorname{Var}\left(\sum_{i=1}^{N}X_i\right)=\operatorname{E}\left[N\right]\operatorname{Var}(X)+\operatorname{Var}(N)(\operatorname{E}\left[X\right])^2</math>
which follows from the [[law of total variance]], assuming that {{math|N}} is independent of the {{math|''X''<sub>''i''</sub>}} and that these share a common mean and variance.

If {{math|N}} has a [[Poisson distribution]], then <math>\operatorname{E}[N]=\operatorname{Var}(N)</math> with estimator {{math|n}} = {{math|N}}. So, the estimator of <math>\operatorname{Var}\left(\sum_{i=1}^{n}X_i\right)</math> becomes <math>n{S_x}^2+n\bar{X}^2</math>, giving <math>\operatorname{SE}(\bar{X})=\sqrt{\frac{{S_x}^2+\bar{X}^2}{n}}</math> (see [[Standard error#Standard_error_of_the_sample_mean|standard error of the sample mean]]).
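The random-sample-size formula can be verified exactly on a small example. The sketch below assumes hypothetical toy distributions for {{math|N}} and {{math|X}} (finite, with {{math|N}} independent of the i.i.d. {{math|''X''<sub>''i''</sub>}}), enumerates the exact distribution of the sum, and compares its variance with the formula. The helper names and the sample used for the Poisson-case standard error are illustrative assumptions.

```python
import math
import statistics
from fractions import Fraction
from itertools import product


def mean_var(pmf):
    """Mean and variance of a finite distribution given as {value: probability}."""
    mean = sum(p * v for v, p in pmf.items())
    var = sum(p * (v - mean) ** 2 for v, p in pmf.items())
    return mean, var


def random_sum_pmf(n_pmf, x_pmf):
    """Exact distribution of X_1 + ... + X_N for i.i.d. X_i independent of N."""
    out = {}
    for n, p_n in n_pmf.items():
        # Enumerate every possible sequence of n draws (one empty sequence if n == 0).
        for draws in product(x_pmf.items(), repeat=n):
            total = sum(value for value, _ in draws)
            prob = p_n
            for _, p in draws:
                prob *= p
            out[total] = out.get(total, 0) + prob
    return out


def poisson_se_of_mean(sample):
    """SE of the mean when the sample size is treated as Poisson: sqrt((S^2 + mean^2) / n)."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s2 = statistics.variance(sample)  # unbiased sample variance
    return math.sqrt((s2 + xbar**2) / n)


# Hypothetical toy distributions (assumptions for illustration only).
n_pmf = {k: Fraction(1, 4) for k in range(4)}    # N uniform on {0, 1, 2, 3}
x_pmf = {1: Fraction(1, 2), 3: Fraction(1, 2)}   # X is 1 or 3 with equal probability

mean_n, var_n = mean_var(n_pmf)
mean_x, var_x = mean_var(x_pmf)
_, var_exact = mean_var(random_sum_pmf(n_pmf, x_pmf))
var_formula = mean_n * var_x + var_n * mean_x**2  # E[N] Var(X) + Var(N) E[X]^2
print(var_exact == var_formula)

# Poisson case: made-up sample values, used only to exercise the estimator.
sample = [2.0, 3.5, 1.5, 4.0, 3.0]
se = poisson_se_of_mean(sample)
print(se)
```

Exact rational arithmetic is used for the enumeration so that the comparison is an equality rather than a floating-point approximation. The enumeration grows exponentially with the largest value of {{math|N}}, so it is only suitable for small toy cases.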