===Decomposition===
The general formula for variance decomposition or the [[law of total variance]] is: if <math>X</math> and <math>Y</math> are two random variables, and the variance of <math>X</math> exists, then
<math display="block">\operatorname{Var}[X] = \operatorname{E}(\operatorname{Var}[X\mid Y]) + \operatorname{Var}(\operatorname{E}[X\mid Y]).</math>

The [[conditional expectation]] <math>\operatorname E(X\mid Y)</math> of <math>X</math> given <math>Y</math>, and the [[conditional variance]] <math>\operatorname{Var}(X\mid Y)</math>, may be understood as follows. Given any particular value ''y'' of the random variable ''Y'', there is a conditional expectation <math>\operatorname E(X\mid Y=y)</math> given the event ''Y'' = ''y''. This quantity depends on the particular value ''y''; it is a function <math>g(y) = \operatorname E(X\mid Y=y)</math>. That same function evaluated at the random variable ''Y'' is the conditional expectation <math>\operatorname E(X\mid Y) = g(Y).</math>

In particular, if <math>Y</math> is a discrete random variable assuming possible values <math>y_1, y_2, y_3, \ldots</math> with corresponding probabilities <math>p_1, p_2, p_3, \ldots</math>, then in the formula for total variance, the first term on the right-hand side becomes
<math display="block">\operatorname{E}(\operatorname{Var}[X \mid Y]) = \sum_i p_i \sigma^2_i,</math>
where <math>\sigma^2_i = \operatorname{Var}[X \mid Y = y_i]</math>. Similarly, the second term on the right-hand side becomes
<math display="block">\operatorname{Var}(\operatorname{E}[X \mid Y]) = \sum_i p_i \mu_i^2 - \left(\sum_i p_i \mu_i\right)^2 = \sum_i p_i \mu_i^2 - \mu^2,</math>
where <math>\mu_i = \operatorname{E}[X \mid Y = y_i]</math> and <math>\mu = \sum_i p_i \mu_i</math>. Thus the total variance is given by
<math display="block">\operatorname{Var}[X] = \sum_i p_i \sigma^2_i + \left( \sum_i p_i \mu_i^2 - \mu^2 \right).</math>
A worked numerical example is given at the end of this section.

A similar formula is applied in [[analysis of variance]], where the corresponding formula is
<math display="block">\mathit{MS}_\text{total} = \mathit{MS}_\text{between} + \mathit{MS}_\text{within};</math>
here <math>\mathit{MS}</math> refers to the mean of the squares. In [[linear regression]] analysis the corresponding formula is
<math display="block">\mathit{MS}_\text{total} = \mathit{MS}_\text{regression} + \mathit{MS}_\text{residual}.</math>
This can also be derived from the additivity of variances: the total (observed) score is the sum of the predicted score and the error score, these two components are uncorrelated, and for uncorrelated random variables the variance of a sum is the sum of the variances.

Similar decompositions are possible for the sum of squared deviations (sum of squares, <math>\mathit{SS}</math>):
<math display="block">\mathit{SS}_\text{total} = \mathit{SS}_\text{between} + \mathit{SS}_\text{within},</math>
<math display="block">\mathit{SS}_\text{total} = \mathit{SS}_\text{regression} + \mathit{SS}_\text{residual}.</math>
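A small worked example may make the discrete decomposition concrete; the numbers below are chosen purely for illustration and are not drawn from any dataset. Suppose <math>Y</math> takes the two values <math>y_1</math> and <math>y_2</math> with <math>p_1 = p_2 = \tfrac{1}{2}</math>, and the conditional moments are <math>\mu_1 = 0</math>, <math>\sigma^2_1 = 1</math> and <math>\mu_2 = 2</math>, <math>\sigma^2_2 = 3</math>. Then <math>\mu = \tfrac{1}{2}(0) + \tfrac{1}{2}(2) = 1</math>, and the two terms of the decomposition are
<math display="block">\operatorname{E}(\operatorname{Var}[X \mid Y]) = \tfrac{1}{2}(1) + \tfrac{1}{2}(3) = 2, \qquad \operatorname{Var}(\operatorname{E}[X \mid Y]) = \tfrac{1}{2}(0)^2 + \tfrac{1}{2}(2)^2 - 1^2 = 1,</math>
so that <math>\operatorname{Var}[X] = 2 + 1 = 3</math>.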