== Basic properties ==
All the following formulas are to be understood in an almost sure sense. The ''σ''-algebra <math>\mathcal{H}</math> could be replaced by a random variable <math>Z</math>, i.e. <math>\mathcal{H}=\sigma(Z)</math>.

* Pulling out independent factors:
** If <math>X</math> is [[Independence (probability theory)|independent]] of <math>\mathcal{H}</math>, then <math>E(X\mid\mathcal{H}) = E(X)</math>.
{{hidden begin|toggle=left|title=Proof}}
Let <math>B \in \mathcal{H}</math>. Then <math>X</math> is independent of <math>1_B</math>, so we get that
:<math>\int_B X\,dP = E(X1_B) = E(X)E(1_B) = E(X)P(B) = \int_B E(X)\,dP.</math>
Thus the definition of conditional expectation is satisfied by the constant random variable <math>E(X)</math>, as desired. <math>\square</math>
{{hidden end}}
** If <math>X</math> is independent of <math>\sigma(Y, \mathcal{H})</math>, then <math>E(XY\mid \mathcal{H}) = E(X) \, E(Y\mid\mathcal{H})</math>. Note that this is not necessarily the case if <math>X</math> is only independent of <math>\mathcal{H}</math> and of <math>Y</math>.
** If <math>X,Y</math> are independent, <math>\mathcal{G},\mathcal{H}</math> are independent, <math>X</math> is independent of <math>\mathcal{H}</math> and <math>Y</math> is independent of <math>\mathcal{G}</math>, then <math>E(E(XY\mid\mathcal{G})\mid\mathcal{H}) = E(X) E(Y) = E(E(XY\mid\mathcal{H})\mid\mathcal{G})</math>.
* Stability:
** If <math>X</math> is <math>\mathcal{H}</math>-measurable, then <math>E(X\mid\mathcal{H}) = X</math>.
{{hidden begin|toggle=left|title=Proof}}
For each <math>H\in \mathcal{H}</math> we have <math>\int_H E(X\mid\mathcal{H}) \, dP = \int_H X \, dP</math>, or equivalently
:<math> \int_H \big( E(X\mid\mathcal{H}) - X \big) \, dP = 0. </math>
Since this is true for each <math>H \in \mathcal{H}</math>, and both <math>E(X\mid\mathcal{H})</math> and <math>X</math> are <math>\mathcal{H}</math>-measurable (the former property holds by definition; the latter property is key here), from this one can show
:<math> \int_H \big| E(X\mid\mathcal{H}) - X \big| \, dP = 0, </math>
and this implies <math> E(X\mid\mathcal{H}) = X</math> almost everywhere. <math>\square</math>
{{hidden end}}
** In particular, for sub-''σ''-algebras <math>\mathcal{H}_1\subset\mathcal{H}_2 \subset\mathcal{F}</math> we have <math>E(E(X\mid\mathcal{H}_1)\mid\mathcal{H}_2) = E(X\mid\mathcal{H}_1)</math>. (Note this is different from the tower property below.)
** If ''Z'' is a random variable, then <math>\operatorname{E}(f(Z) \mid Z)=f(Z)</math>. In its simplest form, this says <math>\operatorname{E}(Z \mid Z)=Z</math>.
* Pulling out known factors:
** If <math>X</math> is <math>\mathcal{H}</math>-measurable, then <math>E(XY\mid\mathcal{H}) = X \, E(Y\mid\mathcal{H})</math>.
{{hidden begin|toggle=left|title=Proof}}
All random variables here are assumed without loss of generality to be non-negative. The general case can be treated with <math>X = X^+ - X^-</math>.

Fix <math>A \in \mathcal{H}</math> and let <math>X = 1_A</math>. Then for any <math>H \in \mathcal{H}</math>
:<math>\int_H E(1_A Y \mid \mathcal{H}) \, dP = \int_H 1_A Y \, dP = \int_{A \cap H} Y \, dP = \int_{A\cap H} E(Y\mid\mathcal{H}) \, dP = \int_H 1_A E(Y \mid \mathcal{H}) \, dP. </math>
Hence <math> E(1_A Y \mid \mathcal{H}) = 1_A E(Y\mid\mathcal{H})</math> almost everywhere. Any simple function is a finite linear combination of indicator functions.
By linearity the above property holds for simple functions: if <math>X_n</math> is a simple function, then <math>E(X_n Y \mid \mathcal{H}) = X_n \, E(Y\mid \mathcal{H})</math>.

Now let <math>X</math> be <math>\mathcal{H}</math>-measurable. Then there exists a sequence of simple functions <math>\{ X_n \}_{n\geq 1}</math> converging monotonically (here meaning <math>X_n \leq X_{n+1}</math>) and pointwise to <math>X</math>. Consequently, for <math>Y \geq 0</math>, the sequence <math>\{ X_n Y \}_{n\geq 1}</math> converges monotonically and pointwise to <math>XY</math>. Also, since <math>E(Y\mid\mathcal{H}) \geq 0</math>, the sequence <math>\{ X_n E(Y\mid\mathcal{H}) \}_{n\geq 1}</math> converges monotonically and pointwise to <math>X \, E(Y\mid\mathcal{H})</math>.

Combining the special case proved for simple functions with the definition of conditional expectation and the monotone convergence theorem:
:<math> \int_H X \, E(Y\mid\mathcal{H}) \, dP = \int_H \lim_{n \to \infty} X_n \, E(Y\mid\mathcal{H}) \, dP = \lim_{n \to \infty} \int_H X_n E(Y\mid\mathcal{H}) \, dP = \lim_{n \to \infty} \int_H E(X_n Y\mid\mathcal{H}) \, dP = \lim_{n \to \infty} \int_H X_n Y \, dP = \int_H \lim_{n\to \infty} X_n Y \, dP = \int_H XY \, dP = \int_H E(XY\mid\mathcal{H}) \, dP.</math>
This holds for all <math>H\in \mathcal{H}</math>, whence <math>X \, E(Y\mid\mathcal{H}) = E(XY\mid\mathcal{H})</math> almost everywhere. <math>\square</math>
{{hidden end}}
** If ''Z'' is a random variable, then <math>\operatorname{E}(f(Z) Y \mid Z)=f(Z)\operatorname{E}(Y \mid Z)</math>.
* [[Law of total expectation]]: <math>E(E(X \mid \mathcal{H})) = E(X)</math>.<ref>{{Cite web|title=Conditional expectation|url=https://www.statlect.com/fundamentals-of-probability/conditional-expectation|access-date=2020-09-11|website=www.statlect.com}}</ref>
* Tower property (see the worked example following this list):
** For sub-''σ''-algebras <math>\mathcal{H}_1\subset\mathcal{H}_2 \subset\mathcal{F}</math> we have <math>E(E(X\mid\mathcal{H}_2)\mid\mathcal{H}_1) = E(X\mid\mathcal{H}_1)</math>.
*** The special case <math>\mathcal{H}_1=\{\emptyset, \Omega\}</math> recovers the law of total expectation: <math>E(E(X\mid\mathcal{H}_2)) = E(X)</math>.
*** Another special case is when ''Z'' is an <math>\mathcal{H}</math>-measurable random variable. Then <math>\sigma(Z) \subset \mathcal{H}</math> and thus <math>E(E(X \mid \mathcal{H}) \mid Z) = E(X \mid Z)</math>.
*** [[Doob martingale]] property: the above with <math>Z = E(X \mid \mathcal{H})</math> (which is <math>\mathcal{H}</math>-measurable), and using also <math>\operatorname{E}(Z \mid Z)=Z</math>, gives <math>E(X \mid E(X \mid \mathcal{H})) = E(X \mid \mathcal{H})</math>.
** For random variables <math>X,Y</math> we have <math>E(E(X\mid Y)\mid f(Y)) = E(X\mid f(Y))</math>.
** For random variables <math>X,Y,Z</math> we have <math>E(E(X\mid Y,Z)\mid Y) = E(X\mid Y)</math>.
* Linearity: we have <math>E(X_1 + X_2 \mid \mathcal{H}) = E(X_1 \mid \mathcal{H}) + E(X_2 \mid \mathcal{H})</math> and <math>E(a X \mid \mathcal{H}) = a\,E(X \mid \mathcal{H})</math> for <math>a\in\R</math>.
* Positivity: If <math>X \ge 0</math> then <math>E(X \mid \mathcal{H}) \ge 0</math>.
* Monotonicity: If <math>X_1 \le X_2</math> then <math>E(X_1 \mid \mathcal{H}) \le E(X_2 \mid \mathcal{H})</math>.
* [[Monotone convergence theorem|Monotone convergence]]: If <math>0\leq X_n \uparrow X</math> then <math>E(X_n \mid \mathcal{H}) \uparrow E(X \mid \mathcal{H})</math>.
* [[Dominated convergence theorem|Dominated convergence]]: If <math>X_n \to X</math> and <math>|X_n| \le Y</math> with <math>Y \in L^1</math>, then <math>E(X_n \mid \mathcal{H}) \to E(X \mid \mathcal{H})</math>.
* [[Fatou's lemma]]: If <math>\textstyle E(\inf_n X_n \mid \mathcal{H}) > -\infty</math> then <math>\textstyle E(\liminf_{n\to\infty} X_n \mid \mathcal{H}) \le \liminf_{n\to\infty} E(X_n \mid \mathcal{H})</math>.
* [[Jensen's inequality]]: If <math>f \colon \mathbb{R} \rightarrow \mathbb{R}</math> is a [[convex function]], then <math>f(E(X\mid \mathcal{H})) \le E(f(X)\mid\mathcal{H})</math>.
* [[Conditional variance]]: Using the conditional expectation we can define, by analogy with the definition of the [[variance]] as the mean square deviation from the average, the conditional variance:
** Definition: <math>\operatorname{Var}(X \mid \mathcal{H}) = \operatorname{E}\bigl( (X - \operatorname{E}(X \mid \mathcal{H}))^2 \mid \mathcal{H} \bigr)</math>
** Algebraic formula for the variance: <math>\operatorname{Var}(X \mid \mathcal{H}) = \operatorname{E}(X^2 \mid \mathcal{H}) - \bigl(\operatorname{E}(X \mid \mathcal{H})\bigr)^2</math> (a short derivation is given after this list)
** [[Law of total variance]]: <math>\operatorname{Var}(X) = \operatorname{E}(\operatorname{Var}(X \mid \mathcal{H})) + \operatorname{Var}(\operatorname{E}(X \mid \mathcal{H}))</math> (see the simulation sketch after this list).
* [[Martingale convergence theorem|Martingale convergence]]: For a random variable <math>X</math> with finite expectation, we have <math>E(X\mid\mathcal{H}_n) \to E(X\mid\mathcal{H})</math> if either <math>\mathcal{H}_1 \subset \mathcal{H}_2 \subset \dotsb</math> is an increasing sequence of sub-''σ''-algebras and <math>\textstyle \mathcal{H} = \sigma(\bigcup_{n=1}^\infty \mathcal{H}_n)</math>, or if <math>\mathcal{H}_1 \supset \mathcal{H}_2 \supset \dotsb</math> is a decreasing sequence of sub-''σ''-algebras and <math>\textstyle \mathcal{H} = \bigcap_{n=1}^\infty \mathcal{H}_n</math>.
* Conditional expectation as <math>L^2</math>-projection: If <math>X,Y</math> are in the [[Hilbert space]] of [[square-integrable]] real random variables (real random variables with finite second moment), then
** for <math>\mathcal{H}</math>-measurable <math>Y</math>, we have <math>E(Y(X - E(X\mid\mathcal{H}))) = 0</math>, i.e. the conditional expectation <math>E(X\mid\mathcal{H})</math> is, in the sense of the [[Lp space|''L''<sup>2</sup>(''P'')]] scalar product, the [[orthogonal projection]] of <math>X</math> onto the [[linear subspace]] of <math>\mathcal{H}</math>-measurable functions. (This allows one to define and prove the existence of the conditional expectation based on the [[Hilbert projection theorem]].)
** the mapping <math>X \mapsto \operatorname{E}(X\mid\mathcal{H})</math> is [[self-adjoint operator|self-adjoint]]: <math>\operatorname E(X \operatorname E(Y \mid \mathcal{H})) = \operatorname E\left(\operatorname E(X \mid \mathcal{H}) \operatorname E(Y \mid \mathcal{H})\right) = \operatorname E(\operatorname E(X \mid \mathcal{H}) Y)</math>
* Conditioning is a [[Contraction (operator theory)|contractive]] projection of [[Lp space|''L''<sup>p</sup>]] spaces <math>L^p(\Omega, \mathcal{F}, P) \rightarrow L^p(\Omega, \mathcal{H}, P)</math>, that is, <math>\operatorname{E}\big(|\operatorname{E}(X \mid\mathcal{H})|^p \big) \le \operatorname{E}\big(|X|^p\big)</math> for any ''p'' ≥ 1.
* Doob's conditional independence property:<ref>{{Cite book|title=Foundations of Modern Probability|last=Kallenberg|first=Olav|publisher=Springer|year=2001|isbn=0-387-95313-2|edition=2nd|location=York, PA, USA|pages=110}}</ref> If <math>X,Y</math> are [[conditionally independent]] given <math>Z</math>, then <math>P(X \in B\mid Y,Z) = P(X \in B\mid Z)</math> (equivalently, <math>E(1_{\{X \in B\}}\mid Y,Z) = E(1_{\{X \in B\}} \mid Z)</math>).
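
For a concrete illustration of the tower property and the law of total expectation, let <math>Y</math> be the result of one roll of a fair die, let <math>X = Y^2</math>, and take the nested sub-''σ''-algebras <math>\mathcal{H}_1 = \sigma(\{Y \le 3\}) \subset \mathcal{H}_2 = \sigma(Y)</math>. By stability, <math>E(X \mid \mathcal{H}_2) = Y^2</math>, while
:<math>E(X \mid \mathcal{H}_1) = \frac{1+4+9}{3}\,1_{\{Y \le 3\}} + \frac{16+25+36}{3}\,1_{\{Y > 3\}} = \frac{14}{3}\,1_{\{Y \le 3\}} + \frac{77}{3}\,1_{\{Y > 3\}}.</math>
Conditioning <math>E(X \mid \mathcal{H}_2) = Y^2</math> further on <math>\mathcal{H}_1</math> yields exactly the same random variable, as the tower property requires, and the choice <math>\mathcal{H}_1 = \{\emptyset, \Omega\}</math> recovers the law of total expectation: <math>E(E(X \mid \mathcal{H}_2)) = \tfrac{1}{2}\cdot\tfrac{14}{3} + \tfrac{1}{2}\cdot\tfrac{77}{3} = \tfrac{91}{6} = E(Y^2)</math>.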
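The algebraic formula for the conditional variance follows from linearity, pulling out known factors, and stability: writing <math>m = \operatorname{E}(X \mid \mathcal{H})</math>, which is <math>\mathcal{H}</math>-measurable,
:<math>\operatorname{Var}(X \mid \mathcal{H}) = \operatorname{E}\bigl( X^2 - 2mX + m^2 \mid \mathcal{H}\bigr) = \operatorname{E}(X^2 \mid \mathcal{H}) - 2m\operatorname{E}(X \mid \mathcal{H}) + m^2 = \operatorname{E}(X^2 \mid \mathcal{H}) - \bigl(\operatorname{E}(X \mid \mathcal{H})\bigr)^2.</math>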
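The law of total expectation and the law of total variance can also be checked numerically. The following minimal sketch, assuming Python with NumPy, simulates a two-stage experiment in which <math>Z \sim N(0,1)</math> and <math>X \mid Z \sim N(Z,1)</math>, so that <math>\operatorname{E}(X \mid Z) = Z</math> and <math>\operatorname{Var}(X \mid Z) = 1</math>, and compares the sample moments of <math>X</math> with the two identities.
<syntaxhighlight lang="python">
import numpy as np

# Two-stage experiment: Z ~ N(0, 1), then X | Z ~ N(Z, 1),
# so E(X | Z) = Z and Var(X | Z) = 1 exactly.
rng = np.random.default_rng(0)
n = 1_000_000

z = rng.standard_normal(n)          # samples of Z
x = z + rng.standard_normal(n)      # samples of X given Z

# Law of total expectation: E(X) = E(E(X | Z)) = E(Z) = 0.
print(x.mean(), z.mean())

# Law of total variance: Var(X) = E(Var(X | Z)) + Var(E(X | Z)) = 1 + Var(Z) = 2.
print(x.var(), 1.0 + z.var())
</syntaxhighlight>
Both printed pairs should agree up to Monte Carlo error of order <math>n^{-1/2}</math>.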