Conditional independence
==Rules of conditional independence==
A set of rules governing statements of conditional independence has been derived from the basic definition.<ref>{{cite journal |first=A. P. |last=Dawid |authorlink=Philip Dawid |title=Conditional Independence in Statistical Theory |journal=[[Journal of the Royal Statistical Society, Series B]] |year=1979 |volume=41 |issue=1 |pages=1–31 |mr=0535541 |jstor=2984718 }}</ref><ref name=pearl:2000>J Pearl, Causality: Models, Reasoning, and Inference, 2000, Cambridge University Press</ref> These rules were termed "[[Graphoid]] Axioms" by Pearl and Paz,<ref name=pearl:paz85>{{cite conference | last1 = Pearl | first1 = Judea | author1-link = Judea Pearl | last2 = Paz | first2 = Azaria | editor1-last = du Boulay | editor1-first = Benedict | editor2-last = Hogg | editor2-first = David C. | editor3-last = Steels | editor3-first = Luc | contribution = Graphoids: Graph-Based Logic for Reasoning about Relevance Relations or When would x tell you more about y if you already know z?
| pages = 357–363 | publisher = North-Holland | title = Advances in Artificial Intelligence II, Seventh European Conference on Artificial Intelligence, ECAI 1986, Brighton, UK, July 20–25, 1986, Proceedings | url = https://ftp.cs.ucla.edu/pub/stat_ser/r53-L.pdf | year = 1986}}</ref> because they hold in graphs, where <math>X \perp\!\!\!\perp A\mid B</math> is interpreted to mean: "All paths from ''X'' to ''A'' are intercepted by the set ''B''".<ref name=pearl:88>{{cite book|last1=Pearl|first1=Judea|title=Probabilistic reasoning in intelligent systems: networks of plausible inference|url=https://archive.org/details/probabilisticrea00pear|url-access=registration|date=1988|publisher=Morgan Kaufmann|isbn=9780934613736}}</ref>

===Symmetry===
: <math> X \perp\!\!\!\perp Y \mid Z \quad \Leftrightarrow \quad Y \perp\!\!\!\perp X \mid Z </math>

'''Proof:''' From the definition of conditional independence,
: <math> X \perp\!\!\!\perp Y \mid Z \quad \Leftrightarrow \quad P(X, Y \mid Z) = P(X \mid Z) P(Y \mid Z) \quad \Leftrightarrow \quad Y \perp\!\!\!\perp X \mid Z </math>

===Decomposition===
: <math> X \perp\!\!\!\perp Y \mid Z \quad \Rightarrow \quad h(X) \perp\!\!\!\perp Y \mid Z </math>

'''Proof:''' From the definition of conditional independence, we seek to show that
: <math> X \perp\!\!\!\perp Y \mid Z \quad \Rightarrow \quad P(h(X), Y \mid Z) = P(h(X) \mid Z) P(Y \mid Z) </math>.

The left side of this equality is
: <math> P(h(X)=a, Y=y \mid Z=z) = \sum_{x \colon h(x)=a} P(X=x, Y=y \mid Z=z) </math>,
where the sum on the right runs over all values <math>x</math> of <math>X</math> such that <math>h(x)=a</math>. Applying the assumed conditional independence to each term of the sum,
: <math> \begin{align} \sum_{x \colon h(x)=a} P(X=x, Y=y \mid Z=z) &= \sum_{x \colon h(x)=a} P(X=x \mid Z=z) P(Y=y \mid Z=z) \\ &= P(Y=y \mid Z=z) \sum_{x \colon h(x)=a} P(X=x \mid Z=z) \\ &= P(Y=y \mid Z=z) P(h(X)=a \mid Z=z) \end{align} </math>,
which is the right side of the required equality.
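The symmetry and decomposition rules can be checked numerically on a small discrete example. The following is a minimal sketch, not part of the cited sources: the helper names <code>marginal</code>, <code>cond_indep</code> and <code>apply_h</code>, the probability tables, and the choice <math>h(x) = x \bmod 2</math> are all assumptions made for illustration. Exact fractions are used so that the equality test is not affected by rounding.

```python
from fractions import Fraction as F
from itertools import product


def marginal(joint, idx):
    """Sum the joint table down to the coordinates listed in idx."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in idx)
        out[key] = out.get(key, 0) + p
    return out


def cond_indep(joint, xs, ys, zs=()):
    """Test xs _||_ ys | zs via P(x,y,z) P(z) == P(x,z) P(y,z) for all values."""
    p_xyz = marginal(joint, xs + ys + zs)
    p_xz = marginal(joint, xs + zs)
    p_yz = marginal(joint, ys + zs)
    p_z = marginal(joint, zs)
    return all(
        p_xyz.get(x + y + z, 0) * p_z[z] == p_xz.get(x + z, 0) * p_yz.get(y + z, 0)
        for x, y, z in product(marginal(joint, xs), marginal(joint, ys), p_z)
    )


def apply_h(joint, h):
    """Replace the first coordinate x by h(x), summing over x with equal h(x)."""
    out = {}
    for (x, y, z), p in joint.items():
        key = (h(x), y, z)
        out[key] = out.get(key, 0) + p
    return out


# Coordinate positions of the variables in each outcome tuple (x, y, z).
X, Y, Z = (0,), (1,), (2,)

# Illustrative tables: the joint is built as P(z) P(x|z) P(y|z),
# so X _||_ Y | Z holds by construction.
p_z = [F(1, 3), F(2, 3)]
p_x_given_z = [[F(1, 10), F(2, 10), F(3, 10), F(4, 10)],
               [F(4, 10), F(3, 10), F(2, 10), F(1, 10)]]
p_y_given_z = [[F(1, 4), F(3, 4)], [F(2, 3), F(1, 3)]]
joint = {
    (x, y, z): p_z[z] * p_x_given_z[z][x] * p_y_given_z[z][y]
    for x, y, z in product(range(4), range(2), range(2))
}


def h(x):
    """Hypothetical coarsening of X: keep only its parity."""
    return x % 2


coarse = apply_h(joint, h)

print(cond_indep(joint, X, Y, Z))   # premise: X _||_ Y | Z
print(cond_indep(joint, Y, X, Z))   # symmetry: Y _||_ X | Z
print(cond_indep(coarse, X, Y, Z))  # decomposition: h(X) _||_ Y | Z
```

All three checks print <code>True</code> for these tables. The test multiplies the defining equation through by <math>P(z)^2</math>, which avoids dividing by zero when some value of <math>Z</math> has probability zero.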
Special cases of this property include
* <math> (X, W) \perp\!\!\!\perp Y \mid Z \quad \Rightarrow \quad X \perp\!\!\!\perp Y \mid Z </math>
** '''Proof:''' Define <math> A = (X,W) </math> and let <math> h(\cdot) </math> be the 'extraction' function <math> h(X,W) = X</math>. Then:
: <math> \begin{align} (X,W) \perp\!\!\!\perp Y \mid Z \quad &\Leftrightarrow \quad A \perp\!\!\!\perp Y \mid Z \\ &\Rightarrow \quad h(A) \perp\!\!\!\perp Y \mid Z \quad &\text{Decomposition} \\ &\Leftrightarrow \quad X \perp\!\!\!\perp Y \mid Z \end{align} </math>
* <math> X \perp\!\!\!\perp (Y, W) \mid Z \quad \Rightarrow \quad X \perp\!\!\!\perp Y \mid Z </math>
** '''Proof:''' Define <math> V = (Y,W) </math> and let <math> h(\cdot) </math> again be an 'extraction' function, <math> h(Y,W) = Y</math>. Then:
: <math> \begin{align} X \perp\!\!\!\perp (Y,W) \mid Z \quad &\Leftrightarrow \quad X \perp\!\!\!\perp V \mid Z \\ &\Leftrightarrow \quad V \perp\!\!\!\perp X \mid Z \quad &\text{Symmetry} \\ &\Rightarrow \quad h(V) \perp\!\!\!\perp X \mid Z \quad &\text{Decomposition} \\ &\Leftrightarrow \quad Y \perp\!\!\!\perp X \mid Z \\ &\Leftrightarrow \quad X \perp\!\!\!\perp Y \mid Z \quad &\text{Symmetry} \end{align} </math>

===Weak union===
: <math> X \perp\!\!\!\perp Y \mid Z \quad \Rightarrow \quad X \perp\!\!\!\perp Y \mid (Z, h(X)) </math>

'''Proof:''' Given <math> X \perp\!\!\!\perp Y \mid Z </math>, we aim to show
: <math> \begin{align} X \perp\!\!\!\perp Y \mid (Z, h(X)) \quad &\Leftrightarrow \quad X \perp\!\!\!\perp Y \mid U \quad &\text{where} \quad U = (Z, h(X)) \\ &\Leftrightarrow \quad Y \perp\!\!\!\perp X \mid U \quad &\text{Symmetry} \\ &\Leftrightarrow \quad P(Y\mid X, U) = P(Y\mid U) \\ &\Leftrightarrow \quad P(Y \mid X, Z, h(X)) = P(Y \mid Z, h(X)) \end{align} </math>.

We begin with the left side of the equation. Because <math>h(X)</math> is a function of <math>X</math>, conditioning on it in addition to <math>X</math> changes nothing:
: <math> \begin{align} P(Y \mid X, Z, h(X)) &= P(Y \mid X, Z) \\ &= P(Y \mid Z) &\text{Since by symmetry } Y \perp\!\!\!\perp X \mid Z \end{align} </math>.
From the given condition
: <math> \begin{align} X \perp\!\!\!\perp Y \mid Z \quad &\Rightarrow \quad h(X) \perp\!\!\!\perp Y \mid Z \quad &\text{Decomposition} \\ &\Leftrightarrow \quad Y \perp\!\!\!\perp h(X) \mid Z \quad &\text{Symmetry} \\ &\Rightarrow \quad P(Y \mid Z, h(X)) = P(Y \mid Z) \end{align} </math>.

Both <math> P(Y \mid X, Z, h(X)) </math> and <math> P(Y \mid Z, h(X)) </math> therefore equal <math> P(Y \mid Z) </math>. Thus <math> P(Y \mid X, Z, h(X)) = P(Y \mid Z, h(X)) </math>, so we have shown that <math> X \perp\!\!\!\perp Y \mid (Z, h(X)) </math>.

'''Special cases:''' Some textbooks present the property as
* <math> X \perp\!\!\!\perp (Y, W) \mid Z \quad \Rightarrow \quad X \perp\!\!\!\perp Y \mid (Z, W) </math>.<ref name="Koller">{{cite book |last1=Koller |first1=Daphne |last2=Friedman |first2=Nir |title=Probabilistic Graphical Models |date=2009 |publisher=The MIT Press |location=Cambridge, MA |isbn=9780262013192}}</ref>
* <math> (X,W) \perp\!\!\!\perp Y \mid Z \quad \Rightarrow \quad X \perp\!\!\!\perp Y \mid (Z,W) </math>.
Both versions can be shown to follow from the weak union property given initially, by the same method as in the decomposition section above.

===Contraction===
: <math> \left.\begin{align} X \perp\!\!\!\perp A \mid B \\ X \perp\!\!\!\perp B \end{align}\right\} \quad \Rightarrow \quad X \perp\!\!\!\perp (A,B) </math>

'''Proof:''' This property can be proved by noticing that <math>P(X\mid A,B) = P(X\mid B) = P(X)</math>, where the first equality is asserted by <math>X \perp\!\!\!\perp A \mid B</math> and the second by <math>X \perp\!\!\!\perp B</math>.
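The contraction rule can be illustrated numerically in the same spirit. This is a sketch under stated assumptions: the helpers <code>marginal</code> and <code>cond_indep</code> and all probability tables are hypothetical and chosen only for illustration. The first table satisfies both premises. The second keeps <math>X \perp\!\!\!\perp A \mid B</math> but lets <math>X</math> depend on <math>B</math>, so the second premise fails and the conclusion fails with it.

```python
from fractions import Fraction as F
from itertools import product


def marginal(joint, idx):
    """Sum the joint table down to the coordinates listed in idx."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in idx)
        out[key] = out.get(key, 0) + p
    return out


def cond_indep(joint, xs, ys, zs=()):
    """Test xs _||_ ys | zs via P(x,y,z) P(z) == P(x,z) P(y,z) for all values."""
    p_xyz = marginal(joint, xs + ys + zs)
    p_xz = marginal(joint, xs + zs)
    p_yz = marginal(joint, ys + zs)
    p_z = marginal(joint, zs)
    return all(
        p_xyz.get(x + y + z, 0) * p_z[z] == p_xz.get(x + z, 0) * p_yz.get(y + z, 0)
        for x, y, z in product(marginal(joint, xs), marginal(joint, ys), p_z)
    )


# Coordinate positions of the variables in each outcome tuple (x, a, b).
X, A, B = (0,), (1,), (2,)

# Illustrative tables (assumptions, not taken from any source).
p_x = [F(1, 4), F(3, 4)]
p_b = [F(2, 5), F(3, 5)]
p_a_given_b = [[F(1, 2), F(1, 2)], [F(1, 10), F(9, 10)]]
p_x_given_b = [[F(1, 4), F(3, 4)], [F(3, 4), F(1, 4)]]

# Both premises hold: P(x, a, b) = P(x) P(b) P(a|b).
both = {
    (x, a, b): p_x[x] * p_b[b] * p_a_given_b[b][a]
    for x, a, b in product(range(2), repeat=3)
}

# Only the first premise holds: P(x, a, b) = P(b) P(x|b) P(a|b).
first_only = {
    (x, a, b): p_b[b] * p_x_given_b[b][x] * p_a_given_b[b][a]
    for x, a, b in product(range(2), repeat=3)
}

for name, joint in [("both", both), ("first_only", first_only)]:
    print(
        name,
        cond_indep(joint, X, A, B),   # X _||_ A | B
        cond_indep(joint, X, B),      # X _||_ B
        cond_indep(joint, X, A + B),  # X _||_ (A, B)
    )
```

For these tables the loop prints <code>both True True True</code> and <code>first_only True False False</code>. The second row is consistent with decomposition, since <math>X \perp\!\!\!\perp (A,B)</math> would imply <math>X \perp\!\!\!\perp B</math>.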
===Intersection===
For strictly positive probability distributions,<ref name=pearl:2000 /> the following also holds:
: <math> \left.\begin{align} X \perp\!\!\!\perp Y \mid Z, W\\ X \perp\!\!\!\perp W \mid Z, Y \end{align}\right\} \quad \Rightarrow \quad X \perp\!\!\!\perp (W, Y) \mid Z </math>

'''Proof:''' By assumption:
: <math>P(X|Z, W, Y) = P(X|Z, W) \land P(X|Z, W, Y) = P(X|Z, Y) \implies P(X|Z, Y) = P(X|Z, W)</math>

Using this equality, together with the [[law of total probability]] applied to <math>P(X|Z)</math>:
: <math>\begin{align} P(X|Z) &= \sum_{w \in W} P(X|Z, W=w)P(W=w|Z) \\[4pt] &= \sum_{w \in W} P(X|Z, Y)P(W=w|Z) \\[4pt] &= P(X|Z, Y) \sum_{w \in W} P(W=w|Z) \\[4pt] &= P(X|Z, Y) \end{align}</math>

Since <math>P(X|Z, W, Y) = P(X|Z, Y)</math> and <math>P(X|Z, Y) = P(X|Z)</math>, it follows that <math>P(X|Z, W, Y) = P(X|Z) \iff X \perp\!\!\!\perp (W, Y) | Z</math>.

Technical note: since these implications hold for any probability space, they still hold if one considers a sub-universe by conditioning everything on another variable, say ''K''. For example, <math>X \perp\!\!\!\perp Y \Rightarrow Y \perp\!\!\!\perp X</math> also means that <math>X \perp\!\!\!\perp Y \mid K \Rightarrow Y \perp\!\!\!\perp X \mid K</math>.
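The role of strict positivity in the intersection rule can be seen in a small numerical example. The sketch below is illustrative only: the helpers <code>marginal</code> and <code>cond_indep</code> and both tables are assumptions, and <math>Z</math> is taken to be trivial (an empty conditioning set) to keep the tables small. In the first table <math>X</math>, <math>Y</math> and <math>W</math> are copies of one fair coin, so some outcomes have probability zero. The second table is strictly positive and is constructed so that both premises hold; it shows consistency with the rule rather than proving it.

```python
from fractions import Fraction as F
from itertools import product


def marginal(joint, idx):
    """Sum the joint table down to the coordinates listed in idx."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in idx)
        out[key] = out.get(key, 0) + p
    return out


def cond_indep(joint, xs, ys, zs=()):
    """Test xs _||_ ys | zs via P(x,y,z) P(z) == P(x,z) P(y,z) for all values."""
    p_xyz = marginal(joint, xs + ys + zs)
    p_xz = marginal(joint, xs + zs)
    p_yz = marginal(joint, ys + zs)
    p_z = marginal(joint, zs)
    return all(
        p_xyz.get(x + y + z, 0) * p_z[z] == p_xz.get(x + z, 0) * p_yz.get(y + z, 0)
        for x, y, z in product(marginal(joint, xs), marginal(joint, ys), p_z)
    )


# Coordinate positions of the variables in each outcome tuple (x, y, w).
X, Y, W = (0,), (1,), (2,)

# Not strictly positive: X = Y = W, a single fair coin copied three times.
copies = {(0, 0, 0): F(1, 2), (1, 1, 1): F(1, 2)}

# Strictly positive: X independent of a correlated pair (Y, W).
p_x = [F(1, 3), F(2, 3)]
p_yw = {(0, 0): F(4, 10), (0, 1): F(1, 10), (1, 0): F(1, 10), (1, 1): F(4, 10)}
positive = {
    (x, y, w): p_x[x] * p_yw[(y, w)]
    for x, y, w in product(range(2), repeat=3)
}

for name, joint in [("copies", copies), ("positive", positive)]:
    premises = cond_indep(joint, X, Y, W) and cond_indep(joint, X, W, Y)
    conclusion = cond_indep(joint, X, Y + W)  # X _||_ (Y, W)
    print(name, premises, conclusion)
```

For these tables the loop prints <code>copies True False</code> and <code>positive True True</code>. In the first case both premises hold trivially, because fixing <math>W</math> already determines <math>Y</math> and vice versa, yet <math>X</math> is clearly dependent on the pair <math>(Y, W)</math>.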