==Forms==

===Events===

====Simple form====
For events ''A'' and ''B'', provided that ''P''(''B'') ≠ 0,

:<math>P(A| B) = \frac{P(B | A) P(A)}{P(B)} . </math>

In many applications, for instance in [[Bayesian inference]], the event ''B'' is fixed in the discussion and we wish to consider the effect of its having been observed on our belief in various possible events ''A''. In such situations the denominator of the last expression, the probability of the given evidence ''B'', is fixed; what we want to vary is ''A''. Bayes' theorem shows that the posterior probabilities are [[proportionality (mathematics)|proportional]] to the numerator, so the last equation becomes:

:<math>P(A| B) \propto P(A) \cdot P(B| A) .</math>

In words, the posterior is proportional to the prior times the likelihood. This version of Bayes' theorem is known as Bayes' rule.<ref>{{Cite book |last=Lee |first=Peter M. |title=Bayesian Statistics |chapter-url=http://www-users.york.ac.uk/~pml1/bayes/book.htm |publisher=[[John Wiley & Sons|Wiley]] |year=2012 |isbn=978-1-1183-3257-3 |chapter=Chapter 1}}</ref>

If events ''A''<sub>1</sub>, ''A''<sub>2</sub>, ..., are mutually exclusive and exhaustive, i.e., one of them is certain to occur but no two can occur together, we can determine the proportionality constant by using the fact that their probabilities must add up to one. For instance, for a given event ''A'', the event ''A'' itself and its complement ¬''A'' are exclusive and exhaustive. Denoting the constant of proportionality by ''c'', we have:

:<math>P(A| B) = c \cdot P(A) \cdot P(B| A) \text{ and } P(\neg A| B) = c \cdot P(\neg A) \cdot P(B| \neg A). </math>

Adding these two formulas we deduce that:

:<math> 1 = c \cdot (P(B| A)\cdot P(A) + P(B| \neg A) \cdot P(\neg A)),</math>

or

:<math> c = \frac{1}{P(B| A)\cdot P(A) + P(B| \neg A) \cdot P(\neg A)} = \frac 1 {P(B)}. </math>

====Alternative form====
{| class="wikitable floatright"
|+ [[Contingency table]]
! {{diagonal split header|<br />Proposition|Background}} !! {{mvar|B}} !! {{tmath|\lnot B}}<br />(not {{mvar|B}}) !! Total
|-
! {{mvar|A}}
| <math>P(B|A)\cdot P(A)</math><br /><math>= P(A|B)\cdot P(B)</math>
| <math>P(\neg B|A)\cdot P(A)</math><br /><math>= P(A|\neg B)\cdot P(\neg B)</math>
| style="text-align:center;" | {{tmath|P(A)}}
|-
! {{tmath|\neg A}}<br />(not {{mvar|A}})
| <math>P(B|\neg A)\cdot P(\neg A)</math><br /><math>= P(\neg A|B)\cdot P(B)</math>
| <math>P(\neg B|\neg A)\cdot P(\neg A)</math><br /><math>= P(\neg A|\neg B)\cdot P(\neg B)</math>
| <math>P(\neg A) = 1-P(A)</math>
|-
! Total
| style="text-align:center;" | {{tmath|P(B)}}
| style="text-align:center;" | <math>P(\neg B) = 1-P(B)</math>
| style="text-align:center;" | 1
|}
Another form of Bayes' theorem for two competing statements or hypotheses is:

:<math>P(A| B) = \frac{P(B| A) P(A)}{ P(B| A) P(A) + P(B| \neg A) P(\neg A)}.</math>

For an epistemological interpretation, for proposition ''A'' and evidence or background ''B'',<ref>{{cite web|title=Bayes' Theorem: Introduction|url=http://www.trinity.edu/cbrown/bayesweb/|website=Trinity University|url-status=dead|archive-url=https://web.archive.org/web/20040821012342/http://www.trinity.edu/cbrown/bayesweb/|archive-date=21 August 2004|access-date=5 August 2014}}</ref>

* <math>P(A)</math> is the [[prior probability]], the initial degree of belief in ''A''.
* <math>P(\neg A)</math> is the corresponding initial degree of belief in ''not-A'', that ''A'' is false, where <math> P(\neg A) = 1-P(A).</math>
* <math>P(B| A)</math> is the [[conditional probability]] or likelihood, the degree of belief in ''B'' given that ''A'' is true.
* <math>P(B|\neg A)</math> is the [[conditional probability]] or likelihood, the degree of belief in ''B'' given that ''A'' is false.
* <math>P(A| B)</math> is the [[posterior probability]], the probability of ''A'' after taking into account ''B''.
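The two-hypothesis form above is mechanical to evaluate. The following minimal Python sketch (not drawn from the article's sources; the function name and example numbers are purely illustrative) computes the posterior from a prior and the two likelihoods:

<syntaxhighlight lang="python">
def posterior(prior_a, likelihood_b_given_a, likelihood_b_given_not_a):
    """Compute P(A|B) for the two competing hypotheses A and not-A."""
    # Numerator of Bayes' theorem: P(B|A) * P(A)
    numerator = likelihood_b_given_a * prior_a
    # Denominator: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
    evidence = numerator + likelihood_b_given_not_a * (1.0 - prior_a)
    return numerator / evidence

# Illustrative values: P(A) = 0.5, P(B|A) = 0.9, P(B|not A) = 0.2
print(posterior(0.5, 0.9, 0.2))  # 0.8181..., i.e. P(A|B) = 9/11
</syntaxhighlight>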
====Extended form====
Often, for some [[partition of a set|partition]] {''A<sub>j</sub>''} of the [[sample space]], the [[Sample space|event space]] is given in terms of ''P''(''A<sub>j</sub>'') and ''P''(''B'' | ''A<sub>j</sub>''). It is then useful to compute ''P''(''B'') using the [[law of total probability]]:

:<math>P(B) = \sum_{j} P(B \cap A_j),</math>

or (using the multiplication rule for conditional probability),<ref>{{Cite web |title=Bayes Theorem - Formula, Statement, Proof {{!}} Bayes Rule |url=https://www.cuemath.com/data/bayes-theorem/ |access-date=2023-10-20 |website=Cuemath |language=en}}</ref>

:<math>P(B) = {\sum_j P(B| A_j) P(A_j)},</math>

:<math>\Rightarrow P(A_i| B) = \frac{P(B| A_i) P(A_i)}{\sum\limits_j P(B| A_j) P(A_j)}.</math>

In the special case where ''A'' is a [[binary variable]]:

:<math>P(A| B) = \frac{P(B| A) P(A)}{ P(B| A) P(A) + P(B| \neg A) P(\neg A)}.</math>
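As a small illustration of the extended form (the values and the helper name below are invented for this example, not taken from the article's sources), the denominator can be accumulated with the law of total probability over the partition:

<syntaxhighlight lang="python">
def posterior_over_partition(priors, likelihoods, i):
    """P(A_i|B) for a partition {A_j} with known P(A_j) and P(B|A_j)."""
    # Law of total probability: P(B) = sum_j P(B|A_j) * P(A_j)
    evidence = sum(l * p for l, p in zip(likelihoods, priors))
    return likelihoods[i] * priors[i] / evidence

# Three mutually exclusive and exhaustive events; priors sum to 1
priors = [0.5, 0.3, 0.2]        # P(A_1), P(A_2), P(A_3)
likelihoods = [0.1, 0.4, 0.7]   # P(B|A_1), P(B|A_2), P(B|A_3)
print(posterior_over_partition(priors, likelihoods, 1))  # 0.12/0.31 ≈ 0.387
</syntaxhighlight>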
===Random variables===
[[File:Bayes continuous diagram.svg|thumb|Bayes' theorem applied to an event space generated by continuous random variables ''X'' and ''Y'' with known probability distributions. There exists an instance of Bayes' theorem for each point in the [[Domain of a function|domain]]. In practice, these instances might be parametrized by writing the specified probability densities as a [[Function (mathematics)|function]] of ''x'' and ''y''.]]
Consider a [[sample space]] Ω generated by two [[random variables]] ''X'' and ''Y'' with known probability distributions. In principle, Bayes' theorem applies to the events ''A'' = {''X'' = ''x''} and ''B'' = {''Y'' = ''y''}:

:<math>P( X{=}x | Y {=} y) = \frac{P(Y{=}y | X{=}x) P(X{=}x)}{P(Y{=}y)}</math>

However, terms become 0 at points where either variable has finite [[probability density function|probability density]]. To remain useful, Bayes' theorem can be formulated in terms of the relevant densities (see [[#Derivation|Derivation]]).

====Simple form====
If ''X'' is continuous and ''Y'' is discrete,

:<math>f_{X | Y{=}y}(x) = \frac{P(Y{=}y| X{=}x) f_X(x)}{P(Y{=}y)},</math>

where each <math>f</math> is a density function.

If ''X'' is discrete and ''Y'' is continuous,

:<math> P(X{=}x| Y{=}y) = \frac{f_{Y | X{=}x}(y) P(X{=}x)}{f_Y(y)}.</math>

If both ''X'' and ''Y'' are continuous,

:<math> f_{X| Y{=}y}(x) = \frac{f_{Y | X{=}x}(y) f_X(x)}{f_Y(y)}.</math>

====Extended form====
[[File:Continuous event space specification.svg|thumb|A way to conceptualize event spaces generated by continuous random variables ''X'' and ''Y'']]
A continuous event space is often conceptualized in terms of the numerator terms. It is then useful to eliminate the denominator using the [[law of total probability]]. For ''f<sub>Y</sub>''(''y''), this becomes an integral:

:<math> f_Y(y) = \int_{-\infty}^\infty f_{Y| X = \xi}(y) f_X(\xi)\,d\xi .</math>

=== Bayes' rule in odds form ===
Bayes' theorem in [[odds|odds form]] is:{{cn|date=April 2025}}

:<math>O(A_1:A_2\vert B) = O(A_1:A_2) \cdot \Lambda(A_1:A_2\vert B), </math>

where

:<math>\Lambda(A_1:A_2\vert B) = \frac{P(B\vert A_1)}{P(B\vert A_2)}</math>

is called the [[Bayes factor]] or [[likelihood ratio]]. The odds between two events is simply the ratio of the probabilities of the two events. Thus:

:<math>O(A_1:A_2) = \frac{P(A_1)}{P(A_2)},</math>

:<math>O(A_1:A_2\vert B) = \frac{P(A_1\vert B)}{P(A_2\vert B)}.</math>

So the rule says that the posterior odds are the prior odds times the [[Bayes factor]]; in other words, the posterior is proportional to the prior times the likelihood.

In the special case that <math>A_1 = A</math> and <math>A_2 = \neg A</math>, one writes <math>O(A)=O(A:\neg A) =P(A)/(1-P(A))</math>, and uses a similar abbreviation for the Bayes factor and for the conditional odds. The odds on <math>A</math> is, by definition, the odds for <math>A</math> against <math>\neg A</math>. Bayes' rule can then be written in the abbreviated form

:<math>O(A\vert B) = O(A) \cdot \Lambda(A\vert B) ,</math>

or, in words, the posterior odds on <math>A</math> equals the prior odds on <math>A</math> times the likelihood ratio for <math>A</math> given information <math>B</math>. In short, '''posterior odds equals prior odds times likelihood ratio'''.

For example, if a medical test has a [[Sensitivity and specificity|sensitivity]] of 90% and a [[Sensitivity and specificity|specificity]] of 91%, then the positive Bayes factor is <math>\Lambda_+ = P(\text{True Positive})/P(\text{False Positive}) = 90\%/(100\%-91\%)=10</math>. Now, if the [[prevalence]] of this disease is 9.09%, and if we take that as the prior probability, then the prior odds is about 1:10. So after receiving a positive test result, the posterior odds of having the disease becomes 1:1, which means that the posterior probability of having the disease is 50%. If a second test is performed in serial testing, and that also turns out to be positive, then the posterior odds of having the disease becomes 10:1, which means a posterior probability of about 90.91%. The negative Bayes factor can be calculated to be 91%/(100%-90%) = 9.1, so if the second test turns out to be negative, then the posterior odds of having the disease is 1:9.1, which means a posterior probability of about 9.9%.

The example above can also be understood with more solid numbers: assume the patient taking the test is from a group of 1,000 people, 91 of whom have the disease (prevalence of 9.1%). If all 1,000 take the test, 82 of those with the disease will get a true positive result (sensitivity of 90.1%), 9 of those with the disease will get a false negative result ([[False positives and false negatives|false negative rate]] of 9.9%), 827 of those without the disease will get a true negative result (specificity of 91.0%), and 82 of those without the disease will get a false positive result (false positive rate of 9.0%). Before taking any test, the patient's odds for having the disease is 91:909. After receiving a positive result, the patient's odds for having the disease is

:<math>\frac{91}{909}\times\frac{90.1\%}{9.0\%}=\frac{91\times90.1\%}{909\times9.0\%}=1:1,</math>

which is consistent with the fact that there are 82 true positives and 82 false positives in the group of 1,000.
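The odds arithmetic in the medical-test example can be checked mechanically. The sketch below is an illustration only (the helper name is invented here, not part of any cited source); it reproduces the 1:1 odds after one positive test and the 10:1 odds after a second:

<syntaxhighlight lang="python">
def update_odds(prior_odds, bayes_factor):
    """Posterior odds = prior odds * Bayes factor (likelihood ratio)."""
    return prior_odds * bayes_factor

sensitivity, specificity = 0.90, 0.91
positive_bayes_factor = sensitivity / (1.0 - specificity)  # = 10

prior_odds = 0.0909 / (1.0 - 0.0909)  # prevalence 9.09% -> odds about 1:10
after_first = update_odds(prior_odds, positive_bayes_factor)    # ≈ 1, odds 1:1
after_second = update_odds(after_first, positive_bayes_factor)  # ≈ 10, odds 10:1

print(after_first / (1.0 + after_first))    # posterior probability ≈ 0.50
print(after_second / (1.0 + after_second))  # posterior probability ≈ 0.909
</syntaxhighlight>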