==Examples==

===Example 1: Disease===

====High-prevalence population====
{| class="wikitable floatright" style="text-align:right;"
! Number<br />of people !! Infected !! Uninfected !! Total
|-
! Test<br />positive
| 400<br />(true positive) || 30<br />(false positive)
| 430
|-
! Test<br />negative
| 0<br />(false negative) || 570<br />(true negative)
| 570
|-
! Total
| 400 || 600
! 1000
|}
Imagine running an infectious disease test on a population ''A'' of 1,000 persons, of which 40% are infected. The test has a false positive rate of 5% (0.05) and a false negative rate of zero. The [[expected value|expected outcome]] of the 1,000 tests on population ''A'' would be:

{{block indent|Infected and test indicates disease ([[true positive]])
{{block indent|1=1000 × {{sfrac|40|100}} = 400 people would receive a true positive}}}}
{{block indent|Uninfected and test indicates disease (false positive)
{{block indent|1=1000 × {{sfrac|100 – 40|100}} × 0.05 = 30 people would receive a false positive}}
The remaining 570 tests are correctly negative.}}

So, in population ''A'', a person receiving a positive test could be over 93% confident ({{sfrac|400|30 + 400}}) that it correctly indicates infection.

====Low-prevalence population====
{| class="wikitable floatright" style="text-align:right;"
! Number<br />of people !! Infected !! Uninfected !! Total
|-
! Test<br />positive
| 20<br />(true positive) || 49<br />(false positive)
| 69
|-
! Test<br />negative
| 0<br />(false negative) || 931<br />(true negative)
| 931
|-
! Total
| 20 || 980
! 1000
|}
Now consider the same test applied to population ''B'', of which only 2% are infected. The expected outcome of 1,000 tests on population ''B'' would be:

{{block indent|Infected and test indicates disease (true positive)
{{block indent|1=1000 × {{sfrac|2|100}} = 20 people would receive a true positive}}}}
{{block indent|Uninfected and test indicates disease (false positive)
{{block indent|1=1000 × {{sfrac|100 – 2|100}} × 0.05 = 49 people would receive a false positive}}
The remaining 931 tests are correctly negative.}}

In population ''B'', only 20 of the 69 total people with a positive test result are actually infected. So, the probability of actually being infected after one is told that one is infected is only 29% ({{sfrac|20|20 + 49}}) for a test that otherwise appears to be "95% accurate".

A tester with experience of group ''A'' might find it a paradox that in group ''B'', a result that had usually correctly indicated infection is now usually a false positive. The confusion of the [[posterior probability]] of infection with the [[prior probability]] of receiving a false positive is a natural [[fallacy|error]] after receiving a health-threatening test result.
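The arithmetic above can be checked with a short calculation. The sketch below applies Bayes' theorem to both populations; the function and variable names are illustrative and not taken from any cited source.

<syntaxhighlight lang="python">
# Probability of infection given a positive test, computed from the
# base rate (prevalence), the false positive rate and the false
# negative rate. Names are illustrative.

def positive_predictive_value(prevalence, false_positive_rate, false_negative_rate=0.0):
    """Bayes' theorem: P(infected | positive test)."""
    true_positive = prevalence * (1.0 - false_negative_rate)
    false_positive = (1.0 - prevalence) * false_positive_rate
    return true_positive / (true_positive + false_positive)

# Population A: 40% infected -> about 0.93
print(positive_predictive_value(prevalence=0.40, false_positive_rate=0.05))
# Population B: 2% infected -> about 0.29
print(positive_predictive_value(prevalence=0.02, false_positive_rate=0.05))
</syntaxhighlight>

The same test, with the same error rates, gives very different posterior probabilities solely because the prevalence (base rate) differs between the two populations.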
===Example 2: Drunk drivers===
Imagine that a group of police officers have [[breathalyzer]]s displaying false drunkenness in 5% of the cases in which the driver is sober. However, the breathalyzers never fail to detect a truly drunk person. One in a thousand drivers is driving drunk. Suppose the police officers then stop a driver at random to administer a breathalyzer test. It indicates that the driver is drunk. No other information is known about them. Many would estimate the probability that the driver is drunk as high as 95%, but the correct probability is about 2%.

An explanation for this is as follows: on average, for every 1,000 drivers tested,
* 1 driver is drunk, and it is 100% certain that for that driver there is a ''true'' positive test result, so there is 1 ''true'' positive test result
* 999 drivers are not drunk, and among those drivers there are 5% ''false'' positive test results, so there are 49.95 ''false'' positive test results

Therefore, the probability that any given driver among the 1 + 49.95 = 50.95 positive test results really is drunk is <math>1/50.95 \approx 0.019627</math>.

The validity of this result does, however, hinge on the validity of the initial assumption that the police officer stopped the driver truly at random, and not because of bad driving. If that or another non-arbitrary reason for stopping the driver was present, then the calculation also involves the probability of a drunk driver driving competently and a non-drunk driver driving (in-)competently.

More formally, the same probability of roughly 0.02 can be established using [[Bayes' theorem]]. The goal is to find the probability that the driver is drunk given that the breathalyzer indicated they are drunk, which can be represented as
<math display="block">p(\mathrm{drunk}\mid D)</math>
where ''D'' means that the breathalyzer indicates that the driver is drunk. Using Bayes' theorem,
<math display="block">p(\mathrm{drunk}\mid D) = \frac{p(D \mid \mathrm{drunk})\, p(\mathrm{drunk})}{p(D)}.</math>
The following information is known in this scenario:
<math display="block">\begin{align} p(\mathrm{drunk}) &= 0.001,\\ p(\mathrm{sober}) &= 0.999,\\ p(D\mid\mathrm{drunk}) &= 1.00,\\ p(D\mid\mathrm{sober}) &= 0.05. \end{align}</math>
As can be seen from the formula, one needs ''p''(''D'') for Bayes' theorem, which can be computed from the preceding values using the [[law of total probability]]:
<math display="block">p(D) = p(D \mid \mathrm{drunk})\,p(\mathrm{drunk})+p(D\mid\mathrm{sober})\,p(\mathrm{sober}),</math>
which gives
<math display="block">p(D)= (1.00 \times 0.001) + (0.05 \times 0.999) = 0.05095.</math>
Plugging these numbers into Bayes' theorem, one finds that
<math display="block">p(\mathrm{drunk}\mid D) = \frac{1.00 \times 0.001}{0.05095} \approx 0.019627,</math>
which is the precision of the test.
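The same calculation can be written out directly; the sketch below uses the probabilities stated in the text, with illustrative variable names.

<syntaxhighlight lang="python">
# Bayes' theorem for the breathalyzer example above.

p_drunk = 0.001            # prior: one in a thousand drivers is drunk
p_sober = 1.0 - p_drunk
p_pos_given_drunk = 1.00   # the breathalyzer never misses a drunk driver
p_pos_given_sober = 0.05   # 5% false positive rate on sober drivers

# Law of total probability: overall chance of a positive reading
p_pos = p_pos_given_drunk * p_drunk + p_pos_given_sober * p_sober

# Posterior probability of being drunk given a positive reading
p_drunk_given_pos = p_pos_given_drunk * p_drunk / p_pos
print(p_pos)              # 0.05095
print(p_drunk_given_pos)  # about 0.0196, i.e. roughly 2%
</syntaxhighlight>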
===Example 3: Terrorist identification===
In a city of 1 million inhabitants, let there be 100 terrorists and 999,900 non-terrorists. To simplify the example, it is assumed that all people present in the city are inhabitants. Thus, the base rate probability of a randomly selected inhabitant of the city being a terrorist is 0.0001, and the base rate probability of that same inhabitant being a non-terrorist is 0.9999. In an attempt to catch the terrorists, the city installs an alarm system with a surveillance camera and automatic [[facial recognition software]].

The software has two failure rates of 1%:
* The false negative rate: If the camera scans a terrorist, a bell will ring 99% of the time, and it will fail to ring 1% of the time.
* The false positive rate: If the camera scans a non-terrorist, a bell will not ring 99% of the time, but it will ring 1% of the time.

Suppose now that an inhabitant triggers the alarm. Someone making the base rate fallacy would infer that there is a 99% probability that the detected person is a terrorist. Although the inference seems to make sense, it is actually bad reasoning, and a calculation below will show that the probability of a terrorist is actually near 1%, not near 99%.

The fallacy arises from confusing the natures of two different failure rates. The 'number of non-bells per 100 terrorists' (P(¬B | T), or the probability that the bell fails to ring given the inhabitant is a terrorist) and the 'number of non-terrorists per 100 bells' (P(¬T | B), or the probability that the inhabitant is a non-terrorist given the bell rings) are unrelated quantities; one is not necessarily equal to, or even close to, the other. To show this, consider what happens if an identical alarm system were set up in a second city with no terrorists at all. As in the first city, the alarm sounds for 1 out of every 100 non-terrorist inhabitants detected, but unlike in the first city, the alarm never sounds for a terrorist. Therefore, 100% of all occasions of the alarm sounding are for non-terrorists, but a false negative rate cannot even be calculated. The 'number of non-terrorists per 100 bells' in that city is 100, yet P(T | B) = 0%. There is zero chance that a terrorist has been detected given the ringing of the bell.

Imagine that the first city's entire population of one million people pass in front of the camera. About 99 of the 100 terrorists will trigger the alarm, and so will about 9,999 of the 999,900 non-terrorists. Therefore, about 10,098 people will trigger the alarm, among whom about 99 will be terrorists. The probability that a person triggering the alarm actually is a terrorist is only about 99 in 10,098, which is less than 1% and very far below the initial guess of 99%. The base rate fallacy is so misleading in this example because there are many more non-terrorists than terrorists, and the number of false positives (non-terrorists scanned as terrorists) is so much larger than the true positives (terrorists scanned as terrorists).
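The screening of the whole city can be tallied with a few lines; the sketch below reproduces the numbers above, with illustrative variable names.

<syntaxhighlight lang="python">
# Expected outcome of scanning all 1,000,000 inhabitants with the alarm
# described above (1% false negative rate, 1% false positive rate).

population = 1_000_000
terrorists = 100
non_terrorists = population - terrorists

true_positives = terrorists * 0.99        # about 99 terrorists ring the bell
false_positives = non_terrorists * 0.01   # about 9,999 non-terrorists ring the bell

alarms = true_positives + false_positives
print(alarms)                              # 10098.0
print(true_positives / alarms)             # about 0.0098, i.e. just under 1%
</syntaxhighlight>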
Multiple practitioners have argued that, because the base rate of terrorism is extremely low, using [[data mining]] and predictive algorithms to identify terrorists cannot feasibly work due to the false positive paradox.<ref name=":0">{{Cite journal |last=Munk |first=Timme Bisgaard |date=1 September 2017 |title=100,000 false positives for every real terrorist: Why anti-terror algorithms don't work |url=https://firstmonday.org/ojs/index.php/fm/article/view/7126 |journal=First Monday |volume=22 |issue=9 |doi=10.5210/fm.v22i9.7126 |doi-access=free}}</ref><ref name=":1">{{Cite magazine |last=Schneier |first=Bruce |author-link=Bruce Schneier |title=Why Data Mining Won't Stop Terror |url=https://www.wired.com/2006/03/why-data-mining-wont-stop-terror-2/ |access-date=2022-08-30 |magazine=Wired |language=en-US |issn=1059-1028}}</ref><ref name=":2">{{Cite web |last1=Jonas |first1=Jeff |last2=Harper |first2=Jim |date=2006-12-11 |title=Effective Counterterrorism and the Limited Role of Predictive Data Mining |url=https://www.cato.org/policy-analysis/effective-counterterrorism-limited-role-predictive-data-mining# |access-date=2022-08-30 |website=[[Cato Institute]]}}</ref><ref name=":3">{{Cite journal |last=Sageman |first=Marc |author-link=Marc Sageman |date=2021-02-17 |title=The Implication of Terrorism's Extremely Low Base Rate |url=https://doi.org/10.1080/09546553.2021.1880226 |journal=Terrorism and Political Violence |volume=33 |issue=2 |pages=302–311 |doi=10.1080/09546553.2021.1880226 |issn=0954-6553 |s2cid=232341781}}</ref> Estimates of the number of false positives for each accurate result vary from over ten thousand<ref name=":3" /> to one billion;<ref name=":1" /> consequently, investigating each lead would be cost- and time-prohibitive.<ref name=":0" /><ref name=":2" /> The level of accuracy required to make these models viable is likely unachievable. First, the low base rate of terrorism means there is a lack of data with which to make an accurate algorithm.<ref name=":2" /> Further, in the context of detecting terrorism, false negatives are highly undesirable and thus must be minimised as much as possible; however, this requires [[Sensitivity and specificity|increasing sensitivity at the cost of specificity]], increasing false positives.<ref name=":3" /> It is also questionable whether the use of such models by law enforcement would meet the requisite [[Burden of proof (law)|burden of proof]], given that over 99% of results would be false positives.<ref name=":3" />
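The scale of the problem can be illustrated with a rough calculation. The figures in the sketch below (population screened, number of true targets, error rates) are stated assumptions chosen for illustration only; they are not estimates from the cited sources.

<syntaxhighlight lang="python">
# Illustration of how a very low base rate inflates the number of false
# positives per true positive. All figures are illustrative assumptions.

population = 330_000_000     # assumed population under surveillance
actual_positives = 1_000     # assumed number of true targets
sensitivity = 0.99           # assumed true positive rate
false_positive_rate = 0.001  # assumed: flags 0.1% of innocent people

true_positives = actual_positives * sensitivity
false_positives = (population - actual_positives) * false_positive_rate

# Hundreds of false leads for every real one, even with these optimistic rates
print(false_positives / true_positives)
</syntaxhighlight>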
===Example 4: Biological testing of a suspect===
A crime is committed. Forensic analysis determines that the perpetrator has a certain blood type shared by 10% of the population. A suspect is arrested and found to have that same blood type. A prosecutor might charge the suspect with the crime on that basis alone, and claim at trial that the probability that the defendant is guilty is 90%. However, this conclusion is only close to correct if the defendant was selected as the main suspect based on robust evidence discovered prior to the blood test and unrelated to it. Otherwise, the reasoning presented is flawed, as it overlooks the high [[prior probability]] (that is, prior to the blood test) that he is a random innocent person.

Assume, for instance, that 1,000 people live in the town where the crime occurred. This means that 100 people live there who have the perpetrator's blood type, of whom only one is the true perpetrator; therefore, the true probability that the defendant is guilty – based only on the fact that his blood type matches that of the perpetrator – is only 1%, far less than the 90% argued by the prosecutor.

The prosecutor's fallacy involves assuming that the prior probability of a random match is equal to the probability that the defendant is innocent. When using it, a prosecutor questioning an expert witness may ask: "The odds of finding this evidence on an innocent man are so small that the jury can safely disregard the possibility that this defendant is innocent, correct?"<ref>{{Cite journal |last1=Fenton |first1=Norman |last2=Neil |first2=Martin |last3=Berger |first3=Daniel |date=June 2016 |title=Bayes and the Law |journal=[[Annual Review of Statistics and Its Application]] |volume=3 |issue=1 |pages=51–77 |bibcode=2016AnRSA...3...51F |doi=10.1146/annurev-statistics-041715-033428 |pmc=4934658 |pmid=27398389}}</ref> The claim assumes that the probability that evidence is found on an innocent man is the same as the probability that a man is innocent given that evidence was found on him, which is not true. Whilst the former is usually small (10% in the previous example) due to good [[forensic evidence]] procedures, the latter (99% in that example) does not directly relate to it and will often be much higher, since it depends on the likely quite high [[prior odds]] of the defendant being a random innocent person.
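The small-town calculation above can be spelled out numerically; the sketch below uses the figures given in the text, with illustrative variable names.

<syntaxhighlight lang="python">
# Probability that the suspect is guilty given only the blood-type match,
# for the town of 1,000 people described above.

town_population = 1_000
match_probability = 0.10                                          # 10% share the blood type
people_with_matching_type = town_population * match_probability   # 100 people

true_perpetrators = 1
p_guilty_given_match = true_perpetrators / people_with_matching_type
print(p_guilty_given_match)  # 0.01, i.e. 1%, not the claimed 90%
</syntaxhighlight>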