Base rate fallacy
==Findings in psychology==

In experiments, people have been found to prefer individuating information over general information when the former is available.<ref>{{cite journal |last=Bar-Hillel |first=Maya |author-link=Maya Bar-Hillel |year=1980 |title=The base-rate fallacy in probability judgments |url=http://ratio.huji.ac.il/sites/default/files/publications/dp732.pdf |journal=Acta Psychologica |volume=44 |issue=3 |pages=211–233 |doi=10.1016/0001-6918(80)90046-3}}</ref><ref name="kv1">{{cite journal |last=Kahneman |first=Daniel |author2=Amos Tversky |year=1973 |title=On the psychology of prediction |journal=Psychological Review |volume=80 |issue=4 |pages=237–251 |doi=10.1037/h0034747 |s2cid=17786757}}</ref><ref>{{cite journal |last1=Tversky |first1=Amos |last2=Kahneman |first2=Daniel |date=1974-09-27 |title=Judgment under uncertainty: Heuristics and biases |journal=Science |volume=185 |issue=4157 |pages=1124–1131 |bibcode=1974Sci...185.1124T |doi=10.1126/science.185.4157.1124 |pmid=17835457 |s2cid=143452957}}</ref> In some experiments, students were asked to estimate the [[grade point average]]s (GPAs) of hypothetical students. When given relevant statistics about GPA distribution, students tended to ignore them if given descriptive information about the particular student even if the new descriptive information was obviously of little or no relevance to school performance.<ref name="kv1" /> This finding has been used to argue that interviews are an unnecessary part of the [[college admission]]s process because interviewers are unable to pick successful candidates better than basic statistics.

Psychologists [[Daniel Kahneman]] and [[Amos Tversky]] attempted to explain this finding in terms of a [[heuristics in judgment and decision making|simple rule or "heuristic"]] called [[representativeness heuristic|representativeness]].
They argued that many judgments relating to likelihood, or to cause and effect, are based on how representative one thing is of another, or of a category.<ref name="kv1" /> Kahneman considers base rate neglect to be a specific form of [[extension neglect]].<ref>{{cite book |last=Kahneman |first=Daniel |title=Choices, Values and Frames |year=2000 |isbn=0-521-62749-4 |editor=Daniel Kahneman and Amos Tversky |chapter=Evaluation by moments, past and future}}</ref>

[[Richard Nisbett]] has argued that some [[attributional bias]]es like the [[fundamental attribution error]] are instances of the base rate fallacy: people do not use the "consensus information" (the "base rate") about how others behaved in similar situations and instead prefer simpler [[dispositional attribution]]s.<ref>{{cite book |last=Nisbett |first=Richard E. |title=Cognition and social behavior |author2=E. Borgida |author3=R. Crandall |author4=H. Reed |publisher=John Wiley & Sons, Incorporated |year=1976 |isbn=0-470-99007-4 |editor=J. S. Carroll & J. W. Payne |volume=2 |pages=227–236 |chapter=Popular induction: Information is not always informative}}</ref>

There is considerable debate in psychology on the conditions under which people do or do not appreciate base rate information.<ref name="Koehler1996">{{Cite journal |last1=Koehler |first1=J. J. |year=1996 |title=The base rate fallacy reconsidered: Descriptive, normative, and methodological challenges |journal=Behavioral and Brain Sciences |volume=19 |pages=1–17 |doi=10.1017/S0140525X00041157 |s2cid=53343238}}</ref><ref name="BarbeySloman2007">{{Cite journal |last1=Barbey |first1=A. K. |last2=Sloman |first2=S. A.
|year=2007 |title=Base-rate respect: From ecological rationality to dual processes |journal=Behavioral and Brain Sciences |volume=30 |issue=3 |pages=241–254; discussion 255–297 |doi=10.1017/S0140525X07001653 |pmid=17963533 |s2cid=31741077}}</ref> Researchers in the heuristics-and-biases program have stressed empirical findings showing that people tend to ignore base rates and make inferences that violate certain norms of probabilistic reasoning, such as [[Bayes' theorem]]. The conclusion drawn from this line of research was that human probabilistic thinking is fundamentally flawed and error-prone.<ref name="TverskyKahneman1974">{{Cite journal |last1=Tversky |first1=A. |last2=Kahneman |first2=D. |year=1974 |title=Judgment under Uncertainty: Heuristics and Biases |journal=Science |volume=185 |issue=4157 |pages=1124–1131 |bibcode=1974Sci...185.1124T |doi=10.1126/science.185.4157.1124 |pmid=17835457 |s2cid=143452957}}</ref> Other researchers have emphasized the link between cognitive processes and information formats, arguing that such conclusions are not generally warranted.<ref>{{cite journal |last=Cosmides |first=Leda |author2=John Tooby |year=1996 |title=Are humans good intuitive statisticians after all? Rethinking some conclusions of the literature on judgment under uncertainty |journal=Cognition |volume=58 |pages=1–73 |citeseerx=10.1.1.131.8290 |doi=10.1016/0010-0277(95)00664-8 |s2cid=18631755}}</ref><ref name="GigerenzerHoffrage1995">{{Cite journal |last1=Gigerenzer |first1=G. |last2=Hoffrage |first2=U. |year=1995 |title=How to improve Bayesian reasoning without instruction: Frequency formats |journal=Psychological Review |volume=102 |issue=4 |page=684 |citeseerx=10.1.1.128.3201 |doi=10.1037/0033-295X.102.4.684 |s2cid=16281385}}</ref>

Consider again Example 2 from above. The required inference is to estimate the (posterior) probability that a (randomly picked) driver is drunk, given that the breathalyzer test is positive.
Formally, this probability can be calculated using Bayes' theorem, as shown above. However, there are different ways of presenting the relevant information. Consider the following, formally equivalent variant of the problem:

: 1 out of 1000 drivers are driving drunk. The breathalyzers never fail to detect a truly drunk person. For 50 out of the 999 drivers who are not drunk the breathalyzer falsely displays drunkenness. Suppose the policemen then stop a driver at random, and force them to take a breathalyzer test. It indicates that they are drunk. No other information is known about them. Estimate the probability that the driver is really drunk.

In this case, the relevant numerical information, ''p''(drunk), ''p''(''D'' | drunk) and ''p''(''D'' | sober), is presented in terms of natural frequencies with respect to a certain reference class (see [[reference class problem]]). Empirical studies show that people's inferences correspond more closely to Bayes' rule when information is presented this way, helping to overcome base-rate neglect in laypeople<ref name="GigerenzerHoffrage1995" /> and experts.<ref name="Hoffrage2000">{{Cite journal |last1=Hoffrage |first1=U. |last2=Lindsey |first2=S. |last3=Hertwig |first3=R. |last4=Gigerenzer |first4=G. |year=2000 |title=Medicine: Communicating Statistical Information |journal=Science |volume=290 |issue=5500 |pages=2261–2262 |doi=10.1126/science.290.5500.2261 |pmid=11188724 |s2cid=33050943 |hdl-access=free |hdl=11858/00-001M-0000-0025-9B18-3}}</ref> As a consequence, organizations like the [[Cochrane Collaboration]] recommend using this kind of format for communicating health statistics.<ref name="Cochrane2011">{{Cite journal |last1=Akl |first1=E. A. |last2=Oxman |first2=A. D. |last3=Herrin |first3=J. |last4=Vist |first4=G. E. |last5=Terrenato |first5=I. |last6=Sperati |first6=F. |last7=Costiniuk |first7=C. |last8=Blank |first8=D. |last9=Schünemann |first9=H.
|year=2011 |editor1-last=Schünemann |editor1-first=Holger |title=Using alternative statistical formats for presenting risks and risk reductions |journal=The Cochrane Database of Systematic Reviews |volume=2011 |issue=3 |pages=CD006776 |doi=10.1002/14651858.CD006776.pub2 |pmc=6464912 |pmid=21412897}}</ref> Teaching people to translate these kinds of Bayesian reasoning problems into natural frequency formats is more effective than merely teaching them to plug probabilities (or percentages) into Bayes' theorem.<ref name="SedlmeierGigerenzer2002">{{Cite journal |last1=Sedlmeier |first1=P. |last2=Gigerenzer |first2=G. |year=2001 |title=Teaching Bayesian reasoning in less than two hours |url=http://edoc.mpg.de/175640 |journal=Journal of Experimental Psychology: General |volume=130 |issue=3 |pages=380–400 |doi=10.1037/0096-3445.130.3.380 |pmid=11561916 |s2cid=11147078 |hdl-access=free |hdl=11858/00-001M-0000-0025-9504-E}}</ref> It has also been shown that graphical representations of natural frequencies (e.g., icon arrays, hypothetical outcome plots) help people to make better inferences.<ref name="SedlmeierGigerenzer2002" /><ref name="Brase2008">{{Cite journal |last1=Brase |first1=G. L. |year=2009 |title=Pictorial representations in statistical reasoning |journal=Applied Cognitive Psychology |volume=23 |issue=3 |pages=369–381 |doi=10.1002/acp.1460 |s2cid=18817707}}</ref><ref name="Edwards2002">{{Cite journal |last1=Edwards |first1=A. |last2=Elwyn |first2=G. |last3=Mulley |first3=A. |year=2002 |title=Explaining risks: Turning numerical data into meaningful pictures |journal=BMJ |volume=324 |issue=7341 |pages=827–830 |doi=10.1136/bmj.324.7341.827 |pmc=1122766 |pmid=11934777}}</ref><ref>{{cite book |last1=Kim |first1=Yea-Seul |title=Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems |last2=Walls |first2=Logan A. 
|last3=Krafft |first3=Peter |last4=Hullman |first4=Jessica |date=2 May 2019 |isbn=9781450359702 |pages=1–14 |chapter=A Bayesian Cognition Approach to Improve Data Visualization |doi=10.1145/3290605.3300912 |chapter-url=https://idl.cs.washington.edu/papers/bayesian-cognition-vis/ |s2cid=57761146 |arxiv=1901.02949}}</ref>

One important reason why natural frequency formats are helpful is that they simplify the calculations needed for the required inference. This can be seen when using an alternative way of computing the required probability ''p''(drunk | ''D''):

:<math>p(\mathrm{drunk}\mid D) = \frac{N(\mathrm{drunk} \cap D)}{N(D)} = \frac{1}{51} \approx 0.0196</math>

where ''N''(drunk ∩ ''D'') denotes the number of drivers that are drunk and get a positive breathalyzer result, and ''N''(''D'') denotes the total number of cases with a positive breathalyzer result. The equivalence of this equation to the above one follows from the axioms of probability theory, according to which ''N''(drunk ∩ ''D'') = ''N'' × ''p''(''D'' | drunk) × ''p''(drunk). Importantly, although this equation is formally equivalent to Bayes' rule, it is not psychologically equivalent. Using natural frequencies simplifies the inference because the required mathematical operation can be performed on natural numbers, instead of normalized fractions (i.e., probabilities), because it makes the high number of false positives more transparent, and because natural frequencies exhibit a "nested-set structure".<ref name="Girotto2001">{{Cite journal |last1=Girotto |first1=V. |last2=Gonzalez |first2=M. |year=2001 |title=Solving probabilistic and statistical problems: A matter of information structure and question form |journal=Cognition |volume=78 |issue=3 |pages=247–276 |doi=10.1016/S0010-0277(00)00133-5 |pmid=11124351 |s2cid=8588451}}</ref><ref name="Hoffrage2002">{{Cite journal |last1=Hoffrage |first1=U. |last2=Gigerenzer |first2=G. |last3=Krauss |first3=S.
|last4=Martignon |first4=L. |year=2002 |title=Representation facilitates reasoning: What natural frequencies are and what they are not |journal=Cognition |volume=84 |issue=3 |pages=343–352 |doi=10.1016/S0010-0277(02)00050-1 |pmid=12044739 |s2cid=9595672}}</ref>

Not every frequency format facilitates Bayesian reasoning.<ref name="Hoffrage2002" /><ref name="GigerenzerHoffrage1999">{{Cite journal |last1=Gigerenzer |first1=G. |last2=Hoffrage |first2=U. |year=1999 |title=Overcoming difficulties in Bayesian reasoning: A reply to Lewis and Keren (1999) and Mellers and McGraw (1999) |url=http://edoc.mpg.de/2936 |journal=Psychological Review |volume=106 |issue=2 |pages=425 |doi=10.1037/0033-295X.106.2.425 |hdl-access=free |hdl=11858/00-001M-0000-0025-9CB4-8}}</ref> Natural frequencies refer to frequency information that results from ''natural sampling'',<ref name="Kleiter1994">{{Cite book |last1=Kleiter |first1=G. D. |title=Contributions to Mathematical Psychology, Psychometrics, and Methodology |year=1994 |isbn=978-0-387-94169-1 |series=Recent Research in Psychology |pages=375–388 |chapter=Natural Sampling: Rationality without Base Rates |doi=10.1007/978-1-4612-4308-3_27}}</ref> which preserves base rate information (e.g., the number of drunk drivers when taking a random sample of drivers). This is different from ''systematic sampling'', in which base rates are fixed ''a priori'' (e.g., in scientific experiments). In the latter case it is not possible to infer the posterior probability ''p''(drunk | positive test) by comparing the number of drivers who are drunk and test positive with the total number of people who get a positive breathalyzer result, because base rate information is not preserved and must be explicitly re-introduced using Bayes' theorem.
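
The two routes to the posterior described in this section can be compared in a short Python sketch. It is illustrative only: the function names are made up for this example, and the systematic-sampling design (999 drunk and 999 sober drivers recruited on purpose) is a hypothetical assumption that does not appear in the cited studies. The natural-sampling figures (1 drunk driver in 1000, 50 false positives among 999 sober drivers) are the ones from the breathalyzer variant above.

```python
from fractions import Fraction


def posterior_from_counts(n_drunk_positive, n_sober_positive):
    """Natural-frequency route: N(drunk and D) / N(D)."""
    return Fraction(n_drunk_positive, n_drunk_positive + n_sober_positive)


def posterior_from_bayes(p_drunk, p_pos_given_drunk, p_pos_given_sober):
    """Bayes' theorem on probabilities: p(drunk | D)."""
    true_positives = p_pos_given_drunk * p_drunk
    false_positives = p_pos_given_sober * (1 - p_drunk)
    return true_positives / (true_positives + false_positives)


# Natural sampling: 1000 random drivers, 1 drunk (always detected),
# 50 of the 999 sober drivers test positive.
natural = posterior_from_counts(1, 50)
bayes = posterior_from_bayes(Fraction(1, 1000), Fraction(1), Fraction(50, 999))

# Systematic sampling (hypothetical design): 999 drunk and 999 sober
# drivers are recruited, so the sample no longer reflects the base rate.
# Counting positives directly now overstates the posterior.
naive = posterior_from_counts(999, 50)

# Re-introducing the base rate of 1/1000 through Bayes' theorem
# recovers the same posterior as under natural sampling.
corrected = posterior_from_bayes(
    Fraction(1, 1000), Fraction(999, 999), Fraction(50, 999)
)

print("natural sampling, counts:", natural)
print("Bayes' theorem:", bayes)
print("systematic sampling, naive counts:", float(naive))
print("systematic sampling, base rate re-introduced:", corrected)
```

Exact fractions are used instead of floats so that the equality of the count-based and Bayes-based results can be checked without rounding effects.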