Simpson's paradox
{{Short description|Error in statistical reasoning with groups}}
[[Image:Simpson's paradox continuous.svg|thumb|Simpson's paradox for quantitative data: a positive trend (<span style="display:inline-block; vertical-align:middle; width:2em; height:0px; border-style:none; border-top:2px solid blue;"> </span>, <span style="display:inline-block; vertical-align:middle; width:2em; height:0px; border-style:none; border-top:2px solid red;"> </span>) appears for two separate groups, whereas a negative trend (<span style="display:inline-block; vertical-align:middle; width:2em; height:0px; border-style:none; border-top:2px dashed #000;"> </span>) appears when the groups are combined.]]
[[File:Simpsons paradox - animation.gif|thumb|Visualization of Simpson's paradox on data resembling real-world variability, showing that the risk of misjudging the true causal relationship can be hard to spot.]]
'''Simpson's paradox''' is a phenomenon in [[probability]] and [[statistics]] in which a trend appears in several groups of data but disappears or reverses when the groups are combined. This result is often encountered in social-science and medical-science statistics,<ref> {{cite journal | title = Simpson's Paradox in Real Life | author = Clifford H. Wagner |date=February 1982 | journal = [[The American Statistician]] | volume = 36 | issue = 1 | pages = 46–48 | doi = 10.2307/2684093 | jstor = 2684093 }}</ref><ref>Holt, G. B. (2016). [http://jco.ascopubs.org/content/34/9/1016.1.full Potential Simpson's paradox in multicenter study of intraperitoneal chemotherapy for ovarian cancer.] Journal of Clinical Oncology, 34(9), 1016–1016.</ref><ref name="VogelFranks2017">{{cite journal|last1=Franks|first1=Alexander|author-link2=Edoardo Airoldi|last2=Airoldi|first2=Edoardo|last3=Slavov|first3=Nikolai|title=Post-transcriptional regulation across human tissues|journal=PLOS Computational Biology|volume=13|issue=5|year=2017|pages=e1005535|issn=1553-7358|doi=10.1371/journal.pcbi.1005535|pmid=28481885|pmc=5440056|arxiv=1506.00219|bibcode=2017PLSCB..13E5535F |doi-access=free }}</ref> and is particularly problematic when frequency data are unduly given [[causal]] interpretations.<ref name="pearl">[[Judea Pearl]]. ''Causality: Models, Reasoning, and Inference'', Cambridge University Press (2000, 2nd edition 2009). {{isbn|0-521-77362-8}}.</ref> The paradox can be resolved when [[confounding variable]]s and causal relations are appropriately addressed in the statistical modeling<ref name="pearl" /><ref>Kock, N., & Gaskins, L. (2016). [http://cits.tamiu.edu/kock/pubs/journals/2016JournalIJANS_ModJCveNetCorrp/Kock_Gaskins_2016_IJANS_SimpPdox.pdf Simpson's paradox, moderation and the emergence of quadratic relationships in path models: An information systems illustration.] International Journal of Applied Nonlinear Science, 2(3), 200–234.</ref> (e.g., through [[cluster analysis]]).<ref>Rogier A. Kievit, Willem E. Frankenhuis, Lourens J. Waldorp and Denny Borsboom, Simpson's paradox in psychological science: a practical guide https://doi.org/10.3389/fpsyg.2013.00513</ref>

Simpson's paradox has been used to illustrate the kind of misleading results that the [[misuse of statistics]] can generate.<ref>Robert L. Wardrop (February 1995). "Simpson's Paradox and the Hot Hand in Basketball". ''The American Statistician'', ''' 49 (1)''': pp. 24–28.</ref><ref>[[Alan Agresti]] (2002). "Categorical Data Analysis" (Second edition). [[John Wiley and Sons]] {{isbn|0-471-36093-7}}</ref>

[[Edward H. Simpson]] first described this phenomenon in a technical paper in 1951;<ref> {{cite journal | title=The Interpretation of Interaction in Contingency Tables | author = Simpson, Edward H. | year = 1951 | journal = Journal of the Royal Statistical Society, Series B | volume = 13 | issue = 2 | pages = 238–241 | doi = 10.1111/j.2517-6161.1951.tb00088.x }}</ref> the statisticians [[Karl Pearson]] (in 1899)<ref> {{Cite journal | last1 = Pearson | first1 = Karl | author1-link = Karl Pearson | last2 = Lee | first2 = Alice | last3 = Bramley-Moore | first3 = Lesley | title = Genetic (reproductive) selection: Inheritance of fertility in man, and of fecundity in thoroughbred racehorses | journal = [[Philosophical Transactions of the Royal Society A]] | volume = 192 | pages = 257–330 | year = 1899 | doi = 10.1098/rsta.1899.0006 | doi-access = free }}</ref> and [[Udny Yule]] (in 1903)<ref name=yule> {{Cite journal | title = Notes on the Theory of Association of Attributes in Statistics | author = G. U. Yule | year = 1903 | journal = [[Biometrika]] | volume = 2 | pages = 121–134 | doi = 10.1093/biomet/2.2.121 | issue = 2 | url = https://zenodo.org/record/1431599 }}</ref> had mentioned similar effects earlier. The name ''Simpson's paradox'' was introduced by Colin R. Blyth in 1972.<ref name="blyth-72"> {{cite journal | title = On Simpson's Paradox and the Sure-Thing Principle | author = Colin R. Blyth | date=June 1972 | journal = Journal of the American Statistical Association | volume = 67 | issue = 338 | pages = 364–366 | doi = 10.2307/2284382 | jstor = 2284382 }}</ref> It is also referred to as '''Simpson's reversal''', the '''Yule–Simpson effect''', the '''amalgamation paradox''', or the '''reversal paradox'''.<ref> {{cite journal|author=[[I. J. Good]], [[Yash Mittal|Y. Mittal]]|date=June 1987|title=The Amalgamation and Geometry of Two-by-Two Contingency Tables|journal=[[The Annals of Statistics]]|volume=15|issue=2|pages=694–711|doi=10.1214/aos/1176350369|issn=0090-5364|jstor=2241334|doi-access=free}}</ref>

Mathematician [[Jordan Ellenberg]] argues that Simpson's paradox is misnamed as "there's no contradiction involved, just two different ways to think about the same data" and suggests that its lesson "isn't really to tell us which viewpoint to take but to insist that we keep both the parts and the whole in mind at once."<ref>{{Cite book |last=Ellenberg |first=Jordan |url=https://www.worldcat.org/oclc/1226171979 |title=Shape: The Hidden Geometry of Information, Biology, Strategy, Democracy and Everything Else |date=May 25, 2021 |publisher=[[Penguin Press]] |isbn=978-1-9848-7905-9 |location=New York |pages=228 |oclc=1226171979}}</ref>
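
The reversal described above can be reproduced with a few lines of arithmetic. The following is a minimal sketch and is not drawn from any of the cited sources: the counts, the group names, the option labels <code>A</code> and <code>B</code>, and the helper functions <code>rate</code> and <code>pooled_rate</code> are all hypothetical, invented only to show how a comparison that holds within every group can flip once the groups are pooled.

```python
from fractions import Fraction

# Hypothetical (successes, trials) counts, invented purely for illustration.
# Option A is mostly tried in group_2 (low success rates overall),
# option B mostly in group_1 (high success rates overall).
counts = {
    "group_1": {"A": (8, 10), "B": (70, 100)},
    "group_2": {"A": (30, 100), "B": (2, 10)},
}


def rate(successes, trials):
    """Exact success rate as a fraction."""
    return Fraction(successes, trials)


def pooled_rate(option):
    """Success rate for one option after combining all groups."""
    successes = sum(counts[g][option][0] for g in counts)
    trials = sum(counts[g][option][1] for g in counts)
    return Fraction(successes, trials)


# Within each group, compare A against B.
a_wins_in_group = {
    g: rate(*counts[g]["A"]) > rate(*counts[g]["B"]) for g in counts
}
a_wins_every_group = all(a_wins_in_group.values())

# After pooling, make the same comparison on the combined counts.
a_wins_pooled = pooled_rate("A") > pooled_rate("B")

print("A better in every group:", a_wins_every_group)
print("A better when pooled:   ", a_wins_pooled)
```

In this made-up table the group sizes are deliberately unbalanced, so group membership acts as a confounding variable: it influences both which option is chosen and how likely success is. That imbalance is what allows the pooled comparison to disagree with the per-group comparisons.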