Editing
Sampling bias
(section)
==Historical examples==
[[File:Acid2compliancebyusage.png|thumb|right|250px|Example of a biased sample: as of June 2008, 55% of web browsers in use ([[Internet Explorer]]) did not pass the [[Acid2]] test. Due to the nature of the test, the sample consisted mostly of web developers.<ref>{{cite web |url=https://www.w3schools.com/browsers/browsers_stats.asp |title=Browser Statistics |publisher=Refsnes Data |date=June 2008 |access-date=2008-07-05}}</ref>]]

A classic example of a biased sample and the misleading results it produced occurred in 1936. In the early days of opinion polling, the American ''[[Literary Digest]]'' magazine collected over two million postal surveys and predicted that the Republican candidate in the [[1936 United States presidential election|U.S. presidential election]], [[Alf Landon]], would beat the incumbent president, [[Franklin Roosevelt]], by a large margin. The result was the exact opposite. The ''Literary Digest'' survey represented a sample collected from readers of the magazine, supplemented by records of registered automobile owners and telephone users. This sample over-represented wealthy individuals, who, as a group, were more likely to vote for the Republican candidate. In contrast, a poll of only 50,000 citizens selected by [[George Gallup]]'s organization successfully predicted the result, leading to the popularity of the [[Gallup poll]].

Another classic example occurred in the [[1948 United States presidential election|1948 presidential election]]. On election night, the ''[[Chicago Tribune]]'' printed the headline ''[[Dewey Defeats Truman|DEWEY DEFEATS TRUMAN]]'', which turned out to be mistaken. In the morning the grinning [[president-elect of the United States|president-elect]], [[Harry S. Truman]], was photographed holding a newspaper bearing this headline. The ''Tribune'' was mistaken because its editor trusted the results of a [[phone survey]].
Survey research was then in its infancy, and few academics realized that a sample of telephone users was not representative of the general population. Telephones were not yet widespread, and those who had them tended to be prosperous and have stable addresses. (In many cities, the [[Bell System]] [[telephone directory]] contained the same names as the [[Social Register]].) In addition, the Gallup poll on which the ''Tribune'' based its headline was over two weeks old at the time of printing.<ref>{{cite web | vauthors = Lienhard JH | title = Gallup Poll | url = http://www.uh.edu/engines/epi1199.htm | access-date = 29 September 2007 | work = The Engines of Our Ingenuity }}</ref>

In [[air quality]] data, pollutants (such as [[carbon monoxide]], [[nitrogen monoxide]], [[nitrogen dioxide]], or [[ozone]]) frequently show high [[correlations]], as they stem from the same chemical process(es). These correlations depend on space (i.e., location) and time (i.e., period). Therefore, a pollutant distribution measured at one location and period is not necessarily representative of other locations and periods. If a low-cost measurement instrument is calibrated with field data in a multivariate manner, more precisely by collocation next to a reference instrument, the relationships between the different compounds are incorporated into the calibration model.
Relocating the measurement instrument can therefore produce erroneous results, because the relationships learned at the calibration site may not hold at the new location.<ref>{{cite journal | vauthors = Tancev G, Pascale C | title = The Relocation Problem of Field Calibrated Low-Cost Sensor Systems in Air Quality Monitoring: A Sampling Bias | journal = Sensors | volume = 20 | issue = 21 | pages = 6198 | date = October 2020 | pmid = 33143233 | pmc = 7662848 | doi = 10.3390/s20216198 | bibcode = 2020Senso..20.6198T | doi-access = free }}</ref>

A twenty-first-century example is the [[COVID-19 pandemic]], where variations in sampling bias in [[COVID-19 testing]] have been shown to account for wide variations in both [[Case fatality rate|case fatality rates]] and the [[age distribution]] of cases across countries.<ref>{{cite report | vauthors = Ward D | title = Sampling Bias: Explaining Wide Variations in COVID-19 Case Fatality Rates. | date = 20 April 2020| work = Preprint | location = Bern, Switzerland | doi = 10.13140/RG.2.2.24953.62564/1 }}</ref><ref>{{cite journal | vauthors = Böttcher L, [[Maria Rita D'Orsogna|D'Orsogna MR]], Chou T | title = Using excess deaths and testing statistics to determine COVID-19 mortalities. | journal = European Journal of Epidemiology | volume = 36 | pages = 545–558 | date = May 2021 | issue = 5 | doi = 10.1007/s10654-021-00748-2 | doi-access = free | pmid = 34002294 | pmc = 8127858 }}</ref>
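
The effect of testing-related sampling bias on an observed case fatality rate can be illustrated with a small calculation. The following is only a minimal sketch under simplifying assumptions: the function name <code>observed_cfr</code>, its parameters, and all numbers are hypothetical and are not taken from the cited studies. It assumes that every death occurs among severe infections and that, among severe infections, being tested does not change the risk of death.

```python
def observed_cfr(ifr, severe_share, p_test_severe, p_test_mild, infections=100_000):
    """Return the naive case fatality rate (confirmed deaths / confirmed cases).

    Simplifying assumptions (illustrative only):
    - every death occurs among severe infections;
    - among severe infections, being tested does not change the risk of death.
    """
    severe = infections * severe_share
    mild = infections - severe
    # Fatality risk for a severe infection implied by the overall rate.
    fatality_among_severe = ifr / severe_share

    confirmed_severe = severe * p_test_severe
    confirmed_mild = mild * p_test_mild
    confirmed_deaths = confirmed_severe * fatality_among_severe

    return confirmed_deaths / (confirmed_severe + confirmed_mild)


# Same hypothetical disease (1% infection fatality rate, 20% severe)
# observed under three hypothetical testing regimes:
# (probability a severe case is tested, probability a mild case is tested).
scenarios = {
    "mostly severe cases tested": (0.9, 0.1),
    "broad testing": (0.9, 0.8),
    "uniform random testing": (0.5, 0.5),
}
for name, (p_severe, p_mild) in scenarios.items():
    cfr = observed_cfr(0.01, 0.2, p_severe, p_mild)
    print(f"{name}: observed CFR = {cfr:.2%}")
```

Under these assumed inputs, the three regimes report observed case fatality rates of roughly 3.5%, 1.1% and 1.0%, although the underlying fatality rate is identical in all three. Only the uniform sample, in which every infection is equally likely to be tested, recovers the assumed 1%.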