Statistical significance
==History==
{{Main|History of statistics}}
Statistical significance dates to the 18th century, in the work of [[John Arbuthnot]] and [[Pierre-Simon Laplace]], who computed the [[p-value|''p''-value]] for the [[human sex ratio]] at birth, assuming a null hypothesis of equal probability of male and female births; see {{slink|p-value|History|display=''p''-value}} for details.<ref>{{cite book |title=The Descent of Human Sex Ratio at Birth |first1=Éric |last1=Brian |first2=Marie |last2=Jaisson |chapter=Physico-Theology and Mathematics (1710–1794) |pages=1–25 |year=2007 |publisher=Springer Science & Business Media |isbn=978-1-4020-6036-6}}</ref><ref>{{cite journal |author=John Arbuthnot |title=An argument for Divine Providence, taken from the constant regularity observed in the births of both sexes |journal=[[Philosophical Transactions of the Royal Society of London]] |volume=27 |pages=186–190 |year=1710 |url=http://www.york.ac.uk/depts/maths/histstat/arbuthnot.pdf |doi=10.1098/rstl.1710.0011 |issue=325–336 |doi-access=free}}</ref><ref name="Conover1999">{{Citation |last=Conover |first=W.J. |title=Practical Nonparametric Statistics |edition=Third |year=1999 |publisher=Wiley |isbn=978-0-471-16068-7 |pages=157–176 |chapter=Chapter 3.4: The Sign Test}}</ref><ref name="Sprent1989">{{Citation |last=Sprent |first=P. |title=Applied Nonparametric Statistical Methods |edition=Second |year=1989 |publisher=Chapman & Hall |isbn=978-0-412-44980-2}}</ref><ref>{{cite book |title=The History of Statistics: The Measurement of Uncertainty Before 1900 |first=Stephen M. |last=Stigler |publisher=Harvard University Press |year=1986 |isbn=978-0-674-40341-3 |pages=[https://archive.org/details/historyofstatist00stig/page/225 225–226]}}</ref><ref name="Bellhouse2001">{{Citation |last=Bellhouse |first=David |title=Statisticians of the Centuries |editor1-link=Chris Heyde |editor1=C.C. Heyde |editor2-link=Eugene Seneta |editor2=E. Seneta |year=2001 |publisher=Springer |isbn=978-0-387-95329-8 |pages=39–42 |chapter=John Arbuthnot}}</ref><ref name="Hald1998">{{Citation |last=Hald |first=Anders |title=A History of Mathematical Statistics from 1750 to 1930 |year=1998 |publisher=Wiley |pages=65 |chapter=Chapter 4. Chance or Design: Tests of Significance}}</ref>

In 1925, [[Ronald Fisher]] advanced the idea of statistical hypothesis testing, which he called "tests of significance", in his publication ''[[Statistical Methods for Research Workers]]''.<ref name="Cumming">{{cite book|title=Understanding The New Statistics: Effect Sizes, Confidence Intervals, and Meta-Analysis|publisher=Routledge|year=2011|isbn=978-0-415-87968-2|series=Multivariate Applications Series|location=East Sussex, United Kingdom|pages=21–52|chapter=From null hypothesis significance to testing effect sizes|last1=Cumming|first1=Geoff}}</ref><ref name="Fisher1925">{{cite book|title=Statistical Methods for Research Workers|publisher=Oliver and Boyd|year=1925|location=Edinburgh, UK|pages=[https://archive.org/details/statisticalmethoe7fish/page/43 43]|last1=Fisher|first1=Ronald A.|isbn=978-0-05-002170-5|url=https://archive.org/details/statisticalmethoe7fish/page/43}}</ref><ref name="Poletiek">{{cite book|title=Hypothesis-testing Behaviour|publisher=Psychology Press|year=2001|isbn=978-1-84169-159-6|edition=1st|series=Essays in Cognitive Psychology|location=East Sussex, United Kingdom|pages=29–48|chapter=Formal theories of testing|last1=Poletiek|first1=Fenna H.}}</ref> Fisher suggested a probability of one in twenty (0.05) as a convenient cutoff level at which to reject the null hypothesis.<ref name=Quinn>{{cite book |last1=Quinn |first1=Geoffrey R. |last2=Keough |first2=Michael J. |title=Experimental Design and Data Analysis for Biologists |edition=1st |publisher=Cambridge University Press |location=Cambridge, UK |year=2002 |isbn=978-0-521-00976-8 |pages=[https://archive.org/details/experimentaldesi0000quin/page/46 46–69] |url=https://archive.org/details/experimentaldesi0000quin/page/46}}</ref> In a 1933 paper, [[Jerzy Neyman]] and [[Egon Pearson]] called this cutoff the ''significance level'' and denoted it <math>\alpha</math>. They recommended that <math>\alpha</math> be set before any data are collected.<ref name=Quinn /><ref name="Neyman">{{Cite journal|last2=Pearson|first2=E. S.|year=1933|title=The testing of statistical hypotheses in relation to probabilities a priori|journal=[[Mathematical Proceedings of the Cambridge Philosophical Society]]|volume=29|issue=4|pages=492–510|doi=10.1017/S030500410001152X|last1=Neyman|first1=J.|bibcode=1933PCPS...29..492N |s2cid=119855116}}</ref> Despite his initial suggestion of 0.05 as a significance level, Fisher did not intend this cutoff value to be fixed: in his 1956 publication ''Statistical Methods and Scientific Inference'', he recommended that significance levels be set according to the specific circumstances.<ref name=Quinn />

===Related concepts===
The significance level <math>\alpha</math> is the threshold below which a ''p''-value, computed under the assumption that the null hypothesis is true, leads to rejection of the null hypothesis. It follows that <math>\alpha</math> is also the probability of mistakenly rejecting the null hypothesis when it is in fact true.<ref name="Dalgaard" /> Such a mistake is known as a [[False positives and false negatives#False positive error|false positive]] or a [[Type I and type II errors#Type I error|type I error]]. Researchers sometimes work instead with the [[confidence level]] {{math|''γ'' {{=}} (1 − ''α'')}}, the probability of not rejecting the null hypothesis given that it is true.<ref>"Conclusions about statistical significance are possible with the help of the confidence interval. If the confidence interval does not include the value of zero effect, it can be assumed that there is a statistically significant result." {{cite journal|title=Confidence Interval or P-Value?|journal=Deutsches Ärzteblatt Online|volume=106|issue=19|pages=335–9|doi=10.3238/arztebl.2009.0335|pmid=19547734|pmc=2689604|year=2009|last1=Prel|first1=Jean-Baptist du|last2=Hommel|first2=Gerhard|last3=Röhrig|first3=Bernd|last4=Blettner|first4=Maria}}</ref><ref>[https://www.cscu.cornell.edu/news/statnews/stnews73.pdf StatNews #73: Overlapping Confidence Intervals and Statistical Significance]</ref> Confidence levels and confidence intervals were introduced by Neyman in 1937.<ref name="Neyman1937">{{cite journal|year=1937|title=Outline of a Theory of Statistical Estimation Based on the Classical Theory of Probability|jstor=91337|journal=[[Philosophical Transactions of the Royal Society A]]|volume=236|issue=767|pages=333–380|doi=10.1098/rsta.1937.0005|last1=Neyman|first1=J.|bibcode=1937RSPTA.236..333N |author-link=Jerzy Neyman|s2cid=19584450}}</ref>
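The interpretation of <math>\alpha</math> as a type I error rate can be checked by simulation. The sketch below is purely illustrative and not drawn from the cited sources: the fair-coin null model (echoing the sign test used by Arbuthnot) and the function name are assumptions made for the example. It repeatedly generates data for which the null hypothesis is true and applies an exact two-sided sign test, so the fraction of ''p''-values falling at or below <math>\alpha = 0.05</math> estimates the type I error rate.

<syntaxhighlight lang="python">
import random
from math import comb

def sign_test_p_value(successes: int, n: int) -> float:
    """Exact two-sided binomial (sign) test of H0: success probability = 1/2."""
    k = min(successes, n - successes)
    # One-tail probability of an outcome at least as extreme as the one observed.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

random.seed(0)
alpha = 0.05            # significance level, fixed before any data are generated
n, trials = 100, 10_000
rejections = 0
for _ in range(trials):
    # Generate data under a true null hypothesis: n fair coin flips.
    successes = sum(random.random() < 0.5 for _ in range(n))
    if sign_test_p_value(successes, n) <= alpha:
        rejections += 1

# The achieved rate is at most alpha; the discreteness of the sign test
# makes it conservative (here roughly 0.035 rather than 0.05).
print(f"Type I error rate under a true null: {rejections / trials:.3f}")
</syntaxhighlight>

Because the sign test statistic is discrete, the observed rejection rate lands somewhat below 0.05 rather than exactly on it; what <math>\alpha</math> guarantees is an upper bound on the probability of a false positive.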