
== Applications ==
One-tailed tests are used for asymmetric distributions that have a single tail, such as the [[chi-squared distribution]], which are common in measuring [[goodness-of-fit]], or for one side of a distribution that has two tails, such as the [[normal distribution]], which is common in estimating location; this corresponds to specifying a direction. Two-tailed tests are only applicable when there are two tails, such as in the normal distribution, and correspond to considering either direction significant.<ref>{{cite journal |last1=Mundry |first1=R. |last2=Fischer |first2=J. |year=1998 |title=Use of Statistical Programs for Nonparametric Tests of Small Samples Often Leads to Incorrect P Values: Examples from Animal Behaviour |journal=Animal Behaviour |volume=56 |issue=1 |pages=256–259 |doi=10.1006/anbe.1998.0756 |pmid=9710485 |s2cid=40169869 }}</ref><ref>{{cite journal |last=Pillemer |first=D. B. |year=1991 |title=One-versus two-tailed hypothesis tests in contemporary educational research |journal=Educational Researcher |volume=20 |issue=9 |pages=13–17 |doi=10.3102/0013189X020009013 |s2cid=145478007 }}</ref>

In the approach of [[Ronald Fisher]], the [[null hypothesis]] H<sub>0</sub> will be rejected when the [[p-value|''p''-value]] of the [[test statistic]] is sufficiently small, that is, when the statistic is sufficiently extreme (vis-a-vis its [[sampling distribution]]) and thus judged unlikely to be the result of chance. This is usually done by comparing the resulting ''p''-value with the specified significance level, denoted by <math>\alpha</math>, when computing the statistical significance of a parameter. In a one-tailed test, "extreme" is decided beforehand as meaning either "sufficiently small" ''or'' "sufficiently large"; values in the other direction are considered not significant. One may report the left or right tail probability as the one-tailed ''p''-value, which ultimately corresponds to the direction in which the test statistic deviates from H<sub>0</sub>.<ref>{{Cite book|title=A modern introduction to probability and statistics : understanding why and how|url=https://archive.org/details/modernintroducti00dekk_431|url-access=limited|date=2005|publisher=Springer|others=Dekking, Michel, 1946-|isbn=9781852338961|location=London|pages=[https://archive.org/details/modernintroducti00dekk_431/page/n392 389]–390|oclc=262680588}}</ref> In a two-tailed test, "extreme" means "either sufficiently small or sufficiently large", and values in either direction are considered significant.<ref>[[John E. Freund]], (1984) ''Modern Elementary Statistics'', sixth edition. Prentice Hall. {{ISBN|0-13-593525-3}} (Section "Inferences about Means", chapter "Significance Tests", page 289.)</ref> For a given test statistic, there is a single two-tailed test and two one-tailed tests, one for each direction. Given a significance level <math>\alpha</math>, the critical region of a two-tailed test lies in both tails of the distribution, with an area of <math>\alpha/2</math> in each.
For a one-tailed test, the critical region lies entirely in a single tail, with an area of <math>\alpha</math>. Compared with a two-tailed test of the same test statistic, the corresponding one-tailed test, when the sampling distribution is symmetric, is considered either twice as significant (half the ''p''-value) if the data fall in the direction specified by the test, or not significant at all (''p''-value above <math>\alpha</math>) if the data fall in the direction opposite to the critical region specified by the test.
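
The relationship between the tail probabilities described above can be illustrated with a short sketch. It assumes a test statistic that follows a standard normal distribution under H<sub>0</sub>; the observed value 1.8, the significance level 0.05, and the function names are illustrative choices, not taken from the cited sources.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, computed from the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def tail_p_values(z: float) -> dict:
    """Return left-tailed, right-tailed and two-tailed p-values for a z statistic."""
    left = normal_cdf(z)      # P(Z <= z): evidence for "smaller than under H0"
    right = normal_cdf(-z)    # P(Z >= z): evidence for "larger than under H0"
    two_tailed = 2.0 * min(left, right)  # both tails count as extreme
    return {"left": left, "right": right, "two_tailed": two_tailed}


alpha = 0.05                  # illustrative significance level (assumption)
p = tail_p_values(1.8)        # illustrative observed statistic (assumption)

for name, value in p.items():
    verdict = "significant" if value < alpha else "not significant"
    print(f"{name:>10}: p = {value:.4f} ({verdict})")
```

With these illustrative numbers, the right-tailed ''p''-value is half the two-tailed one and falls below <math>\alpha</math>, while the two-tailed and left-tailed ''p''-values do not.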

For example, if [[#Coin flipping example|flipping a coin]], testing whether it is biased ''towards'' heads is a one-tailed test, and getting data of "all heads" would be seen as highly significant, while getting data of "all tails" would be not significant at all (''p''&nbsp;=&nbsp;1). By contrast, testing whether it is biased in ''either'' direction is a two-tailed test, and both "all heads" and "all tails" would be seen as highly significant data. In medical testing, one is generally interested in whether a treatment results in outcomes that are ''better'' than chance, which suggests a one-tailed test; however, a ''worse'' outcome is also of scientific interest, so one should use a two-tailed test, which corresponds instead to testing whether the treatment results in outcomes that are ''different'' from chance, either better or worse.<ref>J M Bland, D G Altman (BMJ, 1994) ''Statistics Notes: One and two sided tests of significance''</ref> In the archetypal [[lady tasting tea]] experiment, Fisher tested whether the lady in question was ''better'' than chance at distinguishing two types of tea preparation, not whether her ability was ''different'' from chance, and thus he used a one-tailed test.
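
The coin example can be checked numerically with an exact binomial calculation. The sketch below assumes ten flips of a coin that is fair under H<sub>0</sub>; the number of flips and the function name are hypothetical, and the two-tailed value is obtained by doubling the smaller tail, which is appropriate here only because the null distribution is symmetric.

```python
from math import comb


def coin_p_values(heads: int, flips: int) -> tuple:
    """Exact p-values for a fair-coin null hypothesis.

    Returns (towards_heads, towards_tails, two_tailed).
    """
    total = 2 ** flips  # number of equally likely sequences under H0
    # P(at least `heads` heads): one-tailed test for bias towards heads
    towards_heads = sum(comb(flips, k) for k in range(heads, flips + 1)) / total
    # P(at most `heads` heads): one-tailed test for bias towards tails
    towards_tails = sum(comb(flips, k) for k in range(0, heads + 1)) / total
    # Doubling the smaller tail is valid because the fair-coin null is symmetric
    two_tailed = min(1.0, 2 * min(towards_heads, towards_tails))
    return towards_heads, towards_tails, two_tailed


flips = 10  # hypothetical number of flips
all_heads = coin_p_values(flips, flips)
all_tails = coin_p_values(0, flips)

print("all heads:", all_heads)
print("all tails:", all_tails)
```

Under these assumptions, "all heads" gives a very small ''p''-value in the test for bias towards heads, "all tails" gives ''p''&nbsp;=&nbsp;1 in that same test, and both outcomes give the same small two-tailed ''p''-value.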