== Neyman–Pearson hypothesis testing ==

An example of Neyman–Pearson hypothesis testing (or null hypothesis statistical significance testing) can be made by a change to the radioactive suitcase example. If the "suitcase" is actually a shielded container for the transportation of radioactive material, then a test might be used to select among three hypotheses: no radioactive source present, one present, two (all) present. The test could be required for safety, with actions required in each case. The [[Neyman–Pearson lemma]] of hypothesis testing says that a good criterion for the selection of hypotheses is the ratio of their probabilities (a [[likelihood-ratio test|likelihood ratio]]). A simple method of solution is to select the hypothesis with the highest probability for the Geiger counts observed. The typical result matches intuition: few counts imply no source, many counts imply two sources and intermediate counts imply one source. Notice also that usually there are problems for [[Philosophic burden of proof#Proving a negative|proving a negative]]. Null hypotheses should be at least [[Falsifiability|falsifiable]].

Neyman–Pearson theory can accommodate both prior probabilities and the costs of actions resulting from decisions.<ref name="Ash">{{cite book | last = Ash | first = Robert | title = Basic probability theory | publisher = Wiley | location = New York | year = 1970 | isbn = 978-0471034506 }}Section 8.2</ref> The former allows each test to consider the results of earlier tests (unlike Fisher's significance tests). The latter allows the consideration of economic issues (for example) as well as probabilities. A likelihood ratio remains a good criterion for selecting among hypotheses.

The two forms of hypothesis testing are based on different problem formulations. The original test is analogous to a true/false question; the Neyman–Pearson test is more like multiple choice.
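The "select the hypothesis with the highest probability" rule above can be sketched in a few lines of Python. The Poisson count rates below are illustrative assumptions (the article gives no numbers), chosen only so that low, intermediate, and high counts favour zero, one, and two sources respectively:

```python
import math

# Hypothetical mean Geiger counts per measurement interval under each
# hypothesis. These rates are illustrative, not from the article.
RATES = {"no source": 2.0, "one source": 12.0, "two sources": 22.0}

def poisson_pmf(k, lam):
    """Probability of observing k counts when the mean count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def most_likely_hypothesis(observed_counts):
    """Pick the hypothesis giving the observed count the highest probability."""
    return max(RATES, key=lambda h: poisson_pmf(observed_counts, RATES[h]))

print(most_likely_hypothesis(1))   # few counts -> "no source"
print(most_likely_hypothesis(11))  # intermediate counts -> "one source"
print(most_likely_hypothesis(25))  # many counts -> "two sources"
```

Comparing each pair of hypotheses this way is equivalent to thresholding their likelihood ratio, which is the criterion the Neyman–Pearson lemma identifies as optimal.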
In the view of [[John Tukey|Tukey]]<ref name="Tukey60" /> the former produces a conclusion on the basis of only strong evidence while the latter produces a decision on the basis of available evidence. While the two tests seem quite different both mathematically and philosophically, later developments led to the opposite claim. Consider many tiny radioactive sources. The hypotheses become 0, 1, 2, 3... grains of radioactive sand. There is little distinction between none or some radiation (Fisher) and 0 grains of radioactive sand versus all of the alternatives (Neyman–Pearson). The major Neyman–Pearson paper of 1933<ref name="Neyman 289–337" /> also considered composite hypotheses (ones whose distribution includes an unknown parameter). An example proved the optimality of the (Student's) ''t''-test, "there can be no better test for the hypothesis under consideration" (p 321). Neyman–Pearson theory was proving the optimality of Fisherian methods from its inception. Fisher's significance testing has proven a popular flexible statistical tool in application with little mathematical growth potential. Neyman–Pearson hypothesis testing is claimed as a pillar of mathematical statistics,<ref>{{cite journal | last = Stigler | first = Stephen M. | title = The History of Statistics in 1933 | journal = Statistical Science | volume = 11 | issue = 3 | pages = 244–252 | date = August 1996 | jstor=2246117 | doi=10.1214/ss/1032280216| doi-access = free}}</ref> creating a new paradigm for the field. It also stimulated new applications in [[statistical process control]], [[detection theory]], [[decision theory]] and [[game theory]]. Both formulations have been successful, but the successes have been of a different character. The dispute over formulations is unresolved. Science primarily uses Fisher's (slightly modified) formulation as taught in introductory statistics. Statisticians study Neyman–Pearson theory in graduate school. Mathematicians are proud of uniting the formulations.
Philosophers consider them separately. Learned opinions deem the formulations variously competitive (Fisher vs Neyman), incompatible<ref name="ftp.isds.duke" /> or complementary.<ref name="Lehmann93" /> The dispute has become more complex since Bayesian inference has achieved respectability. The terminology is inconsistent. Hypothesis testing can mean any mixture of two formulations that both changed with time. Any discussion of significance testing vs hypothesis testing is doubly vulnerable to confusion. Fisher thought that hypothesis testing was a useful strategy for performing industrial quality control; however, he strongly disagreed that hypothesis testing could be useful for scientists.<ref name="Fisher 1955 69–78"/> Hypothesis testing provides a means of finding test statistics used in significance testing.<ref name="Lehmann93" /> The concept of power is useful in explaining the consequences of adjusting the significance level and is heavily used in [[sample size determination]]. The two methods remain philosophically distinct.<ref name=Lenhard/> They usually (but ''not always'') produce the same mathematical answer. The preferred answer is context dependent.<ref name="Lehmann93">{{cite journal|last=Lehmann|first=E. L.|title=The Fisher, Neyman–Pearson Theories of Testing Hypotheses: One Theory or Two?|journal=Journal of the American Statistical Association|volume=88|issue=424|pages=1242–1249|date=December 1993|doi=10.1080/01621459.1993.10476404}}</ref> While the existing merger of Fisher and Neyman–Pearson theories has been heavily criticized, modifying the merger to achieve Bayesian goals has been considered.<ref>{{cite journal|last=Berger|first=James O.|title=Could Fisher, Jeffreys and Neyman Have Agreed on Testing?|journal=Statistical Science|volume=18|issue=1|pages=1–32|year=2003|doi=10.1214/ss/1056397485|doi-access=free}}</ref>