===The ''F''-test===
{{Main|F-test}}
[[File:F-Distribution Table.png|thumb|338x338px|To check a one-way ANOVA for statistical significance at the {{Math|0.05}} alpha level, consult the ''F''-probability table. After computing the ''F''-statistic, compare it with the critical value found at the intersection of the numerator and denominator degrees of freedom; if the ''F''-statistic exceeds the critical value, the result is statistically significant at the {{Math|0.05}} alpha level.]]
The [[F-test|''F''-test]] is used for comparing the factors of the total deviation. For example, in one-way (single-factor) ANOVA, statistical significance is tested for by comparing the F test statistic
<math display="block">F = \frac{\text{variance between treatments}}{\text{variance within treatments}}</math>
<math display="block">F = \frac{MS_\text{Treatments}}{MS_\text{Error}} = {{SS_\text{Treatments} / (I-1)} \over {SS_\text{Error} / (n_T-I)}}</math>
where ''MS'' is mean square, <math>I</math> is the number of treatments and <math>n_T</math> is the total number of cases, to the [[F-distribution|''F''-distribution]] with <math>I - 1</math> numerator degrees of freedom and <math>n_T - I</math> denominator degrees of freedom. The ''F''-distribution is a natural candidate because the test statistic is the ratio of two scaled sums of squares, each of which follows a scaled [[chi-squared distribution]].

The expected value of F is <math>1 + {n \sigma^2_\text{Treatment}} / {\sigma^2_\text{Error}}</math> (where <math>n</math> is the treatment sample size), which is 1 when there is no treatment effect. As values of F increase above 1, the evidence is increasingly inconsistent with the null hypothesis. Two apparent experimental methods of increasing F are increasing the sample size and reducing the error variance by tight experimental controls.

There are two methods of concluding the ANOVA hypothesis test, both of which produce the same result (both are illustrated in the sketch after this list):
* The textbook method is to compare the observed value of F with the critical value of F determined from tables. The critical value of F is a function of the degrees of freedom of the numerator and the denominator and the significance level (''α''). If F ≥ F<sub>Critical</sub>, the null hypothesis is rejected.
* The computer method calculates the probability (p-value) of a value of F greater than or equal to the observed value. The null hypothesis is rejected if this probability is less than or equal to the significance level (''α'').
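The following minimal sketch shows both decision methods in Python using SciPy's ''F''-distribution routines (<code>stats.f.ppf</code> for the critical value, <code>stats.f.sf</code> for the p-value). The three treatment groups are hypothetical values chosen for illustration, not data from the sources cited in this section.

<syntaxhighlight lang="python">
import numpy as np
from scipy import stats

# Three hypothetical treatment groups (illustrative values only).
groups = [
    np.array([6.2, 5.9, 6.8, 6.1, 6.4]),
    np.array([7.1, 7.4, 6.9, 7.3, 7.0]),
    np.array([5.8, 6.0, 5.5, 6.3, 5.7]),
]

I = len(groups)                        # number of treatments
n_T = sum(len(g) for g in groups)      # total number of cases
grand_mean = np.concatenate(groups).mean()

# Between-treatments ("Treatments") sum of squares and mean square.
ss_treat = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ms_treat = ss_treat / (I - 1)

# Within-treatments ("Error") sum of squares and mean square.
ss_error = sum(((g - g.mean()) ** 2).sum() for g in groups)
ms_error = ss_error / (n_T - I)

F = ms_treat / ms_error

# Textbook method: compare F with the critical value at alpha = 0.05.
alpha = 0.05
f_crit = stats.f.ppf(1 - alpha, dfn=I - 1, dfd=n_T - I)

# Computer method: p-value = P(F' >= F) under the null hypothesis.
p_value = stats.f.sf(F, dfn=I - 1, dfd=n_T - I)

print(f"F = {F:.3f}  F_crit = {f_crit:.3f}  p = {p_value:.4f}")
print("reject H0" if F >= f_crit else "fail to reject H0")

# Cross-check against SciPy's built-in one-way ANOVA.
F_check, p_check = stats.f_oneway(*groups)
assert np.isclose(F, F_check) and np.isclose(p_value, p_check)
</syntaxhighlight>

The two branches necessarily agree: F ≥ F<sub>Critical</sub> exactly when the p-value is at most ''α'', because both compare the same statistic against the same ''F''-distribution.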
The ANOVA ''F''-test is known to be nearly optimal in the sense of minimizing false negative errors for a fixed rate of false positive errors (i.e. maximizing power for a fixed significance level). For example, to test the hypothesis that various medical treatments have exactly the same effect, the [[F-test|''F''-test]]'s ''p''-values closely approximate the [[permutation test]]'s [[p-value]]s: the approximation is particularly close when the design is balanced.<ref name="HinkelmannKempthorne" /><ref>Hinkelmann and Kempthorne (2008, Volume 1, Section 6.7: Completely randomized design; CRD with unequal numbers of replications)</ref> Such [[permutation test]]s characterize [[uniformly most powerful test|tests with maximum power]] against all [[alternative hypothesis|alternative hypotheses]], as observed by [[Paul R. Rosenbaum|Rosenbaum]].<ref group="nb">Rosenbaum (2002, page 40) cites Section 5.7 (Permutation Tests), Theorem 2.3 (actually Theorem 3, page 184) of [[Erich Leo Lehmann|Lehmann]]'s ''Testing Statistical Hypotheses'' (1959).</ref> The ANOVA ''F''-test (of the null hypothesis that all treatments have exactly the same effect) is recommended as a practical test because of its robustness against many alternative distributions.<ref>Moore and McCabe (2003, page 763)</ref><ref group="nb">The ''F''-test for the comparison of variances has a mixed reputation. It is not recommended as a hypothesis test to determine whether two ''different'' samples have the same variance. It is recommended for ANOVA, where two estimates of the variance of the ''same'' sample are compared. While the ''F''-test is not generally robust against departures from normality, it has been found to be robust in the special case of ANOVA. Citations from Moore & McCabe (2003): "Analysis of variance uses F statistics, but these are not the same as the F statistic for comparing two population standard deviations." (page 554) "The F test and other procedures for inference about variances are so lacking in robustness as to be of little use in practice." (page 556) "[The ANOVA ''F''-test] is relatively insensitive to moderate nonnormality and unequal variances, especially when the sample sizes are similar." (page 763) ANOVA assumes homoscedasticity, but it is robust. The statistical test for homoscedasticity (the ''F''-test) is not robust. Moore & McCabe recommend a rule of thumb.</ref>
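As an illustration of the approximation described above, the sketch below builds the permutation distribution of the ''F''-statistic by reshuffling treatment labels and compares the resulting p-value with the one from the ''F''-distribution. The balanced design, group sizes, and injected treatment effect are assumptions made for this example only.

<syntaxhighlight lang="python">
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def f_statistic(values, labels):
    """One-way ANOVA F-statistic for observations grouped by integer labels."""
    groups = [values[labels == k] for k in np.unique(labels)]
    I, n_T = len(groups), len(values)
    grand_mean = values.mean()
    ms_treat = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups) / (I - 1)
    ms_error = sum(((g - g.mean()) ** 2).sum() for g in groups) / (n_T - I)
    return ms_treat / ms_error

# Hypothetical balanced design: 3 treatments, 5 cases each, with a
# modest effect added to the second treatment group.
labels = np.repeat([0, 1, 2], 5)
values = rng.normal(size=15) + np.where(labels == 1, 1.0, 0.0)

F_obs = f_statistic(values, labels)

# Permutation distribution: reshuffle the treatment labels many times
# and count how often the permuted F meets or exceeds the observed F.
n_perm = 10_000
count = sum(
    f_statistic(values, rng.permutation(labels)) >= F_obs
    for _ in range(n_perm)
)
p_perm = (count + 1) / (n_perm + 1)  # add-one rule avoids a zero p-value

# p-value from the F-distribution, for comparison.
p_f = stats.f.sf(F_obs, dfn=3 - 1, dfd=15 - 3)
print(f"permutation p = {p_perm:.4f}  F-distribution p = {p_f:.4f}")
</syntaxhighlight>

For a balanced design such as this one, the permutation p-value and the ''F''-distribution p-value should track each other closely, which is the behavior the cited sources describe.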