===Regression problems===
{{further|Stepwise regression}}
Consider two models, 1 and 2, where model 1 is 'nested' within model 2. Model 1 is the restricted model, and model 2 is the unrestricted one. That is, model 1 has ''p''<sub>1</sub> parameters, and model 2 has ''p''<sub>2</sub> parameters, where ''p''<sub>1</sub> < ''p''<sub>2</sub>, and for any choice of parameters in model 1, the same regression curve can be achieved by some choice of the parameters of model 2.

One common context in this regard is that of deciding whether a model fits the data significantly better than does a naive model, in which the only explanatory term is the intercept term, so that all predicted values for the dependent variable are set equal to that variable's sample mean. The naive model is the restricted model, since the coefficients of all potential explanatory variables are restricted to equal zero.

Another common context is deciding whether there is a structural break in the data: here the restricted model uses all data in one regression, while the unrestricted model uses separate regressions for two different subsets of the data. This use of the F-test is known as the [[Chow test]].

The model with more parameters will always be able to fit the data at least as well as the model with fewer parameters. Thus typically model 2 will give a better (i.e. lower error) fit to the data than model 1. But one often wants to determine whether model 2 gives a ''significantly'' better fit to the data. One approach to this problem is to use an ''F''-test.

If there are ''n'' data points to estimate parameters of both models from, then one can calculate the ''F'' statistic, given by

:<math>F=\frac{\left(\frac{\text{RSS}_1 - \text{RSS}_2 }{p_2 - p_1}\right)}{\left(\frac{\text{RSS}_2}{n - p_2}\right)} = \frac{\text{RSS}_1 - \text{RSS}_2 }{\text{RSS}_2} \cdot \frac{n - p_2}{p_2 - p_1},</math>

where RSS<sub>''i''</sub> is the [[residual sum of squares]] of model ''i''. If the regression model has been calculated with weights, then replace RSS<sub>''i''</sub> with χ<sup>2</sup>, the weighted sum of squared residuals. Under the null hypothesis that model 2 does not provide a significantly better fit than model 1, ''F'' will have an ''F'' distribution, with (''p''<sub>2</sub> − ''p''<sub>1</sub>, ''n'' − ''p''<sub>2</sub>) [[Degrees of freedom (statistics)|degrees of freedom]]. The null hypothesis is rejected if the ''F'' calculated from the data is greater than the critical value of the [[F-distribution|''F''-distribution]] for some desired false-rejection probability (e.g. 0.05). Since ''F'' is a monotone function of the likelihood ratio statistic, the ''F''-test is a [[likelihood ratio test]].
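As an illustrative sketch (not part of the original text), the statistic above can be computed directly from two nested least-squares fits. The example below assumes synthetic data, an intercept-only restricted model against an intercept-plus-slope unrestricted model, and NumPy/SciPy for the fits and the right-tail ''F'' probability.

<syntaxhighlight lang="python">
# Minimal sketch of the nested-model F-test; data and model choices
# here are illustrative assumptions, not from the article.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 50
x = rng.uniform(0, 10, n)
y = 2.0 + 0.5 * x + rng.normal(0, 1.0, n)  # true model has a slope

def rss(X, y):
    """Residual sum of squares from an OLS fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

# Model 1 (restricted): intercept only, p1 = 1 parameter.
X1 = np.ones((n, 1))
# Model 2 (unrestricted): intercept + slope, p2 = 2 parameters.
X2 = np.column_stack([np.ones(n), x])

rss1, rss2 = rss(X1, y), rss(X2, y)
p1, p2 = 1, 2

# F = ((RSS1 - RSS2)/(p2 - p1)) / (RSS2/(n - p2)), as in the formula above.
F = ((rss1 - rss2) / (p2 - p1)) / (rss2 / (n - p2))
p_value = stats.f.sf(F, p2 - p1, n - p2)  # right-tail probability

print(f"F = {F:.3f}, p = {p_value:.4g}")
# Reject the null hypothesis (model 2 is not significantly better)
# when p falls below the desired false-rejection probability, e.g. 0.05.
</syntaxhighlight>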