====Deviance and likelihood ratio tests====

In linear regression analysis, one is concerned with partitioning variance via the [[Partition of sums of squares|sum of squares]] calculations: variance in the criterion is essentially divided into variance accounted for by the predictors and residual variance. In logistic regression analysis, [[Deviance (statistics)|deviance]] is used in lieu of a sum of squares calculation.<ref name=Cohen/> Deviance is analogous to the sum of squares calculations in linear regression<ref name=Hosmer/> and is a measure of the lack of fit to the data in a logistic regression model.<ref name=Cohen/> When a "saturated" model is available (a model with a theoretically perfect fit), deviance is calculated by comparing a given model with the saturated model.<ref name=Hosmer/> This computation gives the [[likelihood-ratio test]]:<ref name=Hosmer/>

:<math> D = -2\ln \frac{\text{likelihood of the fitted model}} {\text{likelihood of the saturated model}}.</math>

In the above equation, {{mvar|D}} represents the deviance and ln represents the natural logarithm. The log of this likelihood ratio (the ratio of the fitted model to the saturated model) will produce a negative value, hence the need for a negative sign. {{mvar|D}} can be shown to follow an approximate [[chi-squared distribution]].<ref name=Hosmer/> Smaller values indicate better fit, as the fitted model deviates less from the saturated model. When assessed upon a chi-square distribution, nonsignificant chi-square values indicate very little unexplained variance and thus good model fit. Conversely, a significant chi-square value indicates that a significant amount of the variance is unexplained.

When the saturated model is not available (a common case), deviance is calculated simply as −2·(log likelihood of the fitted model), and the reference to the saturated model's log likelihood can be removed from all that follows without harm.
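The deviance formula above can be sketched numerically. In this hedged example the log-likelihood values are hypothetical placeholders, not results from a real fit; it only illustrates that the log-ratio form reduces to a difference of log-likelihoods and is always non-negative.

```python
import math

# Hypothetical log-likelihoods (assumed values for illustration only).
loglik_fitted = -27.9     # ln(likelihood of the fitted model)
loglik_saturated = -21.4  # ln(likelihood of the saturated model)

# D = -2 * ln(L_fitted / L_saturated) = -2 * (ln L_fitted - ln L_saturated)
D = -2.0 * (loglik_fitted - loglik_saturated)

# Because the saturated model fits at least as well as any fitted model,
# the log-ratio is <= 0, so the leading -2 makes D >= 0.
assert D >= 0.0
```

Here the saturated model's log-likelihood is higher (closer to zero) than the fitted model's, so the deviance comes out positive, matching the sign convention in the equation above.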
Two measures of deviance are particularly important in logistic regression: null deviance and model deviance. The null deviance represents the difference between a model with only the intercept (which means "no predictors") and the saturated model. The model deviance represents the difference between a model with at least one predictor and the saturated model.<ref name=Cohen/> In this respect, the null model provides a baseline upon which to compare predictor models. Given that deviance is a measure of the difference between a given model and the saturated model, smaller values indicate better fit. Thus, to assess the contribution of a predictor or set of predictors, one can subtract the model deviance from the null deviance and assess the difference on a <math>\chi^2_{s-p}</math> chi-square distribution with [[Degrees of freedom (statistics)|degrees of freedom]]<ref name=Hosmer/> equal to the difference in the number of parameters estimated. Let

:<math>\begin{align}
D_{\text{null}} &= -2\ln \frac{\text{likelihood of null model}}{\text{likelihood of the saturated model}} \\[6pt]
D_{\text{fitted}} &= -2\ln \frac{\text{likelihood of fitted model}}{\text{likelihood of the saturated model}}.
\end{align}</math>

Then the difference of the two is:

:<math>\begin{align}
D_\text{null} - D_\text{fitted} &= -2 \left(\ln \frac{\text{likelihood of null model}}{\text{likelihood of the saturated model}} - \ln \frac{\text{likelihood of fitted model}}{\text{likelihood of the saturated model}}\right) \\[6pt]
&= -2 \ln \frac{\left( \dfrac{\text{likelihood of null model}}{\text{likelihood of the saturated model}}\right)}{\left(\dfrac{\text{likelihood of fitted model}}{\text{likelihood of the saturated model}}\right)} \\[6pt]
&= -2 \ln \frac{\text{likelihood of the null model}}{\text{likelihood of fitted model}}.
\end{align}</math>

If the model deviance is significantly smaller than the null deviance then one can conclude that the predictor or set of predictors significantly improves the model's fit. This is analogous to the {{mvar|F}}-test used in linear regression analysis to assess the significance of prediction.<ref name=Cohen/>
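The cancellation derived above can be sketched in a few lines. This is a hedged illustration with hypothetical log-likelihoods (not from any real dataset); it shows that the saturated-model term drops out of the null-minus-fitted difference, and compares the resulting statistic to a standard chi-square critical value.

```python
# Hypothetical log-likelihoods for illustration only (assumed values).
loglik_null = -34.6    # intercept-only (null) model
loglik_fitted = -27.9  # model with one predictor

# D_null - D_fitted = -2 * ln(L_null / L_fitted);
# the saturated model's likelihood cancels out of the ratio.
lr_statistic = -2.0 * (loglik_null - loglik_fitted)

# Degrees of freedom = difference in number of estimated parameters
# (here, one extra slope parameter in the fitted model).
df = 1
critical_value = 3.841  # chi-square critical value, alpha = 0.05, df = 1
significant = lr_statistic > critical_value
```

With these placeholder values the statistic is 13.4, well above the 0.05 critical value for one degree of freedom, so the (hypothetical) predictor would be judged to significantly improve fit.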