====Deviance and likelihood ratio tests====

In linear regression analysis, one is concerned with partitioning variance via the [[Partition of sums of squares|sum of squares]] calculations: variance in the criterion is essentially divided into variance accounted for by the predictors and residual variance. In logistic regression analysis, [[Deviance (statistics)|deviance]] is used in lieu of a sum of squares calculation.<ref name=Cohen/> Deviance is analogous to the sum of squares calculations in linear regression<ref name=Hosmer/> and is a measure of the lack of fit to the data in a logistic regression model.<ref name=Cohen/> When a "saturated" model is available (a model with a theoretically perfect fit), deviance is calculated by comparing a given model with the saturated model.<ref name=Hosmer/> This computation gives the [[likelihood-ratio test]]:<ref name=Hosmer/>

:<math> D = -2\ln \frac{\text{likelihood of the fitted model}} {\text{likelihood of the saturated model}}.</math>

In the above equation, {{mvar|D}} represents the deviance and ln represents the natural logarithm. The log of this likelihood ratio (the ratio of the fitted model to the saturated model) will produce a negative value, hence the need for a negative sign. {{mvar|D}} can be shown to follow an approximate [[chi-squared distribution]].<ref name=Hosmer/> Smaller values indicate better fit, as the fitted model deviates less from the saturated model. When assessed upon a chi-square distribution, nonsignificant chi-square values indicate very little unexplained variance and thus good model fit. Conversely, a significant chi-square value indicates that a significant amount of the variance is unexplained.

When the saturated model is not available (a common case), deviance is calculated simply as −2·(log likelihood of the fitted model), and the reference to the saturated model's log likelihood can be removed from all that follows without harm.
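The deviance formula above can be sketched numerically. In this hedged example the log-likelihood values are hypothetical placeholders, not results from a real fit; it only illustrates that the log-ratio form reduces to a difference of log-likelihoods and is always non-negative.

```python
import math

# Hypothetical log-likelihoods (assumed values for illustration only).
loglik_fitted = -27.9     # ln(likelihood of the fitted model)
loglik_saturated = -21.4  # ln(likelihood of the saturated model)

# D = -2 * ln(L_fitted / L_saturated) = -2 * (ln L_fitted - ln L_saturated)
D = -2.0 * (loglik_fitted - loglik_saturated)

# Because the saturated model fits at least as well as any fitted model,
# the log-ratio is <= 0, so the leading -2 makes D >= 0.
assert D >= 0.0
```

Here the saturated model's log-likelihood is higher (closer to zero) than the fitted model's, so the deviance comes out positive, matching the sign convention in the equation above.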
Two measures of deviance are particularly important in logistic regression: null deviance and model deviance. The null deviance represents the difference between a model with only the intercept (which means "no predictors") and the saturated model. The model deviance represents the difference between a model with at least one predictor and the saturated model.<ref name=Cohen/> In this respect, the null model provides a baseline upon which to compare predictor models. Given that deviance is a measure of the difference between a given model and the saturated model, smaller values indicate better fit. Thus, to assess the contribution of a predictor or set of predictors, one can subtract the model deviance from the null deviance and assess the difference on a <math>\chi^2_{s-p}</math> chi-square distribution with [[Degrees of freedom (statistics)|degrees of freedom]]<ref name=Hosmer/> equal to the difference in the number of parameters estimated. Let

:<math>\begin{align}
D_{\text{null}} &= -2\ln \frac{\text{likelihood of null model}}{\text{likelihood of the saturated model}} \\[6pt]
D_{\text{fitted}} &= -2\ln \frac{\text{likelihood of fitted model}}{\text{likelihood of the saturated model}}.
\end{align}</math>

Then the difference of the two is:

:<math>\begin{align}
D_\text{null} - D_\text{fitted} &= -2 \left(\ln \frac{\text{likelihood of null model}}{\text{likelihood of the saturated model}} - \ln \frac{\text{likelihood of fitted model}}{\text{likelihood of the saturated model}}\right) \\[6pt]
&= -2 \ln \frac{\left( \dfrac{\text{likelihood of null model}}{\text{likelihood of the saturated model}}\right)}{\left(\dfrac{\text{likelihood of fitted model}}{\text{likelihood of the saturated model}}\right)} \\[6pt]
&= -2 \ln \frac{\text{likelihood of the null model}}{\text{likelihood of fitted model}}.
\end{align}</math>

If the model deviance is significantly smaller than the null deviance then one can conclude that the predictor or set of predictors significantly improves the model's fit. This is analogous to the {{mvar|F}}-test used in linear regression analysis to assess the significance of prediction.<ref name=Cohen/>
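The cancellation derived above can be sketched in a few lines. This is a hedged illustration with hypothetical log-likelihoods (not from any real dataset); it shows that the saturated-model term drops out of the null-minus-fitted difference, and compares the resulting statistic to a standard chi-square critical value.

```python
# Hypothetical log-likelihoods for illustration only (assumed values).
loglik_null = -34.6    # intercept-only (null) model
loglik_fitted = -27.9  # model with one predictor

# D_null - D_fitted = -2 * ln(L_null / L_fitted);
# the saturated model's likelihood cancels out of the ratio.
lr_statistic = -2.0 * (loglik_null - loglik_fitted)

# Degrees of freedom = difference in number of estimated parameters
# (here, one extra slope parameter in the fitted model).
df = 1
critical_value = 3.841  # chi-square critical value, alpha = 0.05, df = 1
significant = lr_statistic > critical_value
```

With these placeholder values the statistic is 13.4, well above the 0.05 critical value for one degree of freedom, so the (hypothetical) predictor would be judged to significantly improve fit.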