=== As explained variance ===
A larger value of ''R''<sup>2</sup> implies a more successful regression model.<ref name=Devore/>{{rp|463}} Suppose {{nowrap|1=''R''<sup>2</sup> = 0.49}}. This implies that 49% of the variability of the dependent variable in the data set has been accounted for, and the remaining 51% of the variability is still unaccounted for. For regression models, the regression sum of squares, also called the [[explained sum of squares]], is defined as

: <math>SS_\text{reg}=\sum_i (f_i -\bar{y})^2</math>

In some cases, as in [[simple linear regression]], the [[total sum of squares]] equals the sum of the two other sums of squares defined above:

: <math>SS_\text{res}+SS_\text{reg}=SS_\text{tot}</math>

See [[Explained sum of squares#Partitioning in the general ordinary least squares model|Partitioning in the general OLS model]] for a derivation of this result for one case where the relation holds. When this relation does hold, the above definition of ''R''<sup>2</sup> is equivalent to

: <math>R^2 = \frac{SS_\text{reg}}{SS_\text{tot}} = \frac{SS_\text{reg}/n}{SS_\text{tot}/n}</math>

where ''n'' is the number of observations (cases) on the variables. In this form ''R''<sup>2</sup> is expressed as the ratio of the [[explained variation|explained variance]] (variance of the model's predictions, which is {{nowrap|''SS''<sub>reg</sub> / ''n''}}) to the total variance (sample variance of the dependent variable, which is {{nowrap|''SS''<sub>tot</sub> / ''n''}}). This partition of the sum of squares holds, for instance, when the model values ''f''<sub>''i''</sub> have been obtained by [[linear regression]].
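The partition above can be checked numerically. The following minimal sketch (plain Python, with illustrative data values chosen for this example) fits a simple linear regression by the closed-form least-squares estimates and verifies that ''SS''<sub>res</sub> + ''SS''<sub>reg</sub> = ''SS''<sub>tot</sub>, so that the two expressions for ''R''<sup>2</sup> coincide:

```python
def fit_ols(x, y):
    """Least-squares intercept and slope for the model y ~ a + b*x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
        sum((xi - xbar) ** 2 for xi in x)
    a = ybar - b * xbar
    return a, b

# Illustrative data (not from the article)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 9.9]

a, b = fit_ols(x, y)
f = [a + b * xi for xi in x]                  # model values f_i
ybar = sum(y) / len(y)

ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, f))
ss_reg = sum((fi - ybar) ** 2 for fi in f)
ss_tot = sum((yi - ybar) ** 2 for yi in y)

# For OLS simple linear regression the partition holds:
print(abs(ss_res + ss_reg - ss_tot) < 1e-9)
# ...so SS_reg/SS_tot and 1 - SS_res/SS_tot give the same R^2:
print(ss_reg / ss_tot, 1.0 - ss_res / ss_tot)
```

Note that for models fitted by other criteria (or without an intercept), the partition can fail, and the two candidate formulas for ''R''<sup>2</sup> then disagree.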
A milder [[sufficient condition]] reads as follows: The model has the form

: <math>f_i=\widehat\alpha+\widehat\beta q_i</math>

where the ''q''<sub>''i''</sub> are arbitrary values that may or may not depend on ''i'' or on other free parameters (the common choice ''q''<sub>''i''</sub> = ''x''<sub>''i''</sub> is just one special case), and the coefficient estimates <math>\widehat\alpha</math> and <math>\widehat\beta</math> are obtained by minimizing the residual sum of squares. This set of conditions is an important one, and it has a number of implications for the properties of the fitted [[Errors and residuals in statistics|residuals]] and the modelled values. In particular, under these conditions:

: <math>\bar{f}=\bar{y}.\,</math>
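The property <math>\bar{f}=\bar{y}</math> can also be checked directly. In this sketch (plain Python, illustrative data), the regressor is a deliberately non-standard choice ''q''<sub>''i''</sub> = ''x''<sub>''i''</sub><sup>2</sup> rather than ''x''<sub>''i''</sub> itself; because the coefficients are still obtained by minimizing the residual sum of squares, the mean of the fitted values equals the mean of the observations:

```python
def fit_ls(q, y):
    """Least-squares alpha, beta for the model f_i = alpha + beta * q_i."""
    n = len(q)
    qbar, ybar = sum(q) / n, sum(y) / n
    beta = sum((qi - qbar) * (yi - ybar) for qi, yi in zip(q, y)) / \
           sum((qi - qbar) ** 2 for qi in q)
    alpha = ybar - beta * qbar
    return alpha, beta

# Illustrative data (not from the article)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 9.9]
q = [xi ** 2 for xi in x]      # arbitrary q_i: here q_i = x_i^2

alpha, beta = fit_ls(q, y)
f = [alpha + beta * qi for qi in q]

# Mean of fitted values equals mean of observed values (fbar == ybar)
print(abs(sum(f) / len(f) - sum(y) / len(y)) < 1e-9)
```

The equality follows algebraically from the normal equation <math>\widehat\alpha=\bar{y}-\widehat\beta\bar{q}</math>: averaging the fitted values gives <math>\bar{f}=\widehat\alpha+\widehat\beta\bar{q}=\bar{y}</math>.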