Mean squared error
===Estimator===
The MSE of an estimator <math>\hat{\theta}</math> with respect to an unknown parameter <math>\theta</math> is defined as<ref name=":1" />
:<math>\operatorname{MSE}(\hat{\theta})=\operatorname{E}_{\theta}\left[(\hat{\theta}-\theta)^2\right].</math>

Because this definition depends on the unknown parameter, the MSE is an ''a priori'' property of an estimator. The MSE may be a function of unknown parameters, in which case any ''estimator'' of the MSE based on estimates of these parameters is a function of the data (and thus a random variable). If the estimator <math>\hat{\theta}</math> is derived as a sample statistic and is used to estimate some population parameter, then the expectation is taken with respect to the [[sampling distribution]] of the sample statistic.

The MSE can be written as the sum of the [[variance]] of the estimator and the squared [[Bias_of_an_estimator|bias]] of the estimator. This provides a useful way to calculate the MSE and implies that, for unbiased estimators, the MSE and the variance are equal.<ref name="wackerly">{{cite book |first1=Dennis |last1=Wackerly |first2=William|last2=Mendenhall |first3=Richard L.|last3=Scheaffer |title=Mathematical Statistics with Applications |publisher=Thomson Higher Education|location=Belmont, CA, USA |year=2008 |edition=7 |isbn=978-0-495-38508-0}}</ref>
:<math>\operatorname{MSE}(\hat{\theta})=\operatorname{Var}_{\theta}(\hat{\theta})+ \operatorname{Bias}(\hat{\theta},\theta)^2.</math>

====Proof of variance and bias relationship====
<math>\begin{align} \operatorname{MSE}(\hat{\theta}) &= \operatorname{E}_\theta \left [(\hat{\theta}-\theta)^2 \right ] \\
&= \operatorname{E}_\theta\left[\left(\hat{\theta}-\operatorname{E}_\theta [\hat\theta]+\operatorname{E}_\theta[\hat\theta]-\theta\right)^2\right]\\
&= \operatorname{E}_\theta\left[\left(\hat{\theta}-\operatorname{E}_\theta[\hat\theta]\right)^2 +2\left (\hat{\theta}-\operatorname{E}_\theta[\hat\theta] \right ) \left (\operatorname{E}_\theta[\hat\theta]-\theta \right )+\left( \operatorname{E}_\theta[\hat\theta]-\theta \right)^2\right] \\
&= \operatorname{E}_\theta\left[\left(\hat{\theta}-\operatorname{E}_\theta[\hat\theta]\right)^2\right]+\operatorname{E}_\theta\left[2 \left (\hat{\theta}-\operatorname{E}_\theta[\hat\theta] \right ) \left (\operatorname{E}_\theta[\hat\theta]-\theta \right ) \right] + \operatorname{E}_\theta\left [ \left(\operatorname{E}_\theta[\hat\theta]-\theta\right)^2 \right] \\
&= \operatorname{E}_\theta\left[\left(\hat{\theta}-\operatorname{E}_\theta[\hat\theta]\right)^2\right]+ 2 \left(\operatorname{E}_\theta[\hat\theta]-\theta\right) \operatorname{E}_\theta\left[\hat{\theta}-\operatorname{E}_\theta[\hat\theta] \right] + \left(\operatorname{E}_\theta[\hat\theta]-\theta\right)^2 && \operatorname{E}_\theta[\hat\theta]-\theta = \text{constant} \\
&= \operatorname{E}_\theta\left[\left(\hat{\theta}-\operatorname{E}_\theta[\hat\theta]\right)^2\right]+ 2 \left(\operatorname{E}_\theta [\hat\theta]-\theta\right) \left ( \operatorname{E}_\theta[\hat{\theta}]-\operatorname{E}_\theta[\hat\theta] \right )+ \left(\operatorname{E}_\theta[\hat\theta]-\theta\right)^2 && \operatorname{E}_\theta[\hat\theta] = \text{constant} \\
&= \operatorname{E}_\theta\left[\left(\hat\theta-\operatorname{E}_\theta[\hat\theta]\right)^2\right]+\left(\operatorname{E}_\theta [\hat\theta]-\theta\right)^2\\
&= \operatorname{Var}_\theta(\hat\theta)+ \operatorname{Bias}(\hat\theta,\theta)^2 \end{align}</math>

A shorter proof uses the well-known identity that, for a random variable <math display="inline">X</math>, <math display="inline">\mathbb{E}(X^2) = \operatorname{Var}(X) + (\mathbb{E}(X))^2</math>.
Substituting <math display="inline">\hat\theta-\theta</math> for <math display="inline">X</math> gives
:<math display="block">\begin{aligned} \operatorname{MSE}(\hat{\theta}) &= \mathbb{E}[(\hat\theta-\theta)^2] \\ &= \operatorname{Var}(\hat{\theta} - \theta) + (\mathbb{E}[\hat\theta - \theta])^2 \\ &= \operatorname{Var}(\hat\theta) + \operatorname{Bias}(\hat\theta,\theta)^2 \end{aligned}</math>
where the last step holds because <math display="inline">\theta</math> is a constant, so subtracting it does not change the variance.

In practical modeling settings, the MSE can be described as the sum of model variance, squared model bias, and irreducible uncertainty (see [[Bias–variance tradeoff]]). Because the MSE combines information about both the variance and the bias of an estimator, the MSEs of competing estimators can be compared directly to assess their relative [[Efficiency (statistics)|efficiency]]. This is called the MSE criterion.
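
The decomposition and the MSE criterion can be checked numerically. The following is a minimal Python sketch, not taken from the cited sources, that computes the MSE, variance, and bias of two estimators of a binomial success probability exactly, by summing over the sampling distribution. The function names (<code>mse_decomposition</code>, <code>sample_proportion</code>, <code>shrunk_proportion</code>) and the choice of ten trials with success probability 0.3 are illustrative assumptions.

```python
from math import comb


def binomial_pmf(n, p):
    """Return P(K = k) for K ~ Binomial(n, p), k = 0..n."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def mse_decomposition(estimator, n, p):
    """Exact (mse, variance, bias) of estimator(k, n) as an estimate of p.

    The expectations are taken over the sampling distribution of k,
    the number of successes in n independent trials.
    """
    pmf = binomial_pmf(n, p)
    values = [estimator(k, n) for k in range(n + 1)]
    mean = sum(w * v for w, v in zip(pmf, values))
    mse = sum(w * (v - p) ** 2 for w, v in zip(pmf, values))
    variance = sum(w * (v - mean) ** 2 for w, v in zip(pmf, values))
    bias = mean - p
    return mse, variance, bias


def sample_proportion(k, n):
    """Unbiased estimator: the observed fraction of successes."""
    return k / n


def shrunk_proportion(k, n):
    """Biased estimator (illustrative): add-one smoothing toward 1/2."""
    return (k + 1) / (n + 2)


if __name__ == "__main__":
    n, p = 10, 0.3  # hypothetical example values
    for name, est in [("sample", sample_proportion), ("shrunk", shrunk_proportion)]:
        mse, var, bias = mse_decomposition(est, n, p)
        # mse and var + bias**2 agree up to floating-point rounding
        print(name, mse, var + bias**2, bias)
```

For these example values the sample proportion is unbiased, so its MSE equals its variance, while the smoothed estimator is biased but has the smaller MSE. Under the MSE criterion it would therefore be preferred at this particular value of the parameter, although the ranking can change for other values.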