Gauss–Markov theorem
=== Remark ===
A proof that the OLS estimator indeed ''minimizes'' the sum of squared residuals may proceed by calculating the [[Hessian matrix]] and showing that it is positive definite.

The sum-of-squares function we want to minimize is
<math display="block">f(\beta_0,\beta_1,\dots,\beta_p) = \sum_{i=1}^n (y_i-\beta_0-\beta_1x_{i1}-\dots-\beta_px_{ip})^2</math>
for a multiple regression model with ''p'' variables. Setting its gradient to zero gives the first-order condition
<math display="block">\begin{aligned} \frac{d}{d\boldsymbol{\beta}}f &= -2X^\operatorname{T} \left(\mathbf{y}-X\boldsymbol{\beta}\right)\\ &=-2\begin{bmatrix} \sum_{i=1}^{n} (y_i - \dots - \beta_px_{ip})\\ \sum_{i=1}^nx_{i1} (y_i-\dots-\beta_px_{ip})\\ \vdots\\ \sum_{i=1}^nx_{ip} (y_i-\dots-\beta_px_{ip}) \end{bmatrix}\\ &= \mathbf{0}_{p+1}, \end{aligned}</math>
where <math>X</math> is the design matrix
<math display="block">X=\begin{bmatrix} 1 & x_{11} & \cdots & x_{1p}\\ 1 & x_{21} & \cdots & x_{2p}\\ &&\vdots\\ 1 & x_{n1} & \cdots & x_{np} \end{bmatrix}\in \R^{n\times(p+1)}; \qquad n\geq p+1</math>

The [[Hessian matrix]] of second derivatives is
<math display="block">\mathcal{H} = 2\begin{bmatrix} n & \sum_{i=1}^n x_{i1} & \cdots & \sum_{i=1}^n x_{ip} \\ \sum_{i=1}^n x_{i1}& \sum_{i=1}^n x_{i1}^2 & \cdots & \sum_{i=1}^nx_{i1}x_{ip}\\ \vdots & \vdots &\ddots & \vdots \\ \sum_{i=1}^n x_{ip} & \sum_{i=1}^n x_{ip}x_{i1}& \cdots & \sum_{i=1}^n x_{ip}^2 \end{bmatrix} = 2X^\operatorname{T}X</math>

Assume the columns of <math>X</math> are linearly independent, so that <math>X^\operatorname{T} X</math> is invertible. Writing <math>X=\begin{bmatrix}\mathbf{v}_1& \mathbf{v}_2& \cdots & \mathbf{v}_{p+1}\end{bmatrix}</math> in terms of its columns, linear independence means
<math display="block">k_1\mathbf{v}_1 + \dots + k_{p+1} \mathbf{v}_{p+1} = \mathbf 0\iff k_1= \dots =k_{p+1}=0</math>
Now let <math>\mathbf{k} = (k_1,\dots,k_{p+1})^\operatorname{T} \in \R^{(p+1)\times 1}</math> be an eigenvector of <math>\mathcal{H}</math>.
Since an eigenvector is nonzero, linear independence of the columns gives
<math display="block">\mathbf{k} \ne \mathbf{0} \implies \left\|k_1\mathbf{v}_1+\dots+k_{p+1}\mathbf{v}_{p+1}\right\|^2 > 0</math>
In terms of matrix multiplication, and recalling that <math>\mathcal{H} = 2X^\operatorname{T}X</math>, this means
<math display="block">\begin{bmatrix} k_1 & \cdots & k_{p+1} \end{bmatrix} \begin{bmatrix}\mathbf{v}_1^\operatorname{T} \\ \vdots \\ \mathbf{v}_{p+1}^\operatorname{T}\end{bmatrix} \begin{bmatrix}\mathbf{v}_1 & \cdots & \mathbf{v}_{p+1}\end{bmatrix} \begin{bmatrix}k_1 \\ \vdots\\ k_{p+1}\end{bmatrix} = \mathbf{k}^\operatorname{T}X^\operatorname{T}X\mathbf{k} = \tfrac{1}{2}\mathbf{k}^\operatorname{T}\mathcal{H}\mathbf{k} = \tfrac{\lambda}{2} \mathbf{k}^\operatorname{T}\mathbf{k}>0</math>
where <math>\lambda</math> is the eigenvalue corresponding to <math>\mathbf{k}</math>. Moreover,
<math display="block">\mathbf{k}^\operatorname{T}\mathbf{k} = \sum_{i=1}^{p+1}k_i^2 > 0 \implies \lambda > 0</math>
Finally, as the eigenvector <math>\mathbf{k}</math> was arbitrary, all eigenvalues of <math>\mathcal{H}</math> are positive, and therefore the symmetric matrix <math>\mathcal{H}</math> is positive definite. Because the Hessian is constant and positive definite, <math>f</math> is strictly convex, so the stationary point
<math display="block">\boldsymbol{\beta} = \left(X^\operatorname{T}X\right)^{-1}X^\operatorname{T}\mathbf{y}</math>
is indeed the unique global minimum.

Alternatively, observe that for all vectors <math>\mathbf{v}</math>, <math>\mathbf{v}^\operatorname{T} X^\operatorname{T} X \mathbf{v} = \|X\mathbf{v}\|^2 \ge 0</math>, with equality only if <math>X\mathbf{v} = \mathbf{0}</math>. If <math>X</math> has full column rank, this forces <math>\mathbf{v} = \mathbf{0}</math>, so the Hessian <math>2X^\operatorname{T}X</math> is positive definite.
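
The argument can be checked numerically on a small example. The following is a minimal sketch, not part of the proof: the data and the helper names (<code>design_matrix</code>, <code>gram</code>, <code>hessian</code>, <code>is_positive_definite</code>, <code>quad_form</code>, <code>sq_norm</code>) are made up for illustration. It uses exact rational arithmetic and the standard fact that a symmetric matrix is positive definite if and only if Gaussian elimination without row exchanges produces only positive pivots.

```python
from fractions import Fraction

def design_matrix(rows):
    """Prepend the intercept column of ones to each row of regressors."""
    return [[1] + list(row) for row in rows]

def gram(X):
    """Return X^T X as a list of lists."""
    cols = list(zip(*X))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]

def hessian(X):
    """Hessian of the sum of squared residuals: H = 2 X^T X."""
    return [[2 * g for g in row] for row in gram(X)]

def is_positive_definite(A):
    """Exact test for a symmetric matrix: positive definite iff every
    pivot of Gaussian elimination (no row exchanges) is positive."""
    M = [[Fraction(a) for a in row] for row in A]
    n = len(M)
    for k in range(n):
        if M[k][k] <= 0:
            return False
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            for j in range(k, n):
                M[i][j] -= factor * M[k][j]
    return True

def quad_form(A, v):
    """Return v^T A v."""
    n = len(v)
    return sum(v[i] * A[i][j] * v[j] for i in range(n) for j in range(n))

def sq_norm(X, v):
    """Return ||X v||^2."""
    return sum(sum(x * c for x, c in zip(row, v)) ** 2 for row in X)

# Hypothetical data: n = 4 observations, p = 2 regressors.
X_full = design_matrix([(1, 2), (2, 1), (3, 5), (4, 3)])
# Second regressor is twice the first, so the columns are linearly dependent.
X_deficient = design_matrix([(1, 2), (2, 4), (3, 6), (4, 8)])

H_full = hessian(X_full)
H_deficient = hessian(X_deficient)
v = [0, 2, -1]  # nonzero vector with X_deficient v = 0

print(is_positive_definite(H_full))
print(is_positive_definite(H_deficient))
print(quad_form(gram(X_full), v), sq_norm(X_full, v))
print(sq_norm(X_deficient, v))
```

With this made-up data, the full-rank design passes the positive-definiteness check, while the rank-deficient design fails it because the nonzero vector <code>v</code> satisfies <math>X\mathbf{v} = \mathbf{0}</math>. The third line of output illustrates the identity <math>\mathbf{v}^\operatorname{T} X^\operatorname{T} X \mathbf{v} = \|X\mathbf{v}\|^2</math> on one vector.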