== Derivation from Newton's method ==

In what follows, the Gauss–Newton algorithm will be derived from [[Newton's method in optimization|Newton's method]] for function optimization via an approximation. As a consequence, the rate of convergence of the Gauss–Newton algorithm can be quadratic under certain regularity conditions. In general (under weaker conditions), the convergence rate is linear.<ref>{{Cite web |url=http://www.henley.ac.uk/web/FILES/maths/09-04.pdf |title=Archived copy |access-date=2014-04-25 |archive-url=https://web.archive.org/web/20160804022151/http://www.henley.ac.uk/web/FILES/maths/09-04.pdf |archive-date=2016-08-04 |url-status=dead }}</ref>

The recurrence relation for Newton's method for minimizing a function ''S'' of parameters <math>\boldsymbol\beta</math> is
<math display="block"> \boldsymbol\beta^{(s+1)} = \boldsymbol\beta^{(s)} - \mathbf H^{-1} \mathbf g,</math>
where '''g''' denotes the [[gradient|gradient vector]] of ''S'', and '''H''' denotes the [[Hessian matrix]] of ''S''.

Since <math display="inline">S = \sum_{i=1}^m r_i^2</math>, the gradient is given by
<math display="block">g_j = 2 \sum_{i=1}^m r_i \frac{\partial r_i}{\partial \beta_j}.</math>

Elements of the Hessian are calculated by differentiating the gradient elements, <math>g_j</math>, with respect to <math>\beta_k</math>:
<math display="block">H_{jk} = 2 \sum_{i=1}^m \left(\frac{\partial r_i}{\partial \beta_j} \frac{\partial r_i}{\partial \beta_k} + r_i \frac{\partial^2 r_i}{\partial \beta_j \partial \beta_k}\right).</math>

The Gauss–Newton method is obtained by ignoring the second-order derivative terms (the second term in this expression). That is, the Hessian is approximated by
<math display="block">H_{jk} \approx 2 \sum_{i=1}^m J_{ij} J_{ik},</math>
where <math display="inline">J_{ij} = {\partial r_i}/{\partial \beta_j}</math> are entries of the Jacobian '''J<sub>r</sub>'''. Note that when the exact Hessian is evaluated near an exact fit, the residuals <math>r_i</math> are near zero, so the second term becomes near zero as well, which justifies the approximation.

The gradient and the approximate Hessian can be written in matrix notation as
<math display="block">\mathbf{g} = 2 {\mathbf{J}_\mathbf{r}}^\operatorname{T} \mathbf{r}, \quad \mathbf{H} \approx 2 {\mathbf{J}_\mathbf{r}}^\operatorname{T} \mathbf{J_r}.</math>

These expressions are substituted into the recurrence relation above to obtain the operational equations
<math display="block"> \boldsymbol{\beta}^{(s+1)} = \boldsymbol\beta^{(s)} + \Delta; \quad \Delta = -\left(\mathbf{J_r}^\operatorname{T} \mathbf{J_r}\right)^{-1} \mathbf{J_r}^\operatorname{T} \mathbf{r}.</math>

Convergence of the Gauss–Newton method is not guaranteed in all instances. The approximation
<math display="block">\left|r_i \frac{\partial^2 r_i}{\partial \beta_j \partial \beta_k}\right| \ll \left|\frac{\partial r_i}{\partial \beta_j} \frac{\partial r_i}{\partial \beta_k}\right|</math>
which must hold for the second-order derivative terms to be negligible, may be valid in two cases, for which convergence is to be expected:<ref>Nocedal (1999), p. 259.</ref>
# The function values <math>r_i</math> are small in magnitude, at least around the minimum.
# The functions are only "mildly" nonlinear, so that <math display="inline">\frac{\partial^2 r_i}{\partial \beta_j \partial \beta_k}</math> is relatively small in magnitude.
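For illustration, the operational equations above can be translated into a short numerical routine. The following Python sketch is illustrative only: the function names, the stopping tolerance, and the use of a least-squares solve in place of the explicit inverse <math>\left(\mathbf{J_r}^\operatorname{T} \mathbf{J_r}\right)^{-1}</math> are implementation choices, not part of the algorithm's definition.

<syntaxhighlight lang="python">
import numpy as np

def gauss_newton(r, J, beta0, max_iter=50, tol=1e-10):
    """Minimize S(beta) = sum_i r_i(beta)^2 with Gauss-Newton updates.

    r(beta)  -> residual vector of shape (m,)
    J(beta)  -> Jacobian of shape (m, n), with J[i, j] = d r_i / d beta_j
    beta0    -> initial parameter guess of shape (n,)
    """
    beta = np.asarray(beta0, dtype=float)
    for _ in range(max_iter):
        res = r(beta)
        jac = J(beta)
        # The Gauss-Newton increment solves the normal equations
        #   (J^T J) delta = -J^T r,
        # which is done here as the linear least-squares problem
        #   min_delta || jac @ delta + res ||.
        delta, *_ = np.linalg.lstsq(jac, -res, rcond=None)
        beta = beta + delta
        if np.linalg.norm(delta) < tol:
            break
    return beta
</syntaxhighlight>

Solving the increment as a linear least-squares problem is algebraically equivalent to applying <math>\Delta = -\left(\mathbf{J_r}^\operatorname{T} \mathbf{J_r}\right)^{-1} \mathbf{J_r}^\operatorname{T} \mathbf{r}</math>, but avoids forming the inverse explicitly, which is generally better conditioned numerically.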