Gradient descent
==Solution of a non-linear system==
Gradient descent can also be used to solve a system of [[nonlinear equation]]s. The example below shows how to use gradient descent to solve for three unknown variables, ''x''<sub>1</sub>, ''x''<sub>2</sub>, and ''x''<sub>3</sub>. It works through one iteration of gradient descent.

Consider the nonlinear system of equations
:<math> \begin{cases} 3x_1-\cos(x_2x_3)-\tfrac{3}{2} =0 \\ 4x_1^2-625x_2^2+2x_2-1 = 0 \\ \exp(-x_1x_2)+20x_3+\tfrac{10\pi-3}{3} =0 \end{cases}</math>

Let us introduce the associated function
:<math>G(\mathbf{x}) = \begin{bmatrix} 3x_1-\cos(x_2x_3)-\tfrac{3}{2} \\ 4x_1^2-625x_2^2+2x_2-1 \\ \exp(-x_1x_2)+20x_3+\tfrac{10\pi-3}{3} \\ \end{bmatrix}, </math>
where
:<math> \mathbf{x} =\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \end{bmatrix}.</math>

One might now define the objective function
:<math>\begin{align}F(\mathbf{x}) &= \frac{1}{2} G^\mathrm{T}(\mathbf{x}) G(\mathbf{x}) \\&=\frac{1}{2} \left[ \left (3x_1-\cos(x_2x_3)-\frac{3}{2} \right)^2 + \left(4x_1^2-625x_2^2+2x_2-1 \right)^2 +\right.\\ &{}\qquad\left. \left(\exp(-x_1x_2) + 20x_3 + \frac{10\pi-3}{3} \right)^2 \right],\end{align}</math>
which we will attempt to minimize.
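The definitions of <math>G</math> and <math>F</math> above can be written out directly in code. The following is a minimal sketch in plain Python; the function names <code>G</code> and <code>F</code> and the list representation of <math>\mathbf{x}</math> are choices made for this illustration only.

```python
import math


def G(x):
    """Residuals of the three equations; x is a sequence (x1, x2, x3)."""
    x1, x2, x3 = x
    return [
        3 * x1 - math.cos(x2 * x3) - 1.5,
        4 * x1**2 - 625 * x2**2 + 2 * x2 - 1,
        math.exp(-x1 * x2) + 20 * x3 + (10 * math.pi - 3) / 3,
    ]


def F(x):
    """Objective F(x) = 1/2 * G(x)^T G(x), i.e. half the sum of squared residuals."""
    return 0.5 * sum(g * g for g in G(x))


# Evaluate the objective at an arbitrary test point (here the origin).
print(F([0.0, 0.0, 0.0]))
```

Because <math>F</math> is half a sum of squares, it is non-negative and equals zero exactly where all three equations hold, which is why minimizing it is a way to look for a solution of the system.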

As an initial guess, let us use
:<math> \mathbf{x}^{(0)}= \mathbf{0} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ \end{bmatrix}.</math>

We know that
:<math>\mathbf{x}^{(1)}=\mathbf{0}-\gamma_0 \nabla F(\mathbf{0}) = \mathbf{0}-\gamma_0 J_G(\mathbf{0})^\mathrm{T} G(\mathbf{0}),</math>
where the [[Jacobian matrix]] <math>J_G</math> is given by
:<math>J_G(\mathbf{x}) = \begin{bmatrix} 3 & \sin(x_2x_3)x_3 & \sin(x_2x_3)x_2 \\ 8x_1 & -1250x_2+2 & 0 \\ -x_2\exp{(-x_1x_2)} & -x_1\exp(-x_1x_2) & 20\\ \end{bmatrix}.</math>

We calculate:
:<math>J_G(\mathbf{0}) = \begin{bmatrix} 3 & 0 & 0\\ 0 & 2 & 0\\ 0 & 0 & 20 \end{bmatrix}, \qquad G(\mathbf{0}) = \begin{bmatrix} -2.5\\ -1\\ 10.472 \end{bmatrix}.</math>

Thus
:<math>\mathbf{x}^{(1)}= \mathbf{0}-\gamma_0 \begin{bmatrix} -7.5\\ -2\\ 209.44 \end{bmatrix},</math>
and
:<math>F(\mathbf{0}) = 0.5 \left( (-2.5)^2 + (-1)^2 + (10.472)^2 \right) = 58.456.</math>

[[File:Gradient Descent Example Nonlinear Equations.gif|thumb|right|350px|An animation showing the first 83 iterations of gradient descent applied to this example. Surfaces are [[isosurface]]s of <math>F(\mathbf{x}^{(n)})</math> at current guess <math>\mathbf{x}^{(n)}</math>, and arrows show the direction of descent. Due to a small and constant step size, the convergence is slow.]]

Now, a suitable <math>\gamma_0</math> must be found such that
:<math>F\left (\mathbf{x}^{(1)}\right ) \le F\left (\mathbf{x}^{(0)}\right ) = F(\mathbf{0}).</math>
This can be done with any of a variety of [[line search]] algorithms.
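As an illustration of the gradient computation and of one possible line search, the sketch below evaluates <math>J_G(\mathbf{0})^\mathrm{T} G(\mathbf{0})</math> and then picks <math>\gamma_0</math> by backtracking with an Armijo sufficient-decrease test. Backtracking is only one of many line search methods and is chosen here as an assumption; the helper names (<code>jacobian</code>, <code>grad_F</code>, <code>backtracking_step</code>) and the parameters <code>shrink</code>, <code>c</code> and <code>max_halvings</code> are hypothetical and not part of the example above.

```python
import math


def G(x):
    """Residuals of the three equations; x is a sequence (x1, x2, x3)."""
    x1, x2, x3 = x
    return [
        3 * x1 - math.cos(x2 * x3) - 1.5,
        4 * x1**2 - 625 * x2**2 + 2 * x2 - 1,
        math.exp(-x1 * x2) + 20 * x3 + (10 * math.pi - 3) / 3,
    ]


def F(x):
    """Objective F(x) = 1/2 * G(x)^T G(x)."""
    return 0.5 * sum(g * g for g in G(x))


def jacobian(x):
    """Jacobian J_G(x), written out row by row as in the text."""
    x1, x2, x3 = x
    s = math.sin(x2 * x3)
    e = math.exp(-x1 * x2)
    return [
        [3.0, s * x3, s * x2],
        [8 * x1, -1250 * x2 + 2, 0.0],
        [-x2 * e, -x1 * e, 20.0],
    ]


def grad_F(x):
    """Gradient of F, computed as J_G(x)^T G(x)."""
    J, g = jacobian(x), G(x)
    return [sum(J[i][j] * g[i] for i in range(3)) for j in range(3)]


def backtracking_step(x, gamma=1.0, shrink=0.5, c=1e-4, max_halvings=60):
    """Shrink gamma until F decreases sufficiently (Armijo condition)."""
    d = grad_F(x)
    f0 = F(x)
    sq_norm = sum(di * di for di in d)
    for _ in range(max_halvings):
        x_new = [xi - gamma * di for xi, di in zip(x, d)]
        if F(x_new) <= f0 - c * gamma * sq_norm:
            return gamma, x_new
        gamma *= shrink
    return 0.0, list(x)  # no acceptable step found; stay at x


x0 = [0.0, 0.0, 0.0]
print(grad_F(x0))  # compare with the vector (-7.5, -2, 209.44) above
gamma0, x1 = backtracking_step(x0)
print(gamma0, F(x0), F(x1))
```

Any step accepted by this test satisfies <math>F(\mathbf{x}^{(1)}) \le F(\mathbf{x}^{(0)})</math> by construction, at the cost of several extra evaluations of <math>F</math> per iteration.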

One might also simply guess <math>\gamma_0=0.001,</math> which gives
:<math> \mathbf{x}^{(1)}=\begin{bmatrix} 0.0075 \\ 0.002 \\ -0.20944 \\ \end{bmatrix}.</math>

Evaluating the objective function at this value yields
:<math>F \left (\mathbf{x}^{(1)}\right ) = 0.5 \left ((-2.48)^2 + (-1.00)^2 + (6.28)^2 \right ) = 23.306.</math>

The drop from <math>F(\mathbf{0})=58.456</math> to the next step's value of
:<math> F\left (\mathbf{x}^{(1)}\right ) =23.306 </math>
is a sizable decrease in the objective function. Further steps would reduce its value further until an approximate solution to the system is found.
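The numbers in this step can be checked with a few lines of code. The sketch below, in plain Python, takes the single step with the guessed <math>\gamma_0=0.001</math> from the text; the function names are chosen for illustration only.

```python
import math


def G(x):
    """Residuals of the three equations; x is a sequence (x1, x2, x3)."""
    x1, x2, x3 = x
    return [
        3 * x1 - math.cos(x2 * x3) - 1.5,
        4 * x1**2 - 625 * x2**2 + 2 * x2 - 1,
        math.exp(-x1 * x2) + 20 * x3 + (10 * math.pi - 3) / 3,
    ]


def F(x):
    """Objective F(x) = 1/2 * G(x)^T G(x)."""
    return 0.5 * sum(g * g for g in G(x))


def grad_F(x):
    """Gradient of F, computed as J_G(x)^T G(x) with the Jacobian from the text."""
    x1, x2, x3 = x
    s = math.sin(x2 * x3)
    e = math.exp(-x1 * x2)
    J = [
        [3.0, s * x3, s * x2],
        [8 * x1, -1250 * x2 + 2, 0.0],
        [-x2 * e, -x1 * e, 20.0],
    ]
    g = G(x)
    return [sum(J[i][j] * g[i] for i in range(3)) for j in range(3)]


gamma0 = 0.001  # the guessed step size from the text
x0 = [0.0, 0.0, 0.0]
x1 = [xi - gamma0 * di for xi, di in zip(x0, grad_F(x0))]

print("x1    =", [round(v, 5) for v in x1])
print("G(x1) =", [round(v, 2) for v in G(x1)])
print("F(x0) =", round(F(x0), 3))
print("F(x1) =", round(F(x1), 3))
```

Repeating the update with a freshly evaluated gradient at each new point gives the further steps mentioned above. A fixed step size is not guaranteed to keep decreasing <math>F</math>, so in practice each step is usually checked or chosen by a line search.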