{{Short description|Optimization algorithm}}
{{About||the analytical method called "steepest descent"|Method of steepest descent}}
{{Machine learning}}
[[File:Gradient Descent in 2D.webm|thumb|right|Gradient descent in 2D]]

'''Gradient descent''' is a method for unconstrained [[mathematical optimization]]. It is a [[:Category:First order methods|first-order]] [[Iterative algorithm|iterative]] [[algorithm]] for minimizing a [[differentiable function|differentiable]] [[multivariate function]]. The idea is to take repeated steps in the direction opposite to the [[gradient]] (or approximate gradient) of the function at the current point, because this is the direction of steepest descent. Conversely, stepping in the direction of the gradient leads to a trajectory that maximizes the function; that procedure is known as ''gradient ascent''. Gradient descent is particularly useful in machine learning for minimizing the cost or loss function.<ref name="auto">{{Cite book |last1=Boyd |first1=Stephen |url=http://dx.doi.org/10.1017/cbo9780511804441 |title=Convex Optimization |last2=Vandenberghe |first2=Lieven |date=2004-03-08 |publisher=Cambridge University Press |doi=10.1017/cbo9780511804441 |isbn=978-0-521-83378-3}}</ref> It should not be confused with [[Local search (optimization)|local search]] algorithms, although both are [[Iterative method|iterative methods]] for [[Global optimization|optimization]].

Gradient descent is generally attributed to [[Augustin-Louis Cauchy]], who first suggested it in 1847.<ref>{{cite book|vauthors=((Lemaréchal, C.)) | title=Optimization Stories | veditors=((Grötschel, M.)) | date=1 January 2012 | chapter=Cauchy and the gradient method | series=Documenta Mathematica Series | publisher=EMS Press | volume=6 | edition=1st | pages=251–254 | chapter-url=https://www.math.uni-bielefeld.de/documenta/vol-ismp/40_lemarechal-claude.pdf| doi=10.4171/dms/6/27 | doi-access=free | isbn=978-3-936609-58-5|access-date=2020-01-26 |archive-date=2018-12-29 |archive-url=https://web.archive.org/web/20181229073335/https://www.math.uni-bielefeld.de/documenta/vol-ismp/40_lemarechal-claude.pdf |url-status=dead}}</ref> [[Jacques Hadamard]] independently proposed a similar method in 1907.<ref>{{Cite journal|last=Hadamard|first=Jacques|date=1908|title=Mémoire sur le problème d'analyse relatif à l'équilibre des plaques élastiques encastrées|journal=Mémoires présentés par divers savants étrangers à l'Académie des Sciences de l'Institut de France|volume=33}}</ref><ref>{{cite journal |last1=Courant |first1=R. |title=Variational methods for the solution of problems of equilibrium and vibrations |journal=Bulletin of the American Mathematical Society |date=1943 |volume=49 |issue=1 |pages=1–23 |doi=10.1090/S0002-9904-1943-07818-4 |doi-access=free }}</ref> Its convergence properties for non-linear optimization problems were first studied by [[Haskell Curry]] in 1944,<ref>{{cite journal |first=Haskell B. |last=Curry |title=The Method of Steepest Descent for Non-linear Minimization Problems |journal=Quart. Appl. Math. |volume=2 |year=1944 |issue=3 |pages=258–261 |doi=10.1090/qam/10667 |doi-access=free }}</ref> and the method became increasingly well studied and widely used in the following decades.<ref name="BP" /><ref name="AK82" /> A simple extension of gradient descent, [[stochastic gradient descent]], serves as the most basic algorithm used for training most [[deep neural network|deep network]]s today.
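The repeated step against the gradient can be sketched in a few lines of Python. The quadratic objective <math>F(x, y) = x^2 + 2y^2</math>, the starting point, and the fixed step size used below are illustrative assumptions chosen for this sketch, not details taken from the sources cited above.

<syntaxhighlight lang="python">
# Minimal sketch of gradient descent with a fixed step size:
# x_{k+1} = x_k - gamma * grad F(x_k).
# The objective, starting point, and step size are illustrative assumptions.
import numpy as np

def gradient_descent(grad, x0, step_size=0.1, num_steps=100):
    """Repeatedly step in the direction opposite to the gradient."""
    x = np.asarray(x0, dtype=float)
    for _ in range(num_steps):
        x = x - step_size * grad(x)  # move along the direction of steepest descent
    return x

# Hypothetical example: F(x, y) = x**2 + 2*y**2, with gradient (2x, 4y).
grad_F = lambda x: np.array([2.0 * x[0], 4.0 * x[1]])
minimum = gradient_descent(grad_F, x0=[3.0, -2.0])
print(minimum)  # approaches the true minimizer (0, 0)
</syntaxhighlight>

With these choices each iterate contracts toward the minimizer <math>(0, 0)</math>; in practice the step size is usually tuned or chosen by a line search rather than fixed in advance.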