=== Two types of automatic differentiation ===
Usually, two distinct modes of automatic differentiation are presented.
* '''forward accumulation''' (also called '''bottom-up''', '''forward mode''', or '''tangent mode''')
* '''reverse accumulation''' (also called '''top-down''', '''reverse mode''', or '''adjoint mode''')

Forward accumulation specifies that one traverses the chain rule from inside to outside (that is, first compute <math>\partial w_1/\partial x</math>, then <math>\partial w_2/\partial w_1</math>, and lastly <math>\partial y/\partial w_2</math>), while reverse accumulation traverses from outside to inside (first compute <math>\partial y/\partial w_2</math>, then <math>\partial w_2/\partial w_1</math>, and lastly <math>\partial w_1/\partial x</math>). More succinctly,
* forward accumulation computes the recursive relation <math>\frac{\partial w_i}{\partial x} = \frac{\partial w_i}{\partial w_{i-1}} \frac{\partial w_{i-1}}{\partial x}</math> with <math>w_3 = y</math>, and
* reverse accumulation computes the recursive relation <math>\frac{\partial y}{\partial w_i} = \frac{\partial y}{\partial w_{i+1}} \frac{\partial w_{i+1}}{\partial w_{i}}</math> with <math>w_0 = x</math>.

The value of the partial derivative, called the ''seed'', is propagated forward or backward and is initially <math>\frac{\partial x}{\partial x}=1</math> or <math>\frac{\partial y}{\partial y}=1</math>.

Forward accumulation evaluates the function and calculates the derivative with respect to one independent variable in one pass. For each independent variable <math>x_1,x_2,\dots,x_n</math> a separate pass is therefore necessary, in which the seed for that variable is set to one (<math>\frac{\partial x_1}{\partial x_1}=1</math>) and the seeds for all other variables are set to zero (<math>\frac{\partial x_2}{\partial x_1}= \dots = \frac{\partial x_n}{\partial x_1} = 0</math>).
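The forward-accumulation pass described above can be sketched with dual numbers: each intermediate value <math>w_i</math> carries its derivative <math>\partial w_i/\partial x</math> alongside it, and arithmetic operations propagate both by the chain rule. This is an illustrative sketch, not code from any particular AD library; the function <code>f</code> is an arbitrary example.

```python
class Dual:
    """A value w_i paired with its derivative dw_i/dx for one seed variable."""

    def __init__(self, value, deriv=0.0):
        self.value = value  # w_i
        self.deriv = deriv  # dw_i/dx, propagated forward by the chain rule

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)


def f(x, y):
    # Example function f(x, y) = x*y + x^2
    return x * y + x * x


# One pass per independent variable: seed dx/dx = 1, dy/dx = 0
# to obtain df/dx together with the function value.
x = Dual(3.0, 1.0)
y = Dual(2.0, 0.0)
out = f(x, y)
print(out.value, out.deriv)  # f = 15.0, df/dx = y + 2x = 8.0
```

To obtain <math>\partial f/\partial y</math> as well, a second pass is needed with the seeds swapped (<code>x = Dual(3.0, 0.0)</code>, <code>y = Dual(2.0, 1.0)</code>), matching the one-sweep-per-input cost noted below.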
In contrast, reverse accumulation requires the evaluated partial functions for the partial derivatives. Reverse accumulation therefore evaluates the function first and calculates the derivatives with respect to all independent variables in an additional pass.

Which of these two types should be used depends on the sweep count. The [[Computational complexity theory|computational complexity]] of one sweep is proportional to the complexity of the original code.
* Forward accumulation is more efficient than reverse accumulation for functions {{math|''f'' : '''R'''<sup>''n''</sup> → '''R'''<sup>''m''</sup>}} with {{math|''n'' ≪ ''m''}}, as only {{math|''n''}} sweeps are necessary, compared to {{math|''m''}} sweeps for reverse accumulation.
* Reverse accumulation is more efficient than forward accumulation for functions {{math|''f'' : '''R'''<sup>''n''</sup> → '''R'''<sup>''m''</sup>}} with {{math|''n'' ≫ ''m''}}, as only {{math|''m''}} sweeps are necessary, compared to {{math|''n''}} sweeps for forward accumulation.

[[Backpropagation]] of errors in multilayer perceptrons, a technique used in [[machine learning]], is a special case of reverse accumulation.<ref name="baydin2018automatic" />

Forward accumulation was introduced by R.E. Wengert in 1964.<ref name="Wengert1964"/> According to Andreas Griewank, reverse accumulation has been suggested since the late 1960s, but the inventor is unknown.<ref name="grie2012">{{cite book |last=Griewank |first=Andreas |title=Optimization Stories |chapter=Who invented the reverse mode of differentiation? |year=2012 |series=Documenta Mathematica Series |volume=6 |pages=389–400 |doi=10.4171/dms/6/38 |doi-access=free |isbn=978-3-936609-58-5 |chapter-url=https://ftp.gwdg.de/pub/misc/EMIS/journals/DMJDMV/vol-ismp/52_griewank-andreas-b.pdf}}</ref> [[Seppo Linnainmaa]] published reverse accumulation in 1976.<ref name="lin1976">{{cite journal |last=Linnainmaa |first=Seppo |year=1976 |title=Taylor Expansion of the Accumulated Rounding Error |journal=BIT Numerical Mathematics |volume=16 |issue=2 |pages=146–160 |doi=10.1007/BF01931367 |s2cid=122357351}}</ref>
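The reverse-accumulation scheme described above can be sketched with a simple tape: a forward sweep evaluates the function and records each operation, then a reverse sweep seeds <math>\partial y/\partial y = 1</math> and applies the chain rule from outside to inside. This is an illustrative sketch only (the <code>Var</code> class and tape are hypothetical names, not any library's API); the example function is the same arbitrary <math>f(x, y) = xy + x^2</math>.

```python
tape = []  # backward closures, recorded in forward-evaluation order


class Var:
    """A value with an accumulator for dy/dw_i, filled in by the reverse sweep."""

    def __init__(self, value):
        self.value = value
        self.grad = 0.0  # dy/dw_i, accumulated backward

    def __add__(self, other):
        out = Var(self.value + other.value)

        def back():  # d(u+v)/du = d(u+v)/dv = 1
            self.grad += out.grad
            other.grad += out.grad

        tape.append(back)
        return out

    def __mul__(self, other):
        out = Var(self.value * other.value)

        def back():  # d(uv)/du = v, d(uv)/dv = u
            self.grad += other.value * out.grad
            other.grad += self.value * out.grad

        tape.append(back)
        return out


x = Var(3.0)
y = Var(2.0)
out = x * y + x * x  # forward sweep: evaluate f and record the tape

out.grad = 1.0  # seed dy/dy = 1
for back in reversed(tape):  # reverse sweep: chain rule outside-in
    back()

print(x.grad, y.grad)  # df/dx = y + 2x = 8.0, df/dy = x = 3.0
```

Note that a single reverse sweep yields the partial derivatives with respect to ''all'' inputs at once, which is why reverse accumulation needs only one sweep per output, as stated above.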