Chain rule
== Applications ==
[[File:Chain rule en.png|thumb|upright=1.6|The chain rule in case of composites of more than two functions]]

=== Composites of more than two functions ===
The chain rule can be applied to composites of more than two functions. To take the derivative of a composite of more than two functions, notice that the composite of {{mvar|f}}, {{mvar|g}}, and {{mvar|h}} (in that order) is the composite of {{mvar|f}} with {{math|''g'' ∘ ''h''}}. The chain rule states that to compute the derivative of {{math|''f'' ∘ ''g'' ∘ ''h''}}, it is sufficient to compute the derivative of {{mvar|f}} and the derivative of {{math|''g'' ∘ ''h''}}. The derivative of {{mvar|f}} can be calculated directly, and the derivative of {{math|''g'' ∘ ''h''}} can be calculated by applying the chain rule again.{{citation needed|date=November 2023}}

For concreteness, consider the function
<math display="block">y = e^{\sin (x^2)}.</math>
This can be decomposed as the composite of three functions:
<math display="block">\begin{align}
y &= f(u) = e^u, \\
u &= g(v) = \sin v, \\
v &= h(x) = x^2,
\end{align}</math>
so that <math> y = f(g(h(x))) </math>. Their derivatives are:
<math display="block">\begin{align}
\frac{dy}{du} &= f'(u) = e^u, \\
\frac{du}{dv} &= g'(v) = \cos v, \\
\frac{dv}{dx} &= h'(x) = 2x.
\end{align}</math>
The chain rule states that the derivative of their composite at the point {{math|1=''x'' = ''a''}} is:
<math display="block">\begin{align}
(f \circ g \circ h)'(a) & = f'((g \circ h)(a)) \cdot (g \circ h)'(a) \\
& = f'((g \circ h)(a)) \cdot g'(h(a)) \cdot h'(a) \\
& = (f' \circ g \circ h)(a) \cdot (g' \circ h)(a) \cdot h'(a).
\end{align}</math>
In [[Leibniz's notation]], this is:
<math display="block">\frac{dy}{dx} = \left.\frac{dy}{du}\right|_{u=g(h(a))}\cdot\left.\frac{du}{dv}\right|_{v=h(a)}\cdot\left.\frac{dv}{dx}\right|_{x=a},</math>
or for short,
<math display="block">\frac{dy}{dx} = \frac{dy}{du}\cdot\frac{du}{dv}\cdot\frac{dv}{dx}.</math>
The derivative function is therefore:
<math display="block">\frac{dy}{dx} = e^{\sin(x^2)}\cdot\cos(x^2)\cdot 2x.</math>

Another way of computing this derivative is to view the composite function {{math|''f'' ∘ ''g'' ∘ ''h''}} as the composite of {{math|''f'' ∘ ''g''}} and {{mvar|h}}. Applying the chain rule in this manner yields:
<math display="block">\begin{align}
(f \circ g \circ h)'(a) &= (f \circ g)'(h(a)) \cdot h'(a) \\
&= f'(g(h(a))) \cdot g'(h(a)) \cdot h'(a).
\end{align}</math>
This is the same as what was computed above. This should be expected because {{math|1=(''f'' ∘ ''g'') ∘ ''h'' = ''f'' ∘ (''g'' ∘ ''h'')}}.

Sometimes, it is necessary to differentiate an arbitrarily long composition of the form <math>f_1 \circ f_2 \circ \cdots \circ f_{n-1} \circ f_n\!</math>. In this case, define
<math display="block">f_{a\,.\,.\,b} = f_{a} \circ f_{a+1} \circ \cdots \circ f_{b-1} \circ f_{b}</math>
where <math>f_{a\,.\,.\,a} = f_a</math> and <math>f_{a\,.\,.\,b}(x) = x</math> when <math>b < a</math>.
Then the chain rule takes the form
<math display="block">\begin{align}
Df_{1\,.\,.\,n} &= (Df_1 \circ f_{2\,.\,.\,n}) (Df_2 \circ f_{3\,.\,.\,n}) \cdots (Df_{n-1} \circ f_{n\,.\,.\,n}) Df_n \\
&= \prod_{k=1}^n \left[Df_k \circ f_{(k+1)\,.\,.\,n}\right]
\end{align}</math>
or, in the Lagrange notation,
<math display="block">\begin{align}
f_{1\,.\,.\,n}'(x) &= f_1' \left( f_{2\,.\,.\,n}(x) \right) \; f_2' \left( f_{3\,.\,.\,n}(x) \right) \cdots f_{n-1}' \left(f_{n\,.\,.\,n}(x)\right) \; f_n'(x) \\[1ex]
&= \prod_{k=1}^{n} f_k' \left(f_{(k+1)\,.\,.\,n}(x) \right).
\end{align}</math>

=== Quotient rule ===
{{See also|Quotient rule}}
The chain rule can be used to derive some well-known differentiation rules. For example, the quotient rule is a consequence of the chain rule and the [[product rule]]. To see this, write the function {{math|''f''(''x'')/''g''(''x'')}} as the product {{math|''f''(''x'') · 1/''g''(''x'')}}. First apply the product rule:
<math display="block">\begin{align}
\frac{d}{dx}\left(\frac{f(x)}{g(x)}\right) &= \frac{d}{dx}\left(f(x)\cdot\frac{1}{g(x)}\right) \\
&= f'(x)\cdot\frac{1}{g(x)} + f(x)\cdot\frac{d}{dx}\left(\frac{1}{g(x)}\right).
\end{align}</math>
To compute the derivative of {{math|1/''g''(''x'')}}, notice that it is the composite of {{mvar|g}} with the reciprocal function, that is, the function that sends {{mvar|x}} to {{math|1/''x''}}. The derivative of the reciprocal function is <math>-1/x^2\!</math>. By applying the chain rule, the last expression becomes:
<math display="block">f'(x)\cdot\frac{1}{g(x)} + f(x)\cdot\left(-\frac{1}{g(x)^2}\cdot g'(x)\right) = \frac{f'(x) g(x) - f(x) g'(x)}{g(x)^2},</math>
which is the usual formula for the quotient rule.

=== Derivatives of inverse functions ===
{{Main|Inverse functions and differentiation}}
Suppose that {{math|1=''y'' = ''g''(''x'')}} has an [[inverse function]]. Call its inverse function {{mvar|f}} so that we have {{math|1=''x'' = ''f''(''y'')}}.
There is a formula for the derivative of {{mvar|f}} in terms of the derivative of {{mvar|g}}. To see this, note that {{mvar|f}} and {{mvar|g}} satisfy the formula
<math display="block">f(g(x)) = x.</math>
Because the functions <math>f(g(x))</math> and {{mvar|x}} are equal, their derivatives must be equal. The derivative of {{mvar|x}} is the constant function with value 1, and the derivative of <math>f(g(x))</math> is determined by the chain rule. Therefore, we have that:
<math display="block">f'(g(x)) g'(x) = 1.</math>
To express {{mvar|f'}} as a function of an independent variable {{mvar|y}}, we substitute <math>f(y)</math> for {{mvar|x}} wherever it appears. Then we can solve for {{mvar|f'}}.
<math display="block">\begin{align}
f'(g(f(y))) g'(f(y)) &= 1 \\
f'(y) g'(f(y)) &= 1 \\
f'(y) &= \frac{1}{g'(f(y))}.
\end{align}</math>
For example, consider the function {{math|1=''g''(''x'') = ''e''<sup>''x''</sup>}}. It has an inverse {{math|1=''f''(''y'') = ln ''y''}}. Because {{math|1=''g''′(''x'') = ''e''<sup>''x''</sup>}}, the above formula says that
<math display="block">\frac{d}{dy}\ln y = \frac{1}{e^{\ln y}} = \frac{1}{y}.</math>

This formula is true whenever {{mvar|g}} is differentiable and its inverse {{mvar|f}} is also differentiable. This formula can fail when one of these conditions is not true. For example, consider {{math|1=''g''(''x'') = ''x''<sup>3</sup>}}. Its inverse is {{math|1=''f''(''y'') = ''y''<sup>1/3</sup>}}, which is not differentiable at zero. If we attempt to use the above formula to compute the derivative of {{mvar|f}} at zero, then we must evaluate {{math|1=1/''g''′(''f''(0))}}. Since {{math|1=''f''(0) = 0}} and {{math|1=''g''′(0) = 0}}, we must evaluate 1/0, which is undefined. Therefore, the formula fails in this case. This is not surprising because {{mvar|f}} is not differentiable at zero.
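
The inverse-function formula can be checked numerically. The following Python sketch is only an illustration, not part of any cited source: the helper names <code>numerical_derivative</code>, <code>inverse_derivative</code> and <code>cube_root</code> are hypothetical, and the sample point {{math|1=''y'' = 2.5}} is an arbitrary choice. It compares {{math|1/''g''′(''f''(''y''))}} with a central-difference estimate for the logarithm, and shows that the cube-root case leads to a division by zero.

```python
import math


def numerical_derivative(func, x, h=1e-6):
    """Central-difference approximation of func'(x) (illustrative helper)."""
    return (func(x + h) - func(x - h)) / (2 * h)


def inverse_derivative(g_prime, f, y):
    """Derivative of the inverse f at y, using f'(y) = 1 / g'(f(y))."""
    return 1.0 / g_prime(f(y))


# g(x) = e^x has derivative e^x and inverse f(y) = ln y.
y = 2.5  # arbitrary sample point, chosen for illustration
by_formula = inverse_derivative(math.exp, math.log, y)  # 1 / e^(ln y)
by_difference = numerical_derivative(math.log, y)


def cube_root(t):
    """Real cube root, valid for negative arguments as well."""
    return math.copysign(abs(t) ** (1.0 / 3.0), t)


# g(x) = x^3 has g'(f(0)) = 0, so the formula asks for 1/0 at y = 0.
try:
    inverse_derivative(lambda x: 3 * x ** 2, cube_root, 0.0)
    formula_failed = False
except ZeroDivisionError:
    formula_failed = True
```

Here <code>by_formula</code> and <code>by_difference</code> should both be close to {{math|1/''y''}}, while <code>formula_failed</code> records that the formula has no value at zero for the cube root.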

=== Back propagation ===
The chain rule forms the basis of the [[back propagation]] algorithm, which computes the gradients used in [[gradient descent]] training of [[neural network (machine learning)|neural networks]] in [[deep learning]] ([[artificial intelligence]]).<ref>{{citation|title=Deep learning|first1=Ian|last1=Goodfellow|authorlink1=Ian Goodfellow|first2=Yoshua|last2=Bengio|authorlink2=Yoshua Bengio|first3=Aaron|last3=Courville|year=2016|publisher=MIT}}, pp. 197–217.</ref>
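
As a scalar toy illustration of the idea, and not the algorithm as presented in the cited reference, the following Python sketch differentiates the composite <math>e^{\sin(x^2)}</math> by a forward pass that stores the input of each stage, followed by a backward pass that multiplies the local derivatives from the outermost function inward. The names <code>stages</code> and <code>derivative_of_composite</code> are hypothetical. Real neural networks use vector-valued layers, where the local derivatives are [[Jacobian matrix and determinant|Jacobian matrices]] rather than numbers.

```python
import math

# Each stage is a pair (function, derivative), listed outermost first,
# so the composite is f(g(h(x))) with f = exp, g = sin, h = square.
stages = [
    (math.exp, math.exp),                # f(u) = e^u,   f'(u) = e^u
    (math.sin, math.cos),                # g(v) = sin v, g'(v) = cos v
    (lambda x: x * x, lambda x: 2 * x),  # h(x) = x^2,   h'(x) = 2x
]


def derivative_of_composite(stages, x):
    """Return (value, derivative) of the composite of `stages` at x."""
    # Forward pass: evaluate the innermost stage first, recording the
    # input that each stage receives.
    inputs = []
    value = x
    for func, _ in reversed(stages):
        inputs.append(value)
        value = func(value)

    # Backward pass: start at the outermost stage and multiply the
    # local derivatives, each evaluated at its recorded input.
    grad = 1.0
    for (_, dfunc), stage_input in zip(stages, reversed(inputs)):
        grad *= dfunc(stage_input)
    return value, grad


a = 1.3  # arbitrary evaluation point, chosen for illustration
value, grad = derivative_of_composite(stages, a)
closed_form = math.exp(math.sin(a ** 2)) * math.cos(a ** 2) * 2 * a
```

The value of <code>grad</code> should agree with <code>closed_form</code>, the product <math>e^{\sin(a^2)}\cdot\cos(a^2)\cdot 2a</math> given by the chain rule for three functions.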