{{Calculus|Integral}}
{{short description|Technique in integral evaluation}}

In [[calculus]], '''integration by substitution''', also known as '''''u''-substitution''', '''reverse chain rule''' or '''change of variables''',<ref>{{harvnb|Swokowski|1983|p=257}}</ref> is a method for evaluating [[integral]]s and [[antiderivative]]s. It is the counterpart to the [[chain rule]] for [[derivative|differentiation]], and can loosely be thought of as using the chain rule "backwards." The manipulation of differentials on which the method relies can be made rigorous using [[differential forms]].

== Substitution for a single variable ==

=== Introduction (indefinite integrals) ===
Before stating the result [[mathematical rigor|rigorously]], consider a simple case using [[indefinite integral|indefinite integrals]].

Compute <math display="inline">\int(2x^3+1)^7(x^2)\,dx.</math><ref>{{harvnb|Swokowski|1983|p=258}}</ref>

Set <math>u=2x^3+1.</math> This means <math display="inline">\frac{du}{dx}=6x^2,</math> or as a [[differential form]], <math display="inline">du=6x^2\,dx.</math> Now: 
<math display="block">\begin{aligned}
    \int(2x^3 +1)^7(x^2)\,dx
    &= \frac{1}{6}\int\underbrace{(2x^3+1)^{7}}_{u^{7}}\underbrace{(6x^2)\,dx}_{du} \\
    &= \frac{1}{6}\int u^{7}\,du \\
    &= \frac{1}{6}\left(\frac{1}{8}u^{8}\right)+C \\
    &= \frac{1}{48}(2x^3+1)^{8}+C,
\end{aligned}</math>
where <math>C</math> is an arbitrary [[constant of integration]].

This procedure is frequently used, but not all integrals are of a form that permits its use. In any event, the result should be verified by differentiating and comparing to the original integrand.
<math display="block">\frac{d}{dx}\left[\frac{1}{48}(2x^3+1)^{8}+C\right] = \frac{1}{6}(2x^3+1)^{7}(6x^2) = (2x^3+1)^7(x^2).</math>
For definite integrals, the limits of integration must also be adjusted, but the procedure is mostly the same.

=== Statement for definite integrals ===
Let <math>g:[a,b]\to I</math> be a [[differentiable function]] with a [[continuous function|continuous]] derivative, where <math>I \subset \mathbb{R}</math> is an [[interval (mathematics)|interval]]. Suppose that <math>f:I\to\mathbb{R}</math> is a [[continuous function]]. Then:<ref>{{harvnb|Briggs|Cochran|2011|p=361}}</ref>
<math display="block">\int_a^b f(g(x))\cdot g'(x)\, dx = \int_{g(a)}^{g(b)} f(u)\ du. </math>

In Leibniz notation, the substitution <math>u=g(x)</math> yields: 
<math display="block">\frac{du}{dx} = g'(x).</math>
Working heuristically with [[infinitesimal]]s yields the equation
<math display="block">du = g'(x)\,dx,</math>
which suggests the substitution formula above.  (This equation may be put on a rigorous foundation by interpreting it as a statement about [[differential form]]s.)  One may view the method of integration by substitution as a partial justification of [[Leibniz's notation]] for integrals and derivatives.

The formula is used to transform one integral into another integral that is easier to compute. Thus, the formula can be read from left to right or from right to left in order to simplify a given integral. When read from left to right, it is sometimes known as '''''u''-substitution''' or '''''w''-substitution''': a new variable is defined to be the inner function of a [[function composition|composite]] function that appears multiplied by the derivative of that inner function. Reading from right to left is common in [[trigonometric substitution]], where the original variable is replaced with a [[trigonometric function]] of a new variable and the original [[differential of a function|differential]] with the differential of the trigonometric function.
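
As a rough numerical illustration of the formula (a sketch, not taken from the cited sources), the following Python snippet approximates both sides for <math>f(u)=u^7</math> and <math>g(x)=2x^3+1</math> on <math>[0,1],</math> the functions of the introductory example. The helper <code>simpson</code> and the number of subintervals are illustrative choices; both sides should agree with the exact value <math>\tfrac{1}{8}(3^8-1)=820</math> up to discretization error.

```python
def simpson(func, lo, hi, n=10_000):
    """Composite Simpson's rule with n subintervals (n is made even)."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        # Interior points alternate between weight 4 (odd i) and 2 (even i).
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3


def f(u):
    return u ** 7


def g(x):
    return 2 * x ** 3 + 1


def g_prime(x):
    return 6 * x ** 2


a, b = 0.0, 1.0

# Left-hand side: integral of f(g(x)) * g'(x) over [a, b].
lhs = simpson(lambda x: f(g(x)) * g_prime(x), a, b)

# Right-hand side: integral of f(u) over [g(a), g(b)] = [1, 3].
rhs = simpson(f, g(a), g(b))

exact = (3 ** 8 - 1) / 8  # antiderivative u**8 / 8 evaluated from 1 to 3

print(lhs, rhs, exact)
```

Note that only the limits and the integrand change between the two calls; the same quadrature routine is used for both integrals.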

=== Proof ===

Integration by substitution can be derived from the [[fundamental theorem of calculus]] as follows.  Let <math>f</math> and <math>g</math> be two functions satisfying the above hypotheses: <math>f</math> is continuous on <math>I</math> and <math>g'</math> is continuous, hence integrable, on the closed interval <math>[a,b]</math>.  Then the function <math>f(g(x))\cdot g'(x)</math> is continuous, and therefore also integrable, on <math>[a,b]</math>.  Hence the integrals
<math display="block">\int_a^b f(g(x))\cdot g'(x)\ dx</math>
and
<math display="block">\int_{g(a)}^{g(b)} f(u)\ du</math>
in fact exist, and it remains to show that they are equal.

Since <math>f</math> is continuous, it has an [[antiderivative]] <math>F</math>. The [[function composition|composite function]] <math>F \circ g</math> is then defined. Since <math>g</math> is differentiable, combining the [[chain rule]] and the definition of an antiderivative gives:
<math display="block">(F \circ g)'(x) = F'(g(x)) \cdot g'(x) = f(g(x)) \cdot g'(x).</math>

Applying the [[fundamental theorem of calculus]] twice gives:
<math display="block">\begin{align}
\int_a^b f(g(x)) \cdot g'(x)\ dx
&= \int_a^b (F \circ g)'(x)\ dx \\
&= (F \circ g)(b) - (F \circ g)(a) \\
&= F(g(b)) - F(g(a)) \\
&= \int_{g(a)}^{g(b)} f(u)\, du,
\end{align}</math>
which is the substitution rule.

=== Examples: Antiderivatives (indefinite integrals) ===

Substitution can be used to determine [[antiderivative]]s. One chooses a relation between <math>x</math> and <math>u,</math> determines the corresponding relation between <math>dx</math> and <math>du</math> by differentiating, and performs the substitutions. An antiderivative for the substituted function can hopefully be determined; the original substitution between <math>x</math> and <math>u</math> is then undone.

==== Example 1 ====
Consider the integral:
<math display="block">\int x \cos(x^2+1)\ dx.</math>
Make the substitution <math display="inline">u = x^{2} + 1</math> to obtain <math>du = 2x\ dx,</math> meaning <math display="inline">x\ dx = \frac{1}{2}\ du.</math> Therefore:
<math display="block">\begin{align}
\int x \cos(x^2+1) \,dx
&= \frac{1}{2} \int 2x \cos(x^2+1) \,dx \\[6pt]
&= \frac{1}{2} \int\cos u\,du \\[6pt]
&= \frac{1}{2}\sin u + C \\[6pt]
&= \frac{1}{2}\sin(x^2+1) + C,
\end{align}</math>
where <math>C</math> is an arbitrary [[constant of integration]].
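
The result can be checked by differentiation, as recommended in the introduction. The following Python sketch does this numerically with a central finite difference; the sample points and the step size <code>h</code> are arbitrary illustrative choices, and the check is only approximate.

```python
import math


def integrand(x):
    return x * math.cos(x ** 2 + 1)


def antiderivative(x):
    # Candidate antiderivative obtained by the substitution u = x**2 + 1
    # (the constant of integration is omitted; it does not affect the derivative).
    return 0.5 * math.sin(x ** 2 + 1)


def central_difference(func, x, h=1e-5):
    """Approximate func'(x) by a symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)


sample_points = [-2.0, -0.5, 0.0, 1.0, 1.7]
max_error = max(
    abs(central_difference(antiderivative, x) - integrand(x))
    for x in sample_points
)
print(max_error)
```

A small value of <code>max_error</code> is consistent with the antiderivative being correct, although a numerical check of this kind is not a proof.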

==== Example 2: Antiderivatives of tangent and cotangent ====

The [[tangent function]] can be integrated using substitution by expressing it in terms of the sine and cosine: <math>\tan x = \tfrac{\sin x}{\cos x}</math>.

Using the substitution <math>u = \cos x</math> gives <math>du = -\sin x\,dx</math> and
<math display="block">\begin{align}
   \int \tan x \,dx &= \int \frac{\sin x}{\cos x} \,dx \\
    &= \int -\frac{du}{u} \\
    &= -\ln \left|u\right| + C \\
    &= -\ln \left|\cos x\right| + C \\
    &= \ln \left|\sec x\right| + C.
 \end{align}</math>

The [[cotangent function]] can be integrated similarly by expressing it as <math>\cot x = \tfrac{\cos x}{\sin x}</math> and using the substitution <math>u = \sin{x}, du = \cos{x}\,dx</math>:
<math display="block">\begin{align}
   \int \cot x \,dx &= \int \frac{\cos x}{\sin x} \,dx \\
    &= \int \frac{du}{u} \\
    &= \ln \left|u\right| + C \\
    &= \ln \left|\sin x\right| + C.
 \end{align}</math>
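
Both results can be verified by differentiating, using <math display="inline">\tfrac{d}{dx}\ln\left|u\right| = \tfrac{u'}{u}.</math> On any interval where the denominator does not vanish:
<math display="block">\begin{align}
   \frac{d}{dx}\left[-\ln\left|\cos x\right| + C\right] &= -\frac{-\sin x}{\cos x} = \tan x, \\
   \frac{d}{dx}\left[\ln\left|\sin x\right| + C\right] &= \frac{\cos x}{\sin x} = \cot x.
 \end{align}</math>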

=== Examples: Definite integrals ===
When evaluating definite integrals by substitution, one may transform the limits of integration together with the variable, as in the examples below; in that case, there is no need to convert the result back to the original variable. Alternatively, one may fully evaluate the indefinite integral ([[Integration by substitution#Examples: Antiderivatives (indefinite integrals)|see above]]) first, then apply the original limits. The latter approach is especially handy when multiple substitutions are used.

==== Example 1 ====
Consider the integral:
<math display="block">\int_0^2 \frac{x}{\sqrt{x^2+1}} dx.</math>
Make the substitution <math display="inline">u = x^{2} + 1</math> to obtain <math>du = 2x\ dx,</math> meaning <math display="inline">x\ dx = \frac{1}{2}\ du.</math> Therefore:
<math display="block">\begin{align}
\int_{x=0}^{x=2} \frac{x}{\sqrt{x^2+1}} \ dx
&= \frac{1}{2} \int_{u=1}^{u=5} \frac{du}{\sqrt{u}} \\[6pt]
&= \frac{1}{2} \left(2\sqrt{5}-2\sqrt{1}\right) \\[6pt]
&= \sqrt{5}-1.
\end{align}</math>
Since the lower limit <math>x = 0</math> was replaced with <math>u = 1,</math> and the upper limit <math>x = 2</math> with <math>2^{2} + 1 = 5,</math> a transformation back into terms of <math>x</math> was unnecessary.
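
For comparison, the same integral can be evaluated by first finding an antiderivative in terms of <math>x</math> and keeping the original limits. The substitution <math display="inline">u = x^{2} + 1</math> gives <math display="inline">\int \frac{x}{\sqrt{x^2+1}}\,dx = \frac{1}{2}\int \frac{du}{\sqrt{u}} = \sqrt{u} + C = \sqrt{x^2+1} + C,</math> so:
<math display="block">\int_0^2 \frac{x}{\sqrt{x^2+1}}\,dx = \left[\sqrt{x^2+1}\right]_{x=0}^{x=2} = \sqrt{5} - 1,</math>
which agrees with the value obtained by transforming the limits.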

==== Example 2: [[Trigonometric substitution]] ====

For the integral
<math display="block">\int_0^1 \sqrt{1-x^2}\,dx,</math>
a variation of the above procedure is needed. The substitution <math>x = \sin u</math> implying <math>dx = \cos u \,du</math> is useful because <math display="inline">\sqrt{1-\sin^2 u} = \cos u.</math> We thus have:
<math display="block">\begin{align}
\int_0^1 \sqrt{1-x^2}\ dx
&= \int_0^{\pi/2} \sqrt{1-\sin^2 u} \cos u\ du \\[6pt]
&= \int_0^{\pi/2} \cos^2 u\ du \\[6pt]
&= \left[\frac{u}{2} + \frac{\sin(2u)}{4}\right]_0^{\pi/2} \\[6pt]
&= \frac{\pi}{4} + 0 \\[6pt]
&= \frac{\pi}{4}.
\end{align}</math>

The resulting integral can be computed using [[integration by parts]] or a [[List of trigonometric identities#Multiple-angle and half-angle formulae|double angle formula]], <math display="inline">2\cos^{2} u = 1 + \cos (2u),</math> followed by one more substitution. One can also note that the graph of the function being integrated is the upper right quarter of a circle with a radius of one, and hence the integral from zero to one equals the area of one quarter of the unit disk, or <math>\tfrac \pi 4.</math>
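
The value can also be checked numerically. The Python sketch below approximates both the original integral and the substituted one with a simple midpoint rule; the helper <code>midpoint_rule</code> and the number of subintervals are illustrative assumptions. Both approximations should be close to <math>\tfrac \pi 4,</math> with the substituted form typically the more accurate of the two because its integrand is smooth at the endpoints.

```python
import math


def midpoint_rule(func, lo, hi, n=10_000):
    """Approximate the integral of func over [lo, hi] with n midpoints."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))


# Original integral: sqrt(1 - x^2) over [0, 1].
original = midpoint_rule(lambda x: math.sqrt(1 - x * x), 0.0, 1.0)

# After the substitution x = sin(u): cos(u)^2 over [0, pi/2].
substituted = midpoint_rule(lambda u: math.cos(u) ** 2, 0.0, math.pi / 2)

print(original, substituted, math.pi / 4)
```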

== Substitution for multiple variables ==

One may also use substitution when integrating [[Multivariate function|functions of several variables]].

Here, the substitution function {{math|1=(''v''<sub>1</sub>,...,''v''<sub>''n''</sub>) = ''φ''(''u''<sub>1</sub>, ..., ''u''<sub>''n''</sub>)}} needs to be [[injective]] and continuously differentiable, and the differentials transform as:
<math display="block">dv_1 \cdots dv_n = \left|\det(D\varphi)(u_1, \ldots, u_n)\right| \, du_1 \cdots du_n,</math>
where {{math|det(''Dφ'')(''u''<sub>1</sub>, ..., ''u''<sub>''n''</sub>)}} denotes the [[determinant]] of the [[Jacobian matrix]] of [[partial derivative]]s of {{math|''φ''}} at the point {{math|(''u''<sub>1</sub>, ..., ''u''<sub>''n''</sub>)}}. This formula expresses the fact that the [[absolute value]] of the determinant of a matrix equals the volume of the [[Parallelepiped#Parallelotope|parallelotope]] spanned by its columns or rows.
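
As a standard illustration (a worked sketch, not taken from the cited sources), consider polar coordinates <math>(x, y) = \varphi(r, \theta) = (r\cos\theta, r\sin\theta)</math> on the open set <math>0 < r < 1,</math> <math>0 < \theta < 2\pi,</math> where <math>\varphi</math> is injective. Its Jacobian determinant is
<math display="block">\det(D\varphi)(r,\theta) = \det\begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} = r\cos^2\theta + r\sin^2\theta = r,</math>
so <math>dx\,dy = r\,dr\,d\theta.</math> The image of <math>\varphi</math> is the open unit disk with a radial segment removed. Assuming, as is usual, that this segment can be neglected because it has zero area:
<math display="block">\begin{align}
\iint_{x^2+y^2 \le 1} e^{-(x^2+y^2)}\,dx\,dy &= \int_0^{2\pi}\!\int_0^1 e^{-r^2}\,r\,dr\,d\theta \\
&= 2\pi\cdot\frac{1}{2}\int_0^1 e^{-w}\,dw \\
&= \pi\left(1-e^{-1}\right),
\end{align}</math>
where the inner integral was evaluated with the single-variable substitution <math>w = r^2.</math>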

More precisely, the ''[[change of variables]]'' formula is stated in the next theorem:

{{math theorem | math_statement = Let {{math|''U''}} be an open set in {{math|'''R'''<sup>''n''</sup>}} and {{math|''φ'' : ''U'' → '''R'''<sup>''n''</sup>}} an [[Injective function|injective]] differentiable function with continuous partial derivatives, the Jacobian of which is nonzero for every {{mvar|x}} in {{mvar|U}}.  Then for any real-valued, compactly supported, continuous function {{mvar|f}}, with support contained in {{math|''φ''(''U'')}}:
<math display="block">\int_{\varphi(U)} f(\mathbf{v})\, d\mathbf{v} = \int_U f(\varphi(\mathbf{u})) \,\,\left|\!\det(D\varphi)(\mathbf{u})\right| \,d\mathbf{u}.</math>
}}

The conditions on the theorem can be weakened in various ways. First, the requirement that {{mvar|φ}} be continuously differentiable can be replaced by the weaker assumption that {{mvar|φ}} be merely differentiable and have a continuous inverse.<ref>{{harvnb|Rudin|1987|loc=Theorem 7.26}}</ref> This is guaranteed to hold if {{mvar|φ}} is continuously differentiable by the [[inverse function theorem]].  Alternatively, the requirement that {{math|det(''Dφ'') ≠ 0}} can be eliminated by applying [[Sard's theorem]].<ref>{{harvnb|Spivak|1965|p=72}}</ref>

For Lebesgue measurable functions, the theorem can be stated in the following form:<ref>{{harvnb|Fremlin|2010|loc=Theorem 263D}}</ref>

{{math theorem | math_statement = Let {{mvar|U}} be a measurable subset of {{math|'''R'''<sup>''n''</sup>}} and {{math|''φ'' : ''U'' → '''R'''<sup>''n''</sup>}} an [[injective function]], and suppose for every {{mvar|x}} in {{mvar|U}} there exists {{math|''φ''&prime;(''x'')}} in {{math|'''R'''<sup>''n'',''n''</sup>}} such that {{math|1=''φ''(''y'') = ''φ''(''x'') + ''φ&prime;''(''x'')(''y'' − ''x'') + ''o''({{norm|''y'' − ''x''}})}} as {{math|''y'' → ''x''}} (here {{mvar|o}} is [[Landau symbol#Related asymptotic notations|little-''o'' notation]]). Then {{math|''φ''(''U'')}} is measurable, and for any real-valued function {{mvar|f}} defined on {{math|''φ''(''U'')}}:
<math display="block">\int_{\varphi(U)} f(v)\, dv = \int_U f(\varphi(u)) \,\,\left|\!\det \varphi'(u)\right| \,du</math>
in the sense that if either integral exists (including the possibility of being properly infinite), then so does the other one, and they have the same value.}}

Another very general version in [[measure theory]] is the following:<ref>{{harvnb|Hewitt|Stromberg|1965|loc=Theorem 20.3}}</ref>

{{math theorem | math_statement =  Let {{mvar|X}} be a [[locally compact]] [[Hausdorff space]] equipped with a finite [[Radon measure]] {{mvar|μ}}, and let {{mvar|Y}} be a [[Σ-compact space|σ-compact]] Hausdorff space with a [[sigma finite measure|&sigma;-finite]] Radon measure {{mvar|ρ}}.  Let {{math|''φ'' : ''X'' → ''Y''}} be an [[absolutely continuous]] function (where the latter means that {{math|1=''ρ''(''φ''(''E'')) = 0}} whenever {{math|1=''μ''(''E'') = 0}}). Then there exists a real-valued [[Borel algebra|Borel measurable function]] {{mvar|w}} on {{mvar|X}} such that for every [[Lebesgue integral|Lebesgue integrable]] function {{math|''f'' : ''Y'' → '''R'''}}, the function {{math|(''f'' ∘ ''φ'') ⋅ ''w''}} is Lebesgue integrable on {{mvar|X}}, and
<math display="block">\int_Y f(y)\,d\rho(y) = \int_X (f\circ \varphi)(x)\,w(x)\,d\mu(x).</math>
Furthermore, it is possible to write
<math display="block">w(x) = (g\circ \varphi)(x)</math>
for some Borel measurable function {{mvar|g}} on {{mvar|Y}}.}}

In [[geometric measure theory]], integration by substitution is used with [[Lipschitz function]]s. A bi-Lipschitz function is a Lipschitz function {{math|''φ'' : ''U'' → '''R'''<sup>''n''</sup>}} which is injective and whose inverse function {{math|''φ''<sup>&minus;1</sup> : ''φ''(''U'') → ''U''}} is also Lipschitz.  By [[Rademacher's theorem]], a bi-Lipschitz mapping is differentiable [[almost everywhere]]. In particular, the Jacobian determinant of a bi-Lipschitz mapping {{math|det ''Dφ''}} is well-defined almost everywhere.  The following result then holds:

{{math theorem | math_statement = Let {{mvar|U}} be an open subset of {{math|'''R'''<sup>''n''</sup>}} and {{math|''φ'' : ''U'' → '''R'''<sup>''n''</sup>}} be a bi-Lipschitz mapping.  Let {{math|''f'' : ''φ''(''U'') → '''R'''}} be measurable.  Then
<math display="block">\int_{\varphi(U)} f(x)\,dx = \int_U (f\circ \varphi)(x) \,\,\left|\!\det D\varphi(x)\right|\,dx</math>
in the sense that if either integral exists (or is properly infinite), then so does the other one, and they have the same value.}}

The above theorem was first proposed by [[Euler]] when he developed the notion of [[double integrals]] in 1769. Although generalized to triple integrals by [[Lagrange]] in 1773, and used by [[Adrien-Marie Legendre|Legendre]], [[Laplace]], and [[Gauss]], and first generalized to {{mvar|n}} variables by [[Mikhail Ostrogradsky]] in 1836, it resisted a fully rigorous formal proof for a surprisingly long time, and was first satisfactorily resolved 125 years later, by [[Élie Cartan]] in a series of papers beginning in the mid-1890s.<ref>{{harvnb|Katz|1982}}</ref><ref>{{harvnb|Ferzola|1994}}</ref>

==Application in probability==

Substitution can be used to answer the following important question in probability: given a random variable {{mvar|X}} with probability density {{math|''p''<sub>''X''</sub>}} and another random variable {{mvar|Y}} such that {{math|1=''Y'' = ''ϕ''(''X'')}} for an [[Injective function|injective]] (one-to-one) function {{mvar|ϕ}}, what is the probability density for {{mvar|Y}}?

It is easiest to answer this question by first answering a slightly different question: what is the probability that {{mvar|Y}} takes a value in some particular subset {{mvar|S}}? Denote this probability {{math|''P''(''Y'' &isin; ''S'').}}  Of course, if {{mvar|Y}} has probability density {{math|''p''<sub>''Y''</sub>}}, then the answer is:
<math display="block">P(Y \in S) = \int_S p_Y(y)\,dy,</math>
but this is not really useful because we do not know {{math|''p''<sub>''Y''</sub>;}} it is what we are trying to find. We can make progress by considering the problem in the variable {{mvar|X}}. {{mvar|Y}} takes a value in {{mvar|S}} whenever {{mvar|X}} takes a value in <math display="inline">\phi^{-1}(S),</math> so:
<math display="block">P(Y \in S) = P(X \in \phi^{-1}(S)) = \int_{\phi^{-1}(S)} p_X(x)\,dx.</math>

Changing from variable {{mvar|x}} to {{mvar|y}} gives:
<math display="block">P(Y \in S) = \int_{\phi^{-1}(S)} p_X(x)\,dx = \int_S p_X(\phi^{-1}(y)) \left|\frac{d\phi^{-1}}{dy}\right|\,dy.</math>
Combining this with our first equation gives:
<math display="block">\int_S p_Y(y)\,dy = \int_S p_X(\phi^{-1}(y)) \left|\frac{d\phi^{-1}}{dy}\right|\,dy,</math>
so:
<math display="block">p_Y(y) = p_X(\phi^{-1}(y)) \left|\frac{d\phi^{-1}}{dy}\right|.</math>
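
As a hedged illustration of this formula, suppose {{mvar|X}} is a standard normal variable and <math>Y = e^X,</math> so that <math>\phi^{-1}(y) = \ln y</math> and <math display="inline">\left|\tfrac{d\phi^{-1}}{dy}\right| = \tfrac{1}{y}</math> for <math>y > 0.</math> The Python sketch below compares the probability that <math>1 \le Y \le e</math> computed from the transformed density with the same probability computed directly from {{mvar|X}}. The choice of distribution, the interval, and the helper names are assumptions made for the example.

```python
import math


def p_x(x):
    """Density of the standard normal variable X."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)


def p_y(y):
    """Density of Y = exp(X): p_X(phi^{-1}(y)) * |d phi^{-1}/dy| with phi^{-1} = ln."""
    return p_x(math.log(y)) / y


def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


def midpoint_rule(func, lo, hi, n=10_000):
    """Approximate the integral of func over [lo, hi] with n midpoints."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))


lo, hi = 1.0, math.e

# P(lo <= Y <= hi) by integrating the transformed density p_Y.
via_density = midpoint_rule(p_y, lo, hi)

# The same probability expressed through X: P(ln lo <= X <= ln hi).
via_x = normal_cdf(math.log(hi)) - normal_cdf(math.log(lo))

print(via_density, via_x)
```

The two numbers should agree up to the quadrature error, reflecting the identity <math display="inline">P(Y \in S) = P(X \in \phi^{-1}(S))</math> used in the derivation.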

In the case where {{mvar|X}} and {{mvar|Y}} depend on several uncorrelated variables (i.e., <math display="inline">p_X=p_X(x_1, \ldots, x_n)</math> and <math>y=\phi(x)</math>), <math>p_Y</math> can be found by substitution in several variables discussed above. The result is:
<math display="block">p_Y(y) = p_X(\phi^{-1}(y)) \left|\det D\phi ^{-1}(y) \right|.</math>

==See also==
{{Portal|Mathematics}}
*[[Probability density function]]
*[[Substitution of variables]]
*[[Trigonometric substitution]]
*[[Weierstrass substitution]]
*[[Euler substitution]]
*[[Glasser's master theorem]]
*[[Pushforward measure]]

==Notes==
{{reflist|30em}}

==References==
{{refbegin}}
*{{citation|first1=William|last1=Briggs|first2=Lyle|last2=Cochran|year=2011|title=Calculus: Early Transcendentals|edition=Single Variable|publisher=Addison-Wesley|isbn=978-0-321-66414-3}}
*{{citation|first=Anthony P.|last=Ferzola|url=http://mathdl.maa.org/mathDL/22/?pa=content&sa=viewDocument&nodeId=2688|title=Euler and differentials|journal=[[The College Mathematics Journal]]|volume=25|issue=2|year=1994|pages=102&ndash;111|doi=10.2307/2687130|jstor=2687130|access-date=2008-12-24|archive-date=2012-11-07|archive-url=https://web.archive.org/web/20121107082150/http://mathdl.maa.org/mathDL/22/?pa=content&sa=viewDocument&nodeId=2688|url-status=dead|url-access=subscription}}
* {{citation|first=D.H.|last=Fremlin|title=Measure Theory, Volume 2|publisher=Torres Fremlin|year=2010|isbn=978-0-9538129-7-4}}.
* {{citation|first1=Edwin|last1=Hewitt|first2=Karl|last2=Stromberg|author-link1=Edwin Hewitt|title=Real and Abstract Analysis|publisher=Springer-Verlag|year=1965|isbn=978-0-387-04559-7}}.
* {{citation|first=V.|last=Katz|title=Change of variables in multiple integrals: Euler to Cartan|journal=[[Mathematics Magazine]]|volume=55|year=1982|pages=3&ndash;11|doi=10.2307/2689856|issue=1|jstor=2689856}}
* {{citation|first=Walter|last=Rudin|author-link=Walter Rudin|title=Real and Complex Analysis|publisher=McGraw-Hill|year=1987|isbn=978-0-07-054234-1}}.
* {{citation|first=Earl W.|last=Swokowski|title=Calculus with analytic geometry|edition=alternate|year=1983|publisher=Prindle, Weber & Schmidt|isbn=0-87150-341-7}}
* {{citation|first=Michael|last=Spivak|author-link=Michael Spivak|title=Calculus on Manifolds|publisher=Westview Press|year=1965|isbn=978-0-8053-9021-6}}.
{{refend}}

==External links==
{{Wikibooks|Calculus|Integration#The_Substitution_Rule|The Substitution Rule}}
{{Wikiversity|Integration by Substitution}}
* [https://www.encyclopediaofmath.org/index.php/Integration_by_substitution Integration by substitution] at [[Encyclopedia of Mathematics]]
* [https://www.encyclopediaofmath.org/index.php/Area_formula Area formula] at [[Encyclopedia of Mathematics]]

{{Calculus topics}}
{{Integrals}}

[[Category:Articles containing proofs]]
[[Category:Integral calculus]]

[[es:Métodos de integración#Método de integración por sustitución]]