==Statements==
For functions of a single [[Variable (mathematics)|variable]], the theorem states that if <math>f</math> is a [[continuously differentiable]] function with nonzero derivative at the point <math>a</math>, then <math>f</math> is injective (in fact, bijective onto its image) in a neighborhood of <math>a</math>, the inverse is continuously differentiable near <math>b=f(a)</math>, and the derivative of the inverse function at <math>b</math> is the reciprocal of the derivative of <math>f</math> at <math>a</math>:
<math display=block>\bigl(f^{-1}\bigr)'(b) = \frac{1}{f'(a)} = \frac{1}{f'(f^{-1}(b))}.</math>

It can happen that a function <math>f</math> is injective near a point <math>a</math> while <math>f'(a) = 0</math>; an example is <math>f(x) = (x - a)^3</math>. In fact, for such a function, the inverse cannot be differentiable at <math>b = f(a)</math>: if <math>f^{-1}</math> were differentiable at <math>b</math>, then, by the chain rule, <math>1 = (f^{-1} \circ f)'(a) = (f^{-1})'(b)f'(a)</math>, which would imply <math>f'(a) \ne 0</math>. (The situation is different for holomorphic functions; see [[#Holomorphic inverse function theorem]] below.)
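As an illustration of the derivative formula (the example is chosen here for concreteness and is easily verified directly), take <math>f(x) = e^x</math>, so that <math>f'(x) = e^x \neq 0</math> everywhere and <math>f^{-1}(y) = \ln y</math>. The formula gives
<math display=block>\bigl(f^{-1}\bigr)'(b) = \frac{1}{f'(f^{-1}(b))} = \frac{1}{e^{\ln b}} = \frac{1}{b},</math>
which agrees with the direct computation <math>(\ln)'(b) = 1/b</math>.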
For functions of more than one variable, the theorem states that if <math>f</math> is a continuously differentiable function from an open subset <math>A</math> of <math>\R^n</math> into <math>\R^n</math>, and the [[total derivative|derivative]] <math>f'(a)</math> is invertible at a point {{Mvar|a}} (that is, the determinant of the [[Jacobian matrix and determinant|Jacobian matrix]] of {{Mvar|f}} at {{Mvar|a}} is non-zero), then there exist neighborhoods <math>U</math> of <math>a</math> in <math>A</math> and <math>V</math> of <math>b = f(a)</math> such that <math>f(U) \subset V</math> and <math>f : U \to V</math> is bijective.<ref name="Hörmander">Theorem 1.1.7. in {{cite book|title=The Analysis of Linear Partial Differential Operators I: Distribution Theory and Fourier Analysis|series=Classics in Mathematics|first=Lars|last= Hörmander|author-link=Lars Hörmander|publisher=Springer|year= 2015|edition=2nd| isbn= 978-3-642-61497-2}}</ref> Writing <math>f=(f_1,\ldots,f_n)</math>, this means that the system of {{Mvar|n}} equations <math>y_i = f_i(x_1, \dots, x_n)</math> has a unique solution for <math>x_1, \dots, x_n</math> in terms of <math>y_1, \dots, y_n</math> when <math>x \in U, y \in V</math>. Note that the theorem ''does not'' say that <math>f</math> is bijective onto its image wherever <math>f'</math> is invertible, only that it is locally bijective wherever <math>f'</math> is invertible. Moreover, the theorem says that the inverse function <math>f^{-1} : V \to U</math> is continuously differentiable, and its derivative at <math>b=f(a)</math> is the inverse map of <math>f'(a)</math>; i.e.,
:<math>(f^{-1})'(b) = f'(a)^{-1}.</math>
In other words, if <math>Jf^{-1}(b), Jf(a)</math> are the Jacobian matrices representing <math>(f^{-1})'(b), f'(a)</math>, this means:
:<math>Jf^{-1}(b) = Jf(a)^{-1}.</math>
The hard part of the theorem is the existence and differentiability of <math>f^{-1}</math>.
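The distinction between local and global bijectivity can be seen in the standard polar-coordinate map (given here as an illustration): let <math>f : \R^2 \to \R^2</math> be defined by <math>f(r, \theta) = (r\cos\theta, r\sin\theta)</math>. Its Jacobian determinant is
<math display=block>\det Jf(r, \theta) = \det\begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} = r,</math>
so <math>f</math> is locally bijective, with continuously differentiable local inverse, near every point with <math>r \neq 0</math>. Yet <math>f</math> is not injective on the set <math>\{r \neq 0\}</math>, since <math>f(r, \theta) = f(r, \theta + 2\pi)</math>.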
Assuming this, the inverse derivative formula follows from the [[chain rule]] applied to <math>f^{-1}\circ f = I</math>. (Indeed, <math>1=I'(a) = (f^{-1} \circ f)'(a) = (f^{-1})'(b) \circ f'(a).</math>) Since matrix inversion is infinitely differentiable (on the set of invertible matrices), the formula for the derivative of the inverse shows that if <math>f</math> is continuously <math>k</math> times differentiable, with invertible derivative at the point {{Mvar|a}}, then the inverse is also continuously <math>k</math> times differentiable. Here <math>k</math> is a positive integer or <math>\infty</math>.

There are two variants of the inverse function theorem.<ref name="Hörmander" /> Given a continuously differentiable map <math>f : U \to \mathbb{R}^m</math>, the first is
*The derivative <math>f'(a)</math> is surjective (i.e., the Jacobian matrix representing it has rank <math>m</math>) if and only if there exists a continuously differentiable function <math>g</math> on a neighborhood <math>V</math> of <math>b = f(a)</math> such that <math>f \circ g = I</math> near <math>b</math>,
and the second is
*The derivative <math>f'(a)</math> is injective if and only if there exists a continuously differentiable function <math>g</math> on a neighborhood <math>V</math> of <math>b = f(a)</math> such that <math>g \circ f = I</math> near <math>a</math>.
In the first case (when <math>f'(a)</math> is surjective), the point <math>b = f(a)</math> is called a [[regular value]]. Since <math>m = \dim \ker(f'(a)) + \dim \operatorname{im}(f'(a))</math>, the first case is equivalent to saying that <math>b = f(a)</math> is not in the image of the set of [[Critical point (mathematics)#Critical point of a differentiable map|critical points]] (a critical point is a point <math>a</math> such that the kernel of <math>f'(a)</math> is nonzero). The statement in the first case is a special case of the [[submersion theorem]]. These variants are restatements of the inverse function theorem.
Indeed, in the first case, when <math>f'(a)</math> is surjective, we can find an (injective) linear map <math>T</math> such that <math>f'(a) \circ T = I</math>. Define <math>h(x) = a + Tx</math> so that we have:
:<math>(f \circ h)'(0) = f'(a) \circ T = I.</math>
Thus, by the inverse function theorem, <math>f \circ h</math> has an inverse near <math>0</math>; i.e., <math>f \circ h \circ (f \circ h)^{-1} = I</math> near <math>b</math>. The second case (<math>f'(a)</math> is injective) is seen in a similar way.
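A simple instance of the first variant (an illustrative example, not drawn from the cited reference) is the projection <math>f : \R^2 \to \R</math>, <math>f(x_1, x_2) = x_1</math>. Its derivative at any point is the surjective linear map represented by the matrix <math>(1 \;\; 0)</math>, and the continuously differentiable map <math>g(y) = (y, 0)</math> satisfies
<math display=block>(f \circ g)(y) = f(y, 0) = y,</math>
so <math>f \circ g = I</math>, as the first variant asserts.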