==Methods of proof==
As an important result, the inverse function theorem has been given numerous proofs. The proof most commonly seen in textbooks relies on the [[contraction mapping]] principle, also known as the [[Banach fixed-point theorem]] (which can also be used as the key step in the proof of [[Picard–Lindelöf theorem|existence and uniqueness]] of solutions to [[ordinary differential equations]]).<ref>{{cite book |first=Robert C. |last=McOwen |title=Partial Differential Equations: Methods and Applications |location=Upper Saddle River, NJ |publisher=Prentice Hall |year=1996 |isbn=0-13-121880-8 |pages=218–224 |chapter=Calculus of Maps between Banach Spaces |chapter-url=https://books.google.com/books?id=TuNHsNC1Yf0C&pg=PA218 }}</ref><ref>{{Cite web |url=https://terrytao.wordpress.com/2011/09/12/the-inverse-function-theorem-for-everywhere-differentiable-maps/ |first=Terence |last=Tao |author-link=Terence Tao |title=The inverse function theorem for everywhere differentiable maps |date=September 12, 2011 |access-date=2019-07-26 }}</ref> Since the fixed point theorem applies in infinite-dimensional (Banach space) settings, this proof generalizes immediately to the infinite-dimensional version of the inverse function theorem<ref>{{Cite web|url=https://r-grande.github.io/Expository/Inverse%20Function%20Theorem.pdf |title=Inverse Function Theorem|last=Jaffe|first=Ethan}}</ref> (see [[Inverse function theorem#Generalizations|Generalizations]] below).

An alternate proof in finite dimensions hinges on the [[extreme value theorem]] for functions on a [[compact set]].<ref name="spivak_manifolds">{{harvnb|Spivak|1965|loc=pages 31–35 }}</ref> This approach has the advantage that the proof generalizes to a situation where there is no Cauchy completeness (see {{section link||Over_a_real_closed_field}}). Yet another proof uses [[Newton's method]], which has the advantage of providing an [[effective method|effective version]] of the theorem: bounds on the derivative of the function imply an estimate of the size of the neighborhood on which the function is invertible.<ref name="hubbard_hubbard">{{cite book |first1=John H. |last1=Hubbard |author-link=John H. Hubbard |first2=Barbara Burke |last2=Hubbard|author2-link=Barbara Burke Hubbard |title=Vector Analysis, Linear Algebra, and Differential Forms: A Unified Approach |edition=Matrix |year=2001 }}</ref>

=== Proof for single-variable functions ===
We want to prove the following: ''Let <math>D \subseteq \R</math> be an open set with <math>x_0 \in D</math>, let <math>f: D \to \R</math> be a continuously differentiable function defined on <math>D</math>, and suppose that <math>f'(x_0) \ne 0</math>. Then there exists an open interval <math>I</math> with <math>x_0 \in I</math> such that <math>f</math> maps <math>I</math> bijectively onto the open interval <math>J = f(I)</math>, and such that the inverse function <math>f^{-1} : J \to I</math> is continuously differentiable, and for any <math>y \in J</math>, if <math>x \in I</math> is such that <math>f(x) = y</math>, then <math>(f^{-1})'(y) = \dfrac{1}{f'(x)}</math>.''
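As a concrete numerical illustration of the statement (not part of the proof), the sketch below checks <math>(f^{-1})'(y) = 1/f'(x)</math> for the example function <math>f(x) = x^3 + x</math> at <math>x_0 = 1</math>; the example function, the bisection-based inversion, and the step size are illustrative choices.

<syntaxhighlight lang="python">
# Illustrative check of (f^{-1})'(y) = 1/f'(x) for the example f(x) = x^3 + x,
# which is strictly increasing (f'(x) = 3x^2 + 1 > 0), hence globally invertible.
def f(x):
    return x**3 + x

def f_prime(x):
    return 3 * x**2 + 1

def f_inverse(y, tol=1e-13):
    # Invert f by bisection on a bracket guaranteed to contain f^{-1}(y).
    lo, hi = -abs(y) - 1.0, abs(y) + 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x0 = 1.0
y0 = f(x0)   # = 2.0
h = 1e-6
numeric = (f_inverse(y0 + h) - f_inverse(y0 - h)) / (2 * h)  # central difference for (f^{-1})'(y0)
print(numeric, 1 / f_prime(x0))  # both are approximately 0.25
</syntaxhighlight>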
We may without loss of generality assume that <math>f'(x_0) > 0</math>. Given that <math>D</math> is an open set and <math>f'</math> is continuous at <math>x_0</math>, there exists <math>r > 0</math> such that <math>(x_0 - r, x_0 + r) \subseteq D</math> and
<math display="block">|f'(x) - f'(x_0)| < \dfrac{f'(x_0)}{2} \qquad \text{for all } |x - x_0| < r.</math>
In particular,
<math display="block">f'(x) > \dfrac{f'(x_0)}{2} >0 \qquad \text{for all } |x - x_0| < r.</math>
This shows that <math>f</math> is strictly increasing on <math>(x_0 - r, x_0 + r)</math>. Let <math>\delta > 0</math> be such that <math>\delta < r</math>. Then <math>[x_0 - \delta, x_0 + \delta] \subseteq (x_0 - r, x_0 + r)</math>. By the intermediate value theorem and strict monotonicity, <math>f</math> maps the interval <math>[x_0 - \delta, x_0 + \delta]</math> bijectively onto <math>[f(x_0 - \delta), f(x_0 + \delta)]</math>. Denote <math>I = (x_0-\delta, x_0+\delta)</math> and <math>J = (f(x_0 - \delta),f(x_0 + \delta))</math>. Then <math>f: I \to J</math> is a bijection and the inverse <math>f^{-1}: J \to I</math> exists. The fact that <math>f^{-1}: J \to I</math> is differentiable follows from the differentiability of <math>f</math>. In particular, the result follows from the fact that if <math>f: I \to \R</math> is a strictly monotonic and continuous function that is differentiable at <math>x \in I</math> with <math>f'(x) \ne 0</math>, then <math>f^{-1}: f(I) \to \R</math> is differentiable at <math>y = f(x)</math> with <math>(f^{-1})'(y) = \dfrac{1}{f'(x)}</math> (a standard result in analysis). Finally, since <math>(f^{-1})'(y) = \dfrac{1}{f'(f^{-1}(y))}</math> for every <math>y \in J</math>, and <math>f'</math> and <math>f^{-1}</math> are continuous with <math>f' \ne 0</math> on <math>I</math>, the derivative <math>(f^{-1})'</math> is continuous on <math>J</math>. This completes the proof.

=== A proof using successive approximation ===
To prove existence, it can be assumed after an affine transformation that <math>f(0)=0</math> and <math>f^\prime(0)=I</math>, so that <math>a=b=0</math>.

By the [[Mean value theorem#Mean value theorem for vector-valued functions|mean value theorem for vector-valued functions]], for a differentiable function <math>u:[0,1]\to\mathbb R^m</math>, <math display="inline">\|u(1)-u(0)\|\le \sup_{0\le t\le 1} \|u^\prime(t)\|</math>. Setting <math>u(t)=f(x+t(x^\prime -x)) - x-t(x^\prime-x)</math>, it follows that
:<math>\|f(x) - f(x^\prime) - x + x^\prime\| \le \|x -x^\prime\|\,\sup_{0\le t \le 1} \|f^\prime(x+t(x^\prime -x))-I\|.</math>
Now choose <math>\delta>0</math> so that <math display="inline">\|f'(x) - I\| < {1\over 2}</math> for <math>\|x\|< \delta</math>. Suppose that <math>\|y\|<\delta/2</math> and define <math>x_n</math> inductively by <math>x_0=0</math> and <math>x_{n+1}=x_n + y - f(x_n)</math>. The assumptions show that if <math>\|x\|, \,\, \|x^\prime\| < \delta</math> then
:<math>\|f(x)-f(x^\prime) - x + x^\prime\| \le \|x-x^\prime\|/2.</math>
In particular <math>f(x)=f(x^\prime)</math> implies <math>x=x^\prime</math>. In the inductive scheme <math>\|x_n\| <\delta</math> and <math>\|x_{n+1} - x_n\| < \delta/2^n</math>. Thus <math>(x_n)</math> is a [[Cauchy sequence]] tending to a limit <math>x</math>. By construction <math>f(x)=y</math> as required.

To check that <math>g=f^{-1}</math> is C<sup>1</sup>, write <math>g(y+k) = x+h</math> so that <math>f(x+h)=f(x)+k</math>. By the inequalities above, <math>\|h-k\| <\|h\|/2</math> so that <math>\|h\|/2<\|k\| < 2\|h\|</math>. On the other hand, if <math>A=f^\prime(x)</math>, then <math>\|A-I\|<1/2</math>. Using the [[geometric series]] for <math>B=I-A</math>, it follows that <math>\|A^{-1}\| < 2</math>. But then
:<math> {\|g(y+k) -g(y) - f^\prime(g(y))^{-1}k \| \over \|k\|} = {\|h -f^\prime(x)^{-1}[f(x+h)-f(x)]\| \over \|k\|} \le 4 {\|f(x+h) - f(x) -f^\prime(x)h\|\over \|h\|} </math>
tends to 0 as <math>k</math> and <math>h</math> tend to 0, proving that <math>g</math> is C<sup>1</sup> with <math>g^\prime(y)=f^\prime(g(y))^{-1}</math> (continuity of <math>g^\prime</math> follows since <math>f^\prime</math>, <math>g</math> and the inversion map <math>A \mapsto A^{-1}</math> are continuous).

The proof above is presented for a finite-dimensional space, but applies equally well for [[Banach space]]s. If an invertible function <math>f</math> is C<sup>k</sup> with <math>k>1</math>, then so too is its inverse. This follows by induction using the fact that the map <math>F(A)=A^{-1}</math> on operators is C<sup>k</sup> for any <math>k</math> (in the finite-dimensional case this is an elementary fact because the inverse of a matrix is given as the [[adjugate matrix]] divided by its [[determinant]]).<ref name="Hörmander" /><ref>{{cite book|title=Calcul Differentiel|language=fr|first=Henri|last= Cartan|author-link= Henri Cartan|publisher=[[Éditions Hermann|Hermann]]|year= 1971|isbn=978-0-395-12033-0 |pages=55–61}}</ref> The method of proof here can be found in the books of [[Henri Cartan]], [[Jean Dieudonné]], [[Serge Lang]], [[Roger Godement]] and [[Lars Hörmander]].
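The successive-approximation scheme above can be run directly. The following is a minimal numerical sketch of the iteration <math>x_{n+1}=x_n + y - f(x_n)</math>; the particular map (chosen so that <math>f(0)=0</math> and <math>f'(0)=I</math>), the target <math>y</math>, and the number of steps are illustrative choices, not part of the proof.

<syntaxhighlight lang="python">
# Illustrative run of the successive-approximation scheme x_{n+1} = x_n + y - f(x_n)
# for an example map on R^2 with f(0) = 0 and f'(0) = I.
def f(x):
    x1, x2 = x
    return (x1 + 0.2 * x2**2, x2 + 0.2 * x1**2)

def approximate_inverse(y, steps=50):
    x = (0.0, 0.0)                        # x_0 = 0
    for _ in range(steps):
        fx = f(x)
        x = (x[0] + y[0] - fx[0],         # x_{n+1} = x_n + y - f(x_n)
             x[1] + y[1] - fx[1])
    return x

y = (0.3, -0.2)                           # a small target vector y
x = approximate_inverse(y)
print(x)                                  # the limit x with f(x) = y
print(f(x))                               # reproduces y up to rounding error
</syntaxhighlight>

Because <math>\|f'(x)-I\|</math> stays well below <math>1/2</math> on the ball used here, the iterates converge geometrically, in line with the bound <math>\|x_{n+1}-x_n\| < \delta/2^n</math> above.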
=== A proof using the contraction mapping principle ===
Here is a proof based on the [[contraction mapping theorem]]. Specifically, following T. Tao,<ref>Theorem 17.7.2 in {{cite book|mr=3310023|last1=Tao|first1=Terence|title=Analysis. II|edition=Third edition of 2006 original|series=Texts and Readings in Mathematics|volume=38|publisher=Hindustan Book Agency|location=New Delhi|year=2014|isbn=978-93-80250-65-6|zbl=1300.26003}}</ref> it uses the following consequence of the contraction mapping theorem.

{{math_theorem|name=Lemma|math_statement=Let <math>B(0, r)</math> denote an open ball of radius ''r'' in <math>\mathbb{R}^n</math> with center 0 and <math>g : B(0, r) \to \mathbb{R}^n</math> a map with a constant <math>0 < c < 1</math> such that
:<math>|g(y) - g(x)| \le c|y-x|</math>
for all <math>x, y</math> in <math>B(0, r)</math>. Then for <math>f = I + g</math> on <math>B(0, r)</math>, we have
:<math>(1-c)|x - y| \le |f(x) - f(y)|;</math>
in particular, ''f'' is injective. If, moreover, <math>g(0) = 0</math>, then
:<math>B(0, (1-c)r) \subset f(B(0, r)) \subset B(0, (1+c)r).</math>
More generally, the statement remains true if <math>\mathbb{R}^n</math> is replaced by a Banach space. Also, the first part of the lemma is true for any normed space.}}

Basically, the lemma says that a small perturbation of the identity map by a contraction map is injective and preserves a ball in some sense. Assuming the lemma for a moment, we prove the theorem first. As in the above proof, it is enough to prove the special case when <math>a = 0, b = f(a) = 0</math> and <math>f'(0) = I</math>. Let <math>g = f - I</math>. The [[mean value inequality]] applied to <math>t \mapsto g(x + t(y - x))</math> says:
:<math>|g(y) - g(x)| \le |y-x|\sup_{0 < t < 1} |g'(x + t(y - x))|.</math>
Since <math>g'(0) = I - I = 0</math> and <math>g'</math> is continuous, we can find an <math>r > 0</math> such that
:<math>|g(y) - g(x)| \le 2^{-1}|y-x|</math>
for all <math>x, y</math> in <math>B(0, r)</math>. Then the lemma above says that <math>f = g + I</math> is injective on <math>B(0, r)</math> and <math>B(0, r/2) \subset f(B(0, r))</math>. Then
:<math>f : U = B(0, r) \cap f^{-1}(B(0, r/2)) \to V = B(0, r/2)</math>
is bijective and thus has an inverse.
Next, we show the inverse <math>f^{-1}</math> is continuously differentiable (this part of the argument is the same as that in the previous proof). This time, let <math>g = f^{-1}</math> denote the inverse of <math>f</math> and <math>A = f'(x)</math>. For <math>x = g(y)</math>, we write <math>g(y + k) = x + h</math> or <math>y + k = f(x+h)</math>. Now, by the estimate above, we have
:<math>|h - k| = |f(x+h) - f(x) - h| \le |h|/2</math>
and so <math>|h|/2 \le |k|</math>. Writing <math>\| \cdot \|</math> for the operator norm,
:<math>|g(y+k) - g(y) - A^{-1} k| = |h - A^{-1}(f(x + h) - f(x))| \le \|A^{-1}\||Ah - f(x+h) + f(x)|.</math>
As <math>k \to 0</math>, we have <math>h \to 0</math> and <math>|h|/|k|</math> is bounded; dividing by <math>|k|</math> and using the differentiability of <math>f</math> at <math>x</math>, it follows that <math>g</math> is differentiable at <math>y</math> with the derivative <math>g'(y) = f'(g(y))^{-1}</math>. Also, <math>g'</math> is the same as the composition <math>\iota \circ f' \circ g</math> where <math>\iota : T \mapsto T^{-1}</math>; so <math>g'</math> is continuous.

It remains to show the lemma. First, we have:
:<math>|x - y| - |f(x) - f(y)| \le |g(x) - g(y)| \le c|x - y|,</math>
which is to say
:<math>(1 - c)|x - y| \le |f(x) - f(y)|.</math>
This proves the first part. Next, we show <math>f(B(0, r)) \supset B(0, (1-c)r)</math>. Given a point <math>y</math> in <math>B(0, (1-c) r)</math>, this amounts to finding a fixed point of the map
:<math>F : \overline{B}(0, r') \to \overline{B}(0, r'), \, x \mapsto y - g(x),</math>
where <math>0 < r' < r</math> is chosen so that <math>|y| \le (1-c)r'</math> and the bar means a closed ball; indeed, <math>F(x) = x</math> means exactly <math>f(x) = x + g(x) = y</math>. To find a fixed point, we use the contraction mapping theorem; checking that <math>F</math> is a well-defined strict-contraction mapping is straightforward. Finally, we have <math>f(B(0, r)) \subset B(0, (1+c)r)</math> since
:<math>|f(x)| = |x + g(x) - g(0)| \le (1+c)|x|. \square</math>
As might be clear, this proof is not substantially different from the previous one, as the proof of the contraction mapping theorem is by successive approximation.
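The derivative formula <math>g'(y)=f'(g(y))^{-1}</math> obtained in both proofs can likewise be checked numerically. In the sketch below, a finite-difference Jacobian of the inverse is compared with <math>f'(g(y))^{-1}</math>; the map <math>f</math>, the point <math>y</math>, the step size, and the use of NumPy are illustrative choices, not part of the argument.

<syntaxhighlight lang="python">
# Illustrative check of g'(y) = f'(g(y))^{-1} for the example map
# f(x1, x2) = (x1 + 0.2*x2^2, x2 + 0.2*x1^2), which has f(0) = 0 and f'(0) = I.
import numpy as np

def f(x):
    return np.array([x[0] + 0.2 * x[1]**2, x[1] + 0.2 * x[0]**2])

def f_jacobian(x):
    return np.array([[1.0, 0.4 * x[1]], [0.4 * x[0], 1.0]])

def g(y, steps=60):
    # Invert f near the origin by the fixed-point iteration x -> x + y - f(x).
    x = np.zeros(2)
    for _ in range(steps):
        x = x + y - f(x)
    return x

y = np.array([0.3, -0.2])
h = 1e-6
# Finite-difference Jacobian of the inverse g at y, built column by column.
numeric = np.column_stack([(g(y + h * e) - g(y - h * e)) / (2 * h) for e in np.eye(2)])
print(numeric)
print(np.linalg.inv(f_jacobian(g(y))))   # agrees with the finite-difference result to several digits
</syntaxhighlight>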