== Schwarz's theorem<!--'Schwarz's theorem' and 'Clairaut's theorem on equality of mixed partials' redirect here--> ==
{{redirect|Schwarz's theorem|the result in complex analysis|Schwarz lemma}}
In [[mathematical analysis]], '''Schwarz's theorem'''<!--boldface per WP:R#PLA--> (or '''Clairaut's theorem on equality of mixed partials'''<!--boldface per WP:R#PLA-->),{{sfn|James|1966|p={{pn|date=August 2021}}}} named after [[Alexis Clairaut]] and [[Hermann Schwarz]], states that for a function <math>f \colon \Omega \to \mathbb{R}</math> defined on a set <math>\Omega \subset \mathbb{R}^n</math>, if <math>\mathbf{p}\in \mathbb{R}^n</math> is a point such that some [[Neighbourhood (mathematics)|neighborhood]] of <math>\mathbf{p}</math> is contained in <math>\Omega</math> and <math>f</math> has [[continuous function|continuous]] second [[partial derivatives]] on that neighborhood of <math>\mathbf{p}</math>, then for all {{mvar|i}} and {{mvar|j}} in <math>\{1, 2, \ldots,\, n\},</math>

:<math>
  \frac{\partial^2}{\partial x_i\, \partial x_j}f(\mathbf{p}) =
  \frac{\partial^2}{\partial x_j\, \partial x_i}f(\mathbf{p}).
</math>

The partial derivatives of this function commute at that point.

[[#Sufficiency of twice-differentiability|There exists]] a version of this theorem where <math>f</math> is only required to be twice differentiable at the point <math>\mathbf{p}</math>.
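The statement can be checked numerically on a concrete function. The following is a minimal sketch, not part of any cited source: the test function <math>f(x,y) = x^3y^2 + \sin(xy)</math>, the point <math>(0.7, -1.3)</math>, the step size and the helper names <code>d_dx</code> and <code>d_dy</code> are all assumptions made for illustration. It approximates both mixed partial derivatives by nested central differences and compares them with the value obtained by differentiating by hand.

```python
import math

def f(x, y):
    # Hypothetical smooth test function, chosen only for illustration.
    return x**3 * y**2 + math.sin(x * y)

def d_dx(g, h=1e-4):
    # Central-difference approximation to the partial derivative of g in x.
    return lambda x, y: (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, h=1e-4):
    # Central-difference approximation to the partial derivative of g in y.
    return lambda x, y: (g(x, y + h) - g(x, y - h)) / (2 * h)

px, py = 0.7, -1.3  # assumed sample point

f_xy = d_dy(d_dx(f))(px, py)  # differentiate in x first, then in y
f_yx = d_dx(d_dy(f))(px, py)  # differentiate in y first, then in x

# Mixed partial computed by hand: 6 x^2 y + cos(xy) - xy sin(xy).
exact = 6 * px**2 * py + math.cos(px * py) - px * py * math.sin(px * py)

print(f_xy, f_yx, exact)
```

The two approximations agree with each other and with the hand-computed value up to discretisation and rounding error, as the theorem predicts for a function with continuous second partial derivatives.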

One easy way to establish this theorem (in the case where <math>n = 2</math>, <math>i = 1</math>, and <math>j = 2</math>, which readily entails the result in general) is by applying [[Green's theorem]] to the [[gradient]] of <math>f.</math>

An elementary proof for functions on open subsets of the plane is as follows (the general case of Schwarz's theorem reduces easily to the planar case).<ref name=burkhill>{{harvnb|Burkill|1962|pages=154–155}}</ref> Let <math>f(x,y)</math> be a [[differentiable function]] on an open rectangle <math>\Omega</math> containing a point <math>(a,b)</math> and suppose that <math>df</math> is continuous with continuous <math>\partial_x \partial _y f</math> and <math>\partial_y\partial_x f</math> over <math>\Omega.</math> Define

:<math>\begin{align}
  u\left(h,\, k\right) &= f\left(a + h,\, b + k\right) - f\left(a + h,\, b\right), \\
  v\left(h,\, k\right) &= f\left(a + h,\, b + k\right) - f\left(a,\, b + k\right), \\
  w\left(h,\, k\right) &= f\left(a + h,\, b + k\right) - f\left(a + h,\, b\right) - f\left(a,\, b + k\right) + f\left(a,\, b\right).
\end{align}</math>

These functions are defined for <math>\left|h\right|,\, \left|k\right| < \varepsilon</math>, where <math>\varepsilon > 0 </math> and <math>\left[a - \varepsilon,\, a + \varepsilon\right] \times \left[b - \varepsilon,\, b + \varepsilon\right]</math> is contained in <math>\Omega.</math>

By the [[mean value theorem]], for fixed {{mvar|h}} and {{mvar|k}} non-zero,  <math>\theta, \theta',  \phi, \phi'</math> can be found in the open interval <math> (0,1)</math> with

:<math>\begin{align}
     w\left(h,\, k\right)
  &= u\left(h,\, k\right) - u\left(0,\, k\right) = h\, \partial_x u\left(\theta h,\, k\right) \\
  &= h\,\left[\partial_x f\left(a + \theta h,\, b + k\right) - \partial_x f\left(a + \theta h,\, b\right)\right] \\
  &= hk \, \partial_y \partial_x f\left(a + \theta h,\, b + \theta^\prime k\right) \\
     w\left(h,\, k\right)
  &= v\left(h,\, k\right) - v\left(h,\, 0\right) = k\,\partial_y v\left(h,\, \phi k\right) \\
  &= k\left[\partial_y f\left(a + h,\, b + \phi k\right) - \partial_y f\left(a,\, b + \phi k\right)\right] \\
  &= hk\, \partial_x\partial_y f \left(a + \phi^\prime h,\, b + \phi k\right).
\end{align}</math>

Since <math>h,\,k \neq 0</math>, the first equality below can be divided by <math>hk</math>:

:<math>\begin{align}
  hk\,\partial_y\partial_x f\left(a + \theta h,\, b + \theta^\prime k\right) &=
    hk \, \partial_x\partial_y f\left(a + \phi^\prime h,\, b + \phi k\right), \\
 \partial_y\partial_x f\left(a + \theta h,\, b + \theta^\prime k\right) &=
    \partial_x\partial_y f\left(a + \phi^\prime h,\, b + \phi k\right).
\end{align}</math>

Letting <math>h,\,k</math> tend to zero in the last equality, the continuity assumptions on <math>\partial_y\partial_x f</math> and <math>\partial_x\partial_y f</math> now imply that

:<math>
  \frac{\partial^2}{\partial x\partial y}f\left(a,\, b\right) =
  \frac{\partial^2}{\partial y\partial x}f\left(a,\, b\right).
</math>
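The key quantity in this proof is the quotient <math>w(h,k)/(hk)</math>, which equals a mixed partial derivative at an intermediate point and therefore tends to the common value at <math>(a,b)</math>. A small numerical sketch of this limit follows; the function <math>f(x,y) = e^x \sin y</math>, the base point and the chosen pairs <math>(h,k)</math> are assumptions made for illustration only.

```python
import math

def f(x, y):
    # Hypothetical test function; its mixed partials both equal exp(x) * cos(y).
    return math.exp(x) * math.sin(y)

def w(a, b, h, k):
    # Second difference of f over the rectangle with corners (a, b) and (a+h, b+k).
    return f(a + h, b + k) - f(a + h, b) - f(a, b + k) + f(a, b)

a, b = 0.3, 0.5                       # assumed base point
exact = math.exp(a) * math.cos(b)     # value of the mixed partial at (a, b)

# h and k shrink independently of sign; both must be non-zero.
steps = [(1e-1, -5e-2), (1e-2, -5e-3), (1e-3, -5e-4)]
errors = [abs(w(a, b, h, k) / (h * k) - exact) for h, k in steps]

for (h, k), err in zip(steps, errors):
    print(h, k, err)
```

The error shrinks roughly in proportion to the step sizes, which is consistent with the intermediate points <math>(a + \theta h,\, b + \theta^\prime k)</math> approaching <math>(a,b)</math>.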

This account is a straightforward classical method found in many textbooks, for example in Burkill, Apostol and Rudin.<ref name=burkhill/>{{sfn|Apostol|1965}}{{sfn|Rudin|1976}}
 
Although the derivation above is elementary, the approach can also be viewed from a more conceptual perspective so that the result becomes more apparent.{{sfn|Hörmander|2015|pages=7, 11|ps=. This condensed account is possibly the shortest.}}{{sfn|Dieudonné|1960|pages=179–180}}{{sfn|Godement|1998b|pages=287–289}}{{sfn|Lang|1969|pages=108–111}}{{sfn|Cartan|1971|pages=64–67}} Indeed, the [[difference operator]]s <math>\Delta^t_x,\,\,\Delta^t_y</math> commute and <math>\Delta^t_x f,\,\,\Delta^t_y f</math> tend to <math>\partial_x f,\,\, \partial_y f</math> as <math>t</math> tends to 0, with a similar statement for second-order operators.{{efn|name="Schwartz"|1=These can also be rephrased in terms of the action of operators on [[Schwartz function]]s on the plane. Under [[Fourier transform]], the difference and differential operators are just multiplication operators.{{sfn|Hörmander|2015|loc=Chapter VII}}}} Here, for <math>z</math> a vector in the plane and <math>u</math> a directional vector <math>\tbinom{1}{0}</math> or <math>\tbinom{0}{1}</math>, the difference operator is defined by

:<math>\Delta^t_u f(z)= {f(z+tu) - f(z)\over t}.</math>

By the [[fundamental theorem of calculus]] for <math>C^1</math> functions <math>f</math> on an open interval <math>I</math> with <math> [a,b] \subset I</math>,

:<math>\int_a^b f^\prime (x) \, dx = f(b) - f(a).</math>

Hence

:<math>|f(b) - f(a)| \le (b-a)\, \sup_{c\in (a,b)} |f^\prime(c)|</math>.

This is a generalized version of the [[mean value theorem]]. Recall that the elementary discussion on maxima or minima for real-valued functions implies that if <math>f</math> is continuous on <math>[a,b]</math> and differentiable on <math>(a,b)</math>, then there is a point <math>c</math> in <math>(a,b)</math> such that

:<math> {f(b) - f(a) \over b - a} = f^\prime(c).</math>

For functions <math>f</math> with values in a finite-dimensional normed space <math>V</math>, there is no analogue of the equality above; indeed, it fails. But since <math> \inf f^\prime \le f^\prime(c) \le \sup f^\prime</math> in the real-valued case, the inequality above is a useful substitute. Moreover, pairing <math>f</math> with elements of the dual of <math>V</math>, equipped with its dual norm, yields the following inequality:

:<math>\|f(b) - f(a)\| \le (b-a)\, \sup_{c\in (a,b)} \|f^\prime(c)\|</math>.

These versions of the mean value theorem are discussed in Rudin, Hörmander and elsewhere.{{sfn|Hörmander|2015|page=6}}{{sfn|Rudin|1976|page={{pn|date=August 2021}}}}
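A standard illustration of why the equality fails for vector-valued functions, while the inequality survives, is the unit circle <math>f(t) = (\cos t, \sin t)</math> on <math>[0, 2\pi]</math>. The sketch below is an assumed example, not taken from the cited sources; it uses the Euclidean norm and samples the derivative on a grid, so the supremum is only approximated.

```python
import math

def f(t):
    # Hypothetical example: parametrisation of the unit circle.
    return (math.cos(t), math.sin(t))

def df(t):
    # Derivative of f, computed by hand.
    return (-math.sin(t), math.cos(t))

def norm(v):
    # Euclidean norm of a vector in the plane.
    return math.hypot(v[0], v[1])

a, b = 0.0, 2 * math.pi

# Norm of the increment f(b) - f(a); zero up to rounding, since f(a) = f(b).
increment = norm((f(b)[0] - f(a)[0], f(b)[1] - f(a)[1]))

# Sample the speed |f'(c)| on a grid of interior points of (a, b).
samples = [a + (b - a) * i / 1000 for i in range(1, 1000)]
speeds = [norm(df(c)) for c in samples]
min_speed, sup_speed = min(speeds), max(speeds)

bound = (b - a) * sup_speed
print(increment, min_speed, bound)
```

Since the derivative has norm 1 everywhere, no point <math>c</math> can satisfy <math>f(b) - f(a) = (b-a) f^\prime(c)</math>, whose left-hand side vanishes; the inequality <math>\|f(b) - f(a)\| \le (b-a) \sup \|f^\prime\|</math> nevertheless holds.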

For <math>f</math> a <math>C^2</math> function on an open set in the plane, define <math>D_1 = \partial_x</math> and <math> D_2 = \partial_y</math>. Furthermore, for <math> t \ne 0</math> set

:<math>\Delta_1^t f(x,y) = [f(x+t,y)-f(x,y)]/t,\,\,\,\,\,\,\Delta^t_2f(x,y)=[f(x,y+t) -f(x,y)]/t</math>.

Then for <math>(x_0,y_0)</math> in the open set, the generalized mean value theorem can be applied twice:

:<math> \left|\Delta_1^t\Delta_2^t f(x_0,y_0) - D_1 D_2f(x_0,y_0)\right|\le \sup_{0\le s \le 1} \left|\Delta_1^t D_2 f(x_0,y_0 + ts)  -D_1D_2 f(x_0,y_0)\right|\le \sup_{0\le r,s\le 1} \left|D_1D_2f(x_0+tr,y_0+ts) - D_1D_2f(x_0,y_0)\right|.</math>

Thus <math>\Delta_1^t\Delta_2^t f(x_0,y_0)</math> tends to <math>D_1 D_2f(x_0,y_0)</math> as <math>t</math> tends to 0. The same argument shows that <math>\Delta_2^t\Delta_1^t f(x_0,y_0)</math> tends to <math>D_2 D_1f(x_0,y_0)</math>. Hence, since the difference operators commute, so do the partial differential operators <math>D_1</math> and <math>D_2</math>, as claimed.{{sfn|Hörmander|2015|page=11}}{{sfn|Dieudonné|1960}}{{sfn|Godement|1998a}}{{sfn|Lang|1969}}{{sfn|Cartan|1971}}
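The two facts used in this argument, that the difference operators commute and that their composition converges to <math>D_1 D_2 f</math>, can be observed numerically. The following sketch treats <math>\Delta_1^t</math> and <math>\Delta_2^t</math> as higher-order functions; the names <code>delta1</code> and <code>delta2</code>, the test function and the point <math>(x_0, y_0)</math> are assumptions made for illustration.

```python
import math

def delta1(g, t):
    # Difference operator in the first variable: [g(x+t, y) - g(x, y)] / t.
    return lambda x, y: (g(x + t, y) - g(x, y)) / t

def delta2(g, t):
    # Difference operator in the second variable: [g(x, y+t) - g(x, y)] / t.
    return lambda x, y: (g(x, y + t) - g(x, y)) / t

def f(x, y):
    # Hypothetical C^2 test function.
    return math.sin(x) * math.exp(y) + x * x * y

x0, y0 = 0.4, -0.2                            # assumed sample point
exact = math.cos(x0) * math.exp(y0) + 2 * x0  # D1 D2 f at (x0, y0), by hand

commutator_gaps = []   # |Delta1 Delta2 f - Delta2 Delta1 f|
limit_gaps = []        # |Delta1 Delta2 f - D1 D2 f|
for t in (1e-1, 1e-2, 1e-3):
    d12 = delta1(delta2(f, t), t)(x0, y0)
    d21 = delta2(delta1(f, t), t)(x0, y0)
    commutator_gaps.append(abs(d12 - d21))
    limit_gaps.append(abs(d12 - exact))

print(commutator_gaps)
print(limit_gaps)
```

The two compositions agree up to floating-point rounding for every <math>t</math>, because they expand to the same combination of four values of <math>f</math>, while the distance to the hand-computed mixed partial shrinks as <math>t</math> decreases.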

'''Remark.''' By two applications of the classical mean value theorem,

:<math>\Delta_1^t\Delta_2^t f(x_0,y_0)= D_1 D_2 f(x_0+t\theta,y_0 +t\theta^\prime)</math>

for some <math>\theta</math> and <math>\theta^\prime</math> in <math>(0,1)</math>. Thus the first elementary proof can be reinterpreted using difference operators. Conversely, instead of using the generalized mean value theorem in the second proof, the classical mean value theorem could be used.