{{Short description|Specification of a derivative along a tangent vector of a manifold}}
{{Use American English|date=March 2019}}
{{About||directional tensor derivatives in continuum mechanics|Tensor derivative (continuum mechanics)|the covariant derivative used in gauge theories|Gauge covariant derivative}}

In [[mathematics]], the '''covariant derivative''' is a way of specifying a [[derivative]] along [[tangent vector]]s of a [[manifold]]. Alternatively, the covariant derivative is a way of introducing and working with a [[connection (mathematics)|connection]] on a manifold by means of a [[differential operator]], to be contrasted with the approach given by a [[connection (principal bundle)|principal connection]] on the [[frame bundle]] – see [[affine connection]]. In the special case of a manifold [[isometry|isometrically]] embedded into a higher-dimensional [[Euclidean space]], the covariant derivative can be viewed as the [[orthogonal projection]] of the Euclidean [[directional derivative]] onto the manifold's tangent space. In this case the Euclidean derivative is broken into two parts, the extrinsic normal component (dependent on the embedding) and the intrinsic covariant derivative component.

The name is motivated by the importance of [[general covariance|changes of coordinate]] in [[physics]]: the covariant derivative transforms [[Covariant transformation|covariantly]] under a general coordinate transformation, that is, linearly via the [[Jacobian matrix and determinant|Jacobian matrix]] of the transformation.<ref>{{cite book|last=Einstein|first=Albert|title=The Meaning of Relativity|url=https://archive.org/details/meaningofrelativ00eins_0|chapter=The General Theory of Relativity | year=1922}}</ref>

This article presents an introduction to the covariant derivative of a [[vector field]] with respect to a vector field, both in a coordinate-free language and using a local [[coordinate system]] and the traditional index notation. The covariant derivative of a [[tensor field]] is presented as an extension of the same concept. The covariant derivative generalizes straightforwardly to a notion of differentiation associated to a [[connection on a vector bundle]], also known as a '''Koszul connection'''.

==History==
At the turn of the 20th century, the covariant derivative was introduced by [[Gregorio Ricci-Curbastro]] and [[Tullio Levi-Civita]] in the theory of [[Riemannian geometry|Riemannian]] and [[pseudo-Riemannian manifold|pseudo-Riemannian geometry]].<ref>{{cite journal |last2=Levi-Civita |first2=T. |last1=Ricci |first1=G. |title=Méthodes de calcul différentiel absolu et leurs applications |url=http://gdz.sub.uni-goettingen.de/dms/load/img/?PID=GDZPPN002258102 |journal=Mathematische Annalen |volume=54 |year=1901 |issue=1–2 |pages=125–201 |doi= 10.1007/bf01454201|s2cid=120009332 }}</ref> Ricci and Levi-Civita (following ideas of [[Elwin Bruno Christoffel]]) observed that the [[Christoffel symbols]] used to define the [[Riemann tensor|curvature]] could also provide a notion of [[derivative|differentiation]] which generalized the classical [[directional derivative]] of [[vector fields]] on a manifold.<ref>{{cite book |last=Riemann |first=G. F. B. |chapter=Über die Hypothesen, welche der Geometrie zu Grunde liegen |title=Gesammelte Mathematische Werke |year=1866 }}; reprint, ed. Weber, H. (1953), New York: Dover.</ref><ref>{{cite journal |last=Christoffel |first=E. B. |title=Über die Transformation der homogenen Differentialausdrücke zweiten Grades |journal=[[Crelle's Journal|Journal für die reine und angewandte Mathematik]] |volume=70 |year=1869 |pages=46–70 |url=https://eudml.org/doc/148073 }}</ref> This new derivative – the [[Levi-Civita connection]] – was ''[[Covariance and contravariance of vectors|covariant]]'' in the sense that it satisfied Riemann's requirement that objects in geometry should be independent of their description in a particular coordinate system.

It was soon noted by other mathematicians, prominent among these being [[Hermann Weyl]], [[Jan Arnoldus Schouten]], and [[Élie Cartan]],<ref>cf. with {{cite journal |last=Cartan |first=É |url=http://www.numdam.org/item?id=ASENS_1923_3_40__325_0 |title=Sur les variétés à connexion affine et la theorie de la relativité généralisée |journal= Annales Scientifiques de l'École Normale Supérieure|volume=40 |year=1923 |pages=325–412 |doi=10.24033/asens.751 |doi-access=free }}</ref> that a covariant derivative could be defined abstractly without the presence of a [[metric tensor|metric]]. The crucial feature was not a particular dependence on the metric, but that the Christoffel symbols satisfied a certain precise second-order transformation law. This transformation law could serve as a starting point for defining the derivative in a covariant manner. Thus the theory of covariant differentiation forked off from the strictly Riemannian context to include a wider range of possible geometries.

In the 1940s, practitioners of [[differential geometry]] began introducing other notions of covariant differentiation in general [[vector bundle]]s which were, in contrast to the classical bundles of interest to geometers, not part of the [[tensor analysis]] of the manifold. By and large, these generalized covariant derivatives had to be specified ''ad hoc'' by some version of the connection concept. In 1950, [[Jean-Louis Koszul]] unified these new ideas of covariant differentiation in a vector bundle by means of what is known today as a [[Koszul connection]] or a connection on a vector bundle.<ref>{{cite journal |last=Koszul |first=J. L. |title=Homologie et cohomologie des algebres de Lie |journal=Bulletin de la Société Mathématique de France |volume=78 |year=1950 |pages=65–127 |doi=10.24033/bsmf.1410 |doi-access=free }}</ref> Using ideas from [[Lie algebra cohomology]], Koszul successfully converted many of the analytic features of covariant differentiation into algebraic ones. In particular, Koszul connections eliminated the need for awkward manipulations of [[Christoffel symbols]] (and other analogous non-[[tensor]]ial objects) in differential geometry. Thus they quickly supplanted the classical notion of covariant derivative in many post-1950 treatments of the subject.

==Motivation==
[[File:Ковариантная производная c.jpg|220x124px|thumb|right]]
The '''covariant derivative''' is a generalization of the [[directional derivative]] from [[vector calculus]]. As with the directional derivative, the covariant derivative is a rule, <math>\nabla_{\mathbf u}{\mathbf v}</math>, which takes as its inputs: (1) a vector, {{math|'''u'''}}, defined at a point {{mvar|P}}, and (2) a [[vector field]] {{math|'''v'''}} defined in a [[neighborhood (mathematics)|neighborhood]] of {{mvar|P}}.<ref>The covariant derivative is also denoted variously by '''∂{{sub|v}}u''', '''D{{sub|v}}u''', or other notations.</ref> The output is the vector <math>\nabla_{\mathbf u}{\mathbf v}(P)</math>, also at the point {{mvar|P}}. The primary difference from the usual directional derivative is that <math>\nabla_{\mathbf u}{\mathbf v}</math> must, in a certain precise sense, be ''independent'' of the manner in which it is expressed in a [[coordinate system]].

A vector may be ''described'' as a list of numbers in terms of a [[basis (mathematics)|basis]], but as a geometrical object the vector retains its identity regardless of how it is described. For a geometric vector written in components with respect to one basis, when the basis is changed the components transform according to a [[change of basis]] formula, with the coordinates undergoing a [[covariant transformation]]. The covariant derivative is required to transform, under a change in coordinates, by a covariant transformation in the same way as a basis does (hence the name).

In the case of [[Euclidean space]], one usually defines the directional derivative of a vector field in terms of the difference between two vectors at two nearby points.
In such a system one [[Translation (geometry)|translates]] one of the vectors to the origin of the other, keeping it parallel, then takes their difference within the same vector space. With a Cartesian (fixed [[orthonormal]]) coordinate system "keeping it parallel" amounts to keeping the components constant. This ordinary directional derivative on Euclidean space is the first example of a covariant derivative.

Next, one must take into account changes of the coordinate system. For example, if the Euclidean plane is described by polar coordinates, "keeping it parallel" does ''not'' amount to keeping the polar components constant under translation, since the coordinate grid itself "rotates". Thus, the same covariant derivative written in [[coordinates (elementary mathematics)|polar coordinates]] contains extra terms that describe how the coordinate grid itself rotates, or how in more general coordinates the grid expands, contracts, twists, interweaves, etc.

Consider the example of a particle moving along a curve {{math|''γ''(''t'')}} in the Euclidean plane. In polar coordinates, {{mvar|γ}} may be written in terms of its radial and angular coordinates by {{math|1=''γ''(''t'') = (''r''(''t''), ''θ''(''t''))}}. A vector at a particular time {{mvar|t}}<ref>In many applications, it may be better not to think of {{mvar|t}} as corresponding to time, at least for applications in [[general relativity]]. It is simply regarded as an abstract parameter varying smoothly and monotonically along the path.</ref> (for instance, a constant acceleration of the particle) is expressed in terms of <math>(\mathbf{e}_r, \mathbf{e}_{\theta})</math>, where <math>\mathbf{e}_r</math> and <math>\mathbf{e}_{\theta}</math> are unit tangent vectors for the polar coordinates, serving as a basis to decompose a vector in terms of radial and [[tangential component]]s. At a slightly later time, the new basis in polar coordinates appears slightly rotated with respect to the first set. The covariant derivative of the basis vectors (the [[Christoffel symbols]]) serves to express this change.
{{Clear}}

In a curved space, such as the surface of the Earth (regarded as a sphere), the [[Translation (geometry)|translation]] of tangent vectors between different points is not well defined, and its analog, [[parallel transport]], depends on the path along which the vector is translated. A vector on a globe on the equator at point {{mvar|Q}} is directed to the north. Suppose we transport the vector (keeping it parallel) first along the equator to the point {{mvar|P}}, then drag it along a meridian to the north pole {{mvar|N}}, and finally transport it along another meridian back to {{mvar|Q}}. Then we notice that the vector parallel-transported along a closed circuit does not return as the same vector; instead, it has another orientation. This would not happen in Euclidean space and is caused by the ''curvature'' of the surface of the globe. The same effect occurs if we drag the vector around an infinitesimally small closed loop, successively along two directions and then back. This infinitesimal change of the vector is a measure of the [[Curvature of Riemannian manifolds|curvature]], and can be defined in terms of the covariant derivative.{{Clear}}

===Remarks===
* The definition of the covariant derivative does not use the metric in space. However, for each metric there is a unique [[Torsion tensor|torsion]]-free covariant derivative called the [[Levi-Civita connection]] such that the covariant derivative of the metric is zero.
* The properties of a derivative imply that <math>\nabla_\mathbf{v} \mathbf{u}</math> depends on the values of {{mvar|u}} in a neighborhood of a point {{mvar|p}} in the same way as e.g. the derivative of a scalar function {{mvar|f}} along a curve at a given point {{mvar|p}} depends on the values of {{mvar|f}} in a neighborhood of {{mvar|p}}.
* The information in a neighborhood of a point {{mvar|p}} in the covariant derivative can be used to define [[parallel transport]] of a vector. Also the [[Curvature of Riemannian manifolds|curvature]], [[Torsion tensor|torsion]], and [[geodesic]]s may be defined only in terms of the covariant derivative or other related variation on the idea of a [[Connection (vector bundle)|linear connection]].

==Informal definition using an embedding into Euclidean space==
Suppose an open subset {{mvar|U}} of a {{mvar|d}}-dimensional [[Riemannian manifold]] {{mvar|M}} is embedded into Euclidean space <math>(\R^n, \langle\cdot, \cdot\rangle)</math> via a [[Smoothness#Differentiability classes|twice continuously-differentiable]] (C{{sup|2}}) mapping <math>\vec\Psi : \R^d \supset U \to \R^n</math> such that the tangent space at <math>\vec\Psi(p)</math> is spanned by the vectors
<math display="block">\left\{ \left. \frac{\partial\vec\Psi}{\partial x^i} \right|_p : i \in \{ 1, \dots, d\}\right\}</math>
and the scalar product <math>\left \langle \cdot, \cdot \right \rangle </math> on <math>\R^n</math> is compatible with the metric on {{mvar|M}}:
<math display="block">g_{ij} = \left\langle \frac{\partial\vec\Psi}{\partial x^i}, \frac{\partial\vec\Psi}{\partial x^j} \right\rangle.</math>

(Since the manifold metric is always assumed to be regular, that is, nondegenerate, the compatibility condition implies linear independence of the partial derivative tangent vectors.)

For a tangent vector field, {{nowrap|<math>\vec V = v^j \frac{\partial \vec\Psi}{\partial x^j}</math>,}} one has
<math display="block">\frac{\partial\vec V}{\partial x^i} = \frac{\partial}{\partial x^i} \left( v^j \frac{\partial \vec\Psi}{\partial x^j} \right)= \frac{\partial v^j}{\partial x^i} \frac{\partial\vec \Psi}{\partial x^j} + v^j \frac{\partial^2 \vec\Psi}{\partial x^i \, \partial x^j} .</math>

The last term is not tangential to {{mvar|M}}, but can be expressed as a linear combination of the tangent space base vectors using the [[Christoffel symbols]] as linear factors plus a vector orthogonal to the tangent space:
<math display="block"> v^j \frac{\partial^2 \vec\Psi}{\partial x^i \, \partial x^j} = v^j {\Gamma^k}_{ij} \frac{\partial\vec\Psi}{\partial x^k} + \vec n . </math>

In the case of the [[Levi-Civita connection]], the covariant derivative <math>\nabla_{\mathbf{e}_i} \vec V</math>, also written {{nowrap|<math>\nabla_i \vec V</math>,}} is defined as the orthogonal projection of the usual derivative onto tangent space:
<math display="block">
\nabla_{\mathbf{e}_i} \vec V := \frac{\partial\vec V}{\partial x^i} - \vec n = \left( \frac{\partial v^k}{\partial x^i} + v^j {\Gamma^k}_{ij} \right) \frac{\partial\vec\Psi}{\partial x^k}.
</math>

From here it may be computationally convenient to obtain a relation between the Christoffel symbols for the Levi-Civita connection and the metric. To do this we first note that, since the vector <math>\vec n</math> in the previous equation is orthogonal to the tangent space,
<math display="block">
\left\langle \frac{\partial^2 \vec\Psi}{\partial x^i \, \partial x^j}, \frac{\partial\vec \Psi}{\partial x^l} \right\rangle
= \left\langle {\Gamma^k}_{ij} \frac{\partial\vec\Psi}{\partial x^k} + \vec n, \frac{\partial\vec \Psi}{\partial x^l} \right\rangle
= \left\langle \frac{\partial\vec\Psi}{\partial x^k}, \frac{\partial\vec\Psi}{\partial x^l} \right\rangle {\Gamma^k}_{ij}
= g_{kl} \, {\Gamma^k}_{ij} .
</math>

Then, since the partial derivative of a component <math>g_{ab}</math> of the metric with respect to a coordinate <math>x^c</math> is
<math display="block">
\frac{\partial g_{ab}}{\partial x^c} = \frac{\partial}{ \partial x^c} \left\langle \frac{\partial \vec\Psi}{ \partial x^a}, \frac{\partial \vec\Psi}{\partial x^b} \right\rangle = \left\langle \frac{\partial^2 \vec\Psi}{ \partial x^c \, \partial x^a}, \frac{\partial \vec\Psi}{\partial x^b} \right\rangle + \left\langle \frac{\partial \vec\Psi}{\partial x^a}, \frac{\partial^2 \vec\Psi}{ \partial x^c \, \partial x^b} \right\rangle,
</math>

any triplet {{nowrap|<math>i, j, k</math>}} of indices yields a system of equations
<math display="block">
\left\{
\begin{alignedat}{2}
\frac{\partial g_{jk}}{\partial x^i} = &    & \left\langle \frac{\partial \vec\Psi}{\partial x^j}, \frac{\partial^2 \vec\Psi}{\partial x^k \partial x^i} \right\rangle & + \left\langle \frac{\partial \vec\Psi}{\partial x^k}, \frac{\partial^2 \vec\Psi}{\partial x^i \partial x^j} \right\rangle \\
\frac{\partial g_{ki}}{\partial x^j} = & \left\langle \frac{\partial \vec\Psi}{\partial x^i}, \frac{\partial^2 \vec\Psi}{\partial x^j \partial x^k} \right\rangle &    & + \left\langle \frac{\partial \vec\Psi}{\partial x^k}, \frac{\partial^2 \vec\Psi}{\partial x^i \partial x^j} \right\rangle \\
\frac{\partial g_{ij}}{\partial x^k} = & \left\langle \frac{\partial \vec\Psi}{\partial x^i}, \frac{\partial^2 \vec\Psi}{\partial x^j \partial x^k} \right\rangle & + \left\langle \frac{\partial \vec\Psi}{\partial x^j}, \frac{\partial^2 \vec\Psi}{\partial x^k \partial x^i} \right\rangle &    & .
\end{alignedat}
\right.
</math>
(Here the symmetry of the scalar product has been used and the order of partial differentiation has been swapped.)

Adding the first two equations and subtracting the third, we obtain
<math display="block">
\frac{\partial g_{jk}}{\partial x^i} + \frac{\partial g_{ki}}{\partial x^j} - \frac{\partial g_{ij}}{\partial x^k} =
2\left\langle \frac{\partial\vec \Psi}{\partial x^k}, \frac{\partial^2 \vec\Psi}{\partial x^i \, \partial x^j} \right\rangle.
</math>

Thus the Christoffel symbols for the Levi-Civita connection are related to the metric by
<math display="block">
g_{kl} {\Gamma^k}_{ij} = \frac{1}{2} \left( \frac{\partial g_{jl}}{\partial x^i} + \frac{\partial g_{li}}{\partial x^j}- \frac{\partial g_{ij}}{\partial x^l}\right).
</math>

If {{mvar|g}} is nondegenerate then <math> {\Gamma^k}_{ij} </math> can be solved for directly as

<math display="block">
{\Gamma^k}_{ij} = \frac{1}{2} g^{kl} \left( \frac{\partial g_{jl}}{\partial x^i} + \frac{\partial g_{li}}{\partial x^j}- \frac{\partial g_{ij}}{\partial x^l}\right).
</math>
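
The last formula can be checked numerically. The following is a minimal Python sketch and not part of the derivation above. It approximates the partial derivatives of the metric by central differences and is restricted to two-dimensional metrics so that the inverse can be written out by hand. The function names (<code>inverse_2x2</code>, <code>christoffel</code>, <code>polar_metric</code>), the step size <code>h</code>, and the choice of the Euclidean plane in polar coordinates as the test metric are assumptions made for illustration.

```python
def inverse_2x2(g):
    """Inverse of a nondegenerate 2x2 matrix given as nested lists."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def christoffel(metric, x, h=1e-5):
    """Return gamma[k][i][j] = Gamma^k_{ij} at the point x for a 2D metric.

    Uses Gamma^k_{ij} = 1/2 g^{kl} (d_i g_{jl} + d_j g_{li} - d_l g_{ij}),
    with the partial derivatives estimated by central differences.
    """
    n = 2
    ginv = inverse_2x2(metric(x))
    # dg[c][a][b] approximates the partial derivative of g_{ab} along x^c
    dg = []
    for c in range(n):
        xp, xm = list(x), list(x)
        xp[c] += h
        xm[c] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(n)]
                   for a in range(n)])
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                gamma[k][i][j] = 0.5 * sum(
                    ginv[k][l] * (dg[i][j][l] + dg[j][l][i] - dg[l][i][j])
                    for l in range(n))
    return gamma


def polar_metric(x):
    """Euclidean plane in polar coordinates x = (r, theta): g = diag(1, r^2)."""
    r = x[0]
    return [[1.0, 0.0], [0.0, r * r]]


gamma = christoffel(polar_metric, (2.0, 0.3))
print(gamma[0][1][1], gamma[1][0][1])
```

For this metric the only nonzero symbols have the closed forms <math>{\Gamma^r}_{\theta\theta} = -r</math> and <math>{\Gamma^\theta}_{r\theta} = {\Gamma^\theta}_{\theta r} = 1/r</math>, so at <math>r = 2</math> the two printed values should be close to −2 and 0.5.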

For a very simple example that captures the essence of the description above, draw a circle on a flat sheet of paper. Travel around the circle at a constant speed. The derivative of your velocity, your acceleration vector, always points radially inward. Roll this sheet of paper into a cylinder. Now the (Euclidean) derivative of your velocity has a component that sometimes points inward toward the axis of the cylinder depending on whether you're near a solstice or an equinox. (At the point of the circle when you are moving parallel to the axis, there is no inward acceleration. Conversely, at a point (1/4 of a circle later) when the velocity is along the cylinder's bend, the inward acceleration is maximum.) This is the (Euclidean) normal component. The covariant derivative component is the component parallel to the cylinder's surface, and is the same as that before you rolled the sheet into a cylinder.

==Formal definition==
A covariant derivative is a [[connection (vector bundle)|(Koszul) connection]] on the [[tangent bundle]] and other [[tensor bundle]]s: it differentiates vector fields in a way analogous to the usual differential on functions. The definition extends to a differentiation on the dual of vector fields (i.e. [[cotangent space|covector]] fields) and to arbitrary [[tensor field]]s, in a unique way that ensures compatibility with the tensor product and trace operations (tensor contraction).

===Functions===
Given a point <math>p \in M</math> of the manifold {{mvar|M}}, a real function <math>f : M \to \R</math> on the manifold and a tangent vector <math>\mathbf{v} \in T_pM</math>, the covariant derivative of {{mvar|f}} at {{mvar|p}} along {{math|'''v'''}} is the scalar at {{mvar|p}}, denoted <math>\left(\nabla_\mathbf{v} f\right)_p</math>, that represents the [[Principal part#Calculus|principal part]] of the change in the value of {{mvar|f}} when the argument of {{mvar|f}} is changed by the infinitesimal displacement vector {{math|'''v'''}}. (This is the [[differential of a function|differential]] of {{mvar|f}} evaluated against the vector {{math|'''v'''}}.) Formally, there is a differentiable curve <math>\phi:[-1, 1]\to M</math> such that <math>\phi(0) = p</math> and <math>\phi'(0) = \mathbf{v}</math>, and the covariant derivative of {{mvar|f}} at {{mvar|p}} is defined by
<math display="block">\left(\nabla_\mathbf{v} f\right)_p = \left(f \circ \phi\right)^\prime \left(0\right) = \lim_{t \to 0} \frac{ f(\phi\left(t\right)) - f(p) }{t}.</math>

When <math>\mathbf{v} : M \to TM</math> is a vector field on {{mvar|M}}, the covariant derivative <math>\nabla_\mathbf{v}f : M \to \R </math> is the function that associates with each point {{mvar|p}} in the common domain of {{mvar|f}} and {{math|'''v'''}} the scalar <math>\left(\nabla_\mathbf{v}f\right)_p</math>.

For a scalar function {{mvar|f}} and vector field {{math|'''v'''}}, the covariant derivative <math>\nabla_\mathbf{v} f</math> coincides with the [[Lie derivative]] <math>L_v(f)</math>, and with the [[exterior derivative]] <math>df(v)</math>.
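
As a small numerical illustration of the curve-based definition, the following Python sketch estimates <math>(f \circ \phi)'(0)</math> by a central difference for two different curves through the same point with the same velocity. The setting is an assumption chosen for simplicity: the manifold is the plane with its standard coordinates, and the function <code>f</code>, the point <code>p</code>, the vector <code>v</code> and the two curves <code>straight</code> and <code>bent</code> are hypothetical examples.

```python
def derivative_along_curve(f, phi, h=1e-6):
    """Central-difference estimate of (f o phi)'(0)."""
    return (f(phi(h)) - f(phi(-h))) / (2 * h)


def f(point):
    """Example scalar function f(x, y) = x^2 y."""
    x, y = point
    return x * x * y


p = (1.0, 2.0)    # base point
v = (3.0, -1.0)   # tangent vector at p


def straight(t):
    """Straight line with phi(0) = p and phi'(0) = v."""
    return (p[0] + t * v[0], p[1] + t * v[1])


def bent(t):
    """A different curve with the same position and velocity at t = 0."""
    return (p[0] + t * v[0] + t * t, p[1] + t * v[1] + 5.0 * t * t)


d_straight = derivative_along_curve(f, straight)
d_bent = derivative_along_curve(f, bent)
# Differential of f at p applied to v: 2xy * v_x + x^2 * v_y
exact = 2 * p[0] * p[1] * v[0] + p[0] ** 2 * v[1]
print(d_straight, d_bent, exact)
```

Both estimates agree with <math>df_p(\mathbf{v})</math> up to discretization error, which illustrates that the result depends on the curve only through <math>\phi(0)</math> and <math>\phi'(0)</math>.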

===Vector fields===
Given a point {{mvar|p}} of the manifold {{mvar|M}}, a vector field <math>\mathbf{u} : M \to TM</math> defined in a neighborhood of {{mvar|p}} and a tangent vector <math>\mathbf{v} \in T_pM</math>, the covariant derivative of {{math|'''u'''}} at {{mvar|p}} along {{math|'''v'''}} is the tangent vector at {{mvar|p}}, denoted <math>(\nabla_\mathbf{v} \mathbf{u})_p</math>, such that the following properties hold (for any tangent vectors {{math|'''v'''}}, {{math|'''x'''}} and {{math|'''y'''}} at {{mvar|p}}, vector fields {{math|'''u'''}} and {{math|'''w'''}} defined in a neighborhood of {{mvar|p}}, scalar values {{mvar|g}} and {{mvar|h}} at {{mvar|p}}, and scalar function {{mvar|f}} defined in a neighborhood of {{mvar|p}}):
# <math>\left(\nabla_\mathbf{v} \mathbf{u}\right)_p</math> is linear in <math>\mathbf{v}</math> so <math display="block">\left(\nabla_{g\mathbf{x} + h\mathbf{y}} \mathbf{u}\right)_p =  g(p)  \left(\nabla_\mathbf{x} \mathbf{u}\right)_p + h(p) \left(\nabla_\mathbf{y} \mathbf{u}\right)_p</math>
# <math>\left(\nabla_\mathbf{v} \mathbf{u}\right)_p</math> is additive in <math>\mathbf{u}</math> so: <math display="block">\left(\nabla_\mathbf{v}\left[\mathbf{u} + \mathbf{w}\right]\right)_p = \left(\nabla_\mathbf{v} \mathbf{u}\right)_p + \left(\nabla_\mathbf{v} \mathbf{w}\right)_p</math>
# <math>(\nabla_\mathbf{v} \mathbf{u})_p</math> obeys the [[product rule]]; i.e., where <math>\nabla_\mathbf{v}f</math> is defined above, <math display="block">\left(\nabla_\mathbf{v} \left[f\mathbf{u}\right]\right)_p = f(p)\left(\nabla_\mathbf{v} \mathbf{u}\right)_p + \left(\nabla_\mathbf{v}f\right)_p\mathbf{u}_p.</math>

Note that <math>\left(\nabla_\mathbf{v} \mathbf{u}\right)_p</math> depends not only on the value of {{math|'''u'''}} at {{mvar|p}} but also on values of {{math|'''u'''}} in a neighborhood of {{mvar|p}}, because the last property, the product rule, involves the directional derivative of {{mvar|f}} (by the vector {{math|'''v'''}}).

If {{math|'''u'''}} and {{math|'''v'''}} are both vector fields defined over a common domain, then <math>\nabla_\mathbf{v}\mathbf u</math> denotes the vector field whose value at each point {{mvar|p}} of the domain is the tangent vector <math>\left(\nabla_\mathbf{v}\mathbf u\right)_p</math>.

===Covector fields===
Given a field of [[Cotangent space|covectors]] (or [[one-form]]) <math>\alpha</math> defined in a neighborhood of {{mvar|p}}, its covariant derivative <math>(\nabla_\mathbf{v}\alpha)_p</math> is defined in a way to make the resulting operation compatible with tensor contraction and the product rule. That is, <math>(\nabla_\mathbf{v}\alpha)_p</math> is defined as the unique one-form at {{mvar|p}} such that the following identity is satisfied for all vector fields {{math|'''u'''}} in a neighborhood of {{mvar|p}}
<math display="block">\left(\nabla_\mathbf{v}\alpha\right)_p \left(\mathbf{u}_p\right) = \nabla_\mathbf{v}\left[\alpha\left(\mathbf{u}\right)\right]_p - \alpha_p\left[\left(\nabla_\mathbf{v}\mathbf{u}\right)_p\right].</math>

The covariant derivative of a covector field along a vector field {{math|'''v'''}} is again a covector field.

===Tensor fields===
Once the covariant derivative is defined for fields of vectors and covectors it can be defined for arbitrary [[Tensor (intrinsic definition)|tensor]] fields by imposing the following identities for every pair of tensor fields <math> \varphi</math> and <math>\psi </math> in a neighborhood of the point {{mvar|p}}:
<math display="block">\nabla_\mathbf{v}\left(\varphi \otimes \psi\right)_p = \left(\nabla_\mathbf{v}\varphi\right)_p \otimes \psi(p) + \varphi(p) \otimes \left(\nabla_\mathbf{v}\psi\right)_p,</math>
and for <math>\varphi</math> and <math>\psi</math> of the same valence
<math display="block">\nabla_\mathbf{v}(\varphi + \psi)_p = (\nabla_\mathbf{v}\varphi)_p + (\nabla_\mathbf{v}\psi)_p.</math>
The covariant derivative of a tensor field along a vector field {{math|'''v'''}} is again a tensor field of the same type.

Explicitly, let {{mvar|T}} be a tensor field of type {{math|(''p'', ''q'')}}. Consider {{mvar|T}} to be a differentiable [[multilinear map]] of [[smooth function|smooth]] [[section (fiber bundle)|sections]] {{math|''α''{{isup|1}}, ''α''{{isup|2}}, ..., ''α''<sup>''q''</sup>}} of the cotangent bundle {{math|''T''{{isup|∗}}''M''}} and of sections {{math|''X''{{sub|1}}, ''X''{{sub|2}}, ..., ''X''<sub>''p''</sub>}} of the [[tangent bundle]] {{math|''TM''}}, written {{math|''T''(''α''{{isup|1}}, ''α''{{isup|2}}, ..., ''X''{{sub|1}}, ''X''{{sub|2}}, ...)}} into {{math|'''R'''}}. The covariant derivative of {{mvar|T}} along {{mvar|Y}} is given by the formula

<math display="block">\begin{align}
(\nabla_Y T)\left(\alpha^1, \alpha^2, \ldots, X_1, X_2, \ldots\right) =
&{} \nabla_Y\left(T\left(\alpha^1,\alpha^2, \ldots, X_1, X_2, \ldots\right)\right) \\
&{}- T\left(\nabla_Y\alpha^1, \alpha^2, \ldots, X_1, X_2, \ldots\right) - T\left(\alpha^1, \nabla_Y\alpha^2, \ldots, X_1, X_2, \ldots\right) - \cdots \\
&{}- T\left(\alpha^1, \alpha^2, \ldots, \nabla_YX_1, X_2, \ldots\right) - T\left(\alpha^1, \alpha^2, \ldots, X_1, \nabla_Y X_2, \ldots\right) - \cdots
\end{align}</math>

==Coordinate description==
{{Hatnote|This section uses the [[Einstein summation convention]].}}
Given coordinate functions <math display="block">x^i,\ i=0,1,2,\dots ,</math> any [[tangent vector]] can be described by its components in the basis <math display="block">\mathbf{e}_i = \frac{\partial}{\partial x^i} .</math>

The covariant derivative of a basis vector along a basis vector is again a vector and so can be expressed as a linear combination <math>\Gamma^k \mathbf{e}_k</math>.
To specify the covariant derivative it is enough to specify the covariant derivative of each basis vector field <math>\mathbf{e}_i</math> along <math>\mathbf{e}_j</math>:
<math display="block"> \nabla_{\mathbf{e}_j} \mathbf{e}_i = {\Gamma^k}_{i j} \mathbf{e}_k,</math>

where the coefficients <math>{\Gamma^k}_{i j}</math> are the components of the connection with respect to a system of local coordinates. In the theory of Riemannian and pseudo-Riemannian manifolds, the components of the Levi-Civita connection with respect to a system of local coordinates are called [[Christoffel symbols]].

Then using the rules in the definition, we find that for general vector fields <math>\mathbf{v} = v^j \mathbf{e}_j </math> and <math>\mathbf{u} = u^i \mathbf{e}_i</math> we get
<math display="block">\begin{align}
 \nabla_\mathbf{v} \mathbf{u}
&= \nabla_{v^j \mathbf{e}_j} u^i \mathbf{e}_i \\
&= v^j \nabla_{\mathbf{e}_j} u^i \mathbf{e}_i \\
&= v^j u^i \nabla_{\mathbf{e}_j} \mathbf{e}_i + v^j \mathbf{e}_i \nabla_{\mathbf{e}_j} u^i \\
&= v^j u^i {\Gamma^k}_{i j}\mathbf{e}_k + v^j{\partial u^i \over \partial x^j} \mathbf{e}_i
\end{align}</math>

so
<math display="block"> \nabla_\mathbf{v} \mathbf{u} = \left(v^j u^i {\Gamma^k}_{i j} + v^j {\partial u^k\over\partial x^j} \right)\mathbf{e}_k .</math>

The first term in this formula is responsible for "twisting" the coordinate system with respect to the covariant derivative and the second for changes of components of the vector field {{mvar|u}}. In particular
<math display="block">\nabla_{\mathbf{e}_j} \mathbf{u} = \nabla_j \mathbf{u} = \left( \frac{\partial u^i}{\partial x^j} + u^k {\Gamma^i}_{kj} \right) \mathbf{e}_i </math>

In words: the covariant derivative is the usual derivative along the coordinates with correction terms which tell how the coordinates change.
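
The component formula can be tried out on the Euclidean plane in polar coordinates. The Python sketch below is an illustration under stated assumptions, not a general implementation. It hard-codes the standard polar Christoffel symbols <math>{\Gamma^r}_{\theta\theta} = -r</math> and <math>{\Gamma^\theta}_{r\theta} = {\Gamma^\theta}_{\theta r} = 1/r</math>, and estimates the partial derivatives of the components by central differences. The names <code>gamma_polar</code>, <code>covariant_derivative</code>, <code>cartesian_e_x</code> and <code>e_theta</code>, as well as the sample point and direction, are hypothetical.

```python
import math


def gamma_polar(r, theta):
    """Christoffel symbols G[k][i][j] = Gamma^k_{ij} of the Euclidean plane
    in polar coordinates (index 0 = r, index 1 = theta)."""
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -r                      # Gamma^r_{theta theta}
    G[1][0][1] = G[1][1][0] = 1.0 / r    # Gamma^theta_{r theta}, Gamma^theta_{theta r}
    return G


def covariant_derivative(u, v, x, h=1e-6):
    """Components of nabla_v u at x: v^j u^i Gamma^k_{ij} + v^j du^k/dx^j."""
    n = len(x)
    G = gamma_polar(*x)
    ux = u(*x)
    result = []
    for k in range(n):
        total = 0.0
        for j in range(n):
            xp, xm = list(x), list(x)
            xp[j] += h
            xm[j] -= h
            # central difference for the partial derivative of u^k along x^j
            du = (u(*xp)[k] - u(*xm)[k]) / (2 * h)
            total += v[j] * du
            total += sum(v[j] * ux[i] * G[k][i][j] for i in range(n))
        result.append(total)
    return result


def cartesian_e_x(r, theta):
    """The constant Cartesian field d/dx written in the polar coordinate basis."""
    return [math.cos(theta), -math.sin(theta) / r]


def e_theta(r, theta):
    """The coordinate basis field e_theta: constant components, not a parallel field."""
    return [0.0, 1.0]


point = (2.0, 0.7)
flat = covariant_derivative(cartesian_e_x, [0.3, -1.2], point)
twist = covariant_derivative(e_theta, [0.0, 1.0], point)
print(flat, twist)
```

The field <code>cartesian_e_x</code> has non-constant polar components, yet its covariant derivative vanishes up to discretization error, because the Christoffel terms cancel the partial derivatives. Conversely, <code>e_theta</code> has constant components but <math>\nabla_{\mathbf{e}_\theta}\mathbf{e}_\theta = -r\,\mathbf{e}_r</math>, which at <math>r = 2</math> has components (−2, 0).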

For covectors similarly we have
<math display="block">\nabla_{\mathbf{e}_j} {\mathbf \theta} = \left( \frac{\partial \theta_i}{\partial x^j} - \theta_k {\Gamma^k}_{ij} \right) {\mathbf e^*}^i </math>

where <math>{\mathbf e^*}^i (\mathbf{e}_j) = {\delta^i}_j</math>.

The covariant derivative of a type {{math|(''r'', ''s'')}} tensor field along <math>e_c</math> is given by the expression:

<math display="block">\begin{align}
{(\nabla_{e_c} T)^{a_1 \ldots a_r}}_{b_1 \ldots b_s} = {}
&\frac{\partial}{\partial x^c}{T^{a_1 \ldots a_r}}_{b_1 \ldots b_s} \\
&+ \,{\Gamma ^{a_1}}_{dc} {T^{d a_2 \ldots a_r}}_{b_1 \ldots b_s} + \cdots + {\Gamma^{a_r}}_{dc} {T^{a_1 \ldots a_{r-1}d}}_{b_1 \ldots b_s} \\
&-\,{\Gamma^d}_{b_1 c} {T^{a_1 \ldots a_r}}_{d b_2 \ldots b_s} - \cdots - {\Gamma^d}_{b_s c} {T^{a_1 \ldots a_r}}_{b_1 \ldots b_{s-1} d}.
\end{align}</math>

Or, in words: take the partial derivative of the tensor and add: <math>+{\Gamma^{a_i}}_{dc}</math> for every upper index <math>a_i</math>, and <math>-{\Gamma^d}_{b_ic}</math> for every lower index <math>b_i</math>.

If instead of a tensor, one is trying to differentiate a ''[[tensor density]]'' (of weight +1), then one also adds a term
<math display="block">-{\Gamma^d}_{d c} {T^{a_1 \ldots a_r}}_{b_1 \ldots b_s}.</math>
If it is a tensor density of weight {{mvar|W}}, then multiply that term by {{mvar|W}}.
For example, <math display="inline"> \sqrt{-g}</math> is a scalar density (of weight +1), so we get:
<math display="block">\left(\sqrt{-g}\right)_{;c} = \left(\sqrt{-g}\right)_{,c} - \sqrt{-g}\,{\Gamma^d}_{d c}</math>

where semicolon ";" indicates covariant differentiation and comma "," indicates partial differentiation. Incidentally, this particular expression is equal to zero, because the covariant derivative of a function solely of the metric is always zero.
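
The vanishing of this expression can be checked numerically in a simple case. The Python sketch below assumes the unit sphere with coordinates <math>(\theta, \varphi)</math> and metric <math>\operatorname{diag}(1, \sin^2\theta)</math>. Because this metric is positive-definite, <math>\sqrt{g} = \sin\theta</math> takes the place of <math>\sqrt{-g}</math>. The contracted Levi-Civita symbols <math>{\Gamma^d}_{d\theta} = \cot\theta</math> and <math>{\Gamma^d}_{d\varphi} = 0</math> are hard-coded, and all function names are hypothetical.

```python
import math


def sqrt_det_g(theta, phi):
    """sqrt(det g) for g = diag(1, sin^2 theta), valid for 0 < theta < pi."""
    return math.sin(theta)


def gamma_trace(theta, phi):
    """Contracted symbols [Gamma^d_{d theta}, Gamma^d_{d phi}] on the unit sphere."""
    return [math.cos(theta) / math.sin(theta), 0.0]


def density_covariant_derivative(x, h=1e-6):
    """Components (sqrt g)_{;c} = (sqrt g)_{,c} - sqrt g * Gamma^d_{dc}."""
    trace = gamma_trace(*x)
    out = []
    for c in range(2):
        xp, xm = list(x), list(x)
        xp[c] += h
        xm[c] -= h
        # central difference for the partial derivative along x^c
        partial = (sqrt_det_g(*xp) - sqrt_det_g(*xm)) / (2 * h)
        out.append(partial - sqrt_det_g(*x) * trace[c])
    return out


result = density_covariant_derivative((1.0, 0.4))
print(result)
```

Both components come out as zero up to discretization error, since <math>\partial_\theta \sin\theta = \cos\theta = \sin\theta \cot\theta</math>.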

== Notation ==
In textbooks on physics, the covariant derivative is sometimes simply stated in terms of its components in this equation.

Often a notation is used in which the covariant derivative is given with a [[semicolon]], while a normal [[partial derivative]] is indicated by a [[comma]]. In this notation we write the same as:
<math display="block">
\nabla_{e_j} \mathbf{v} \ \stackrel{\mathrm{def}}{=}\ {v^s}_{;j}\mathbf{e}_s \;\;\;\;\;\;
{v^i}_{;j} =
{v^i}_{,j} + v^k {\Gamma^i}_{k j}
</math>
In case two or more indexes appear after the semicolon, all of them must be understood as covariant derivatives:
<math display="block"> \nabla_{e_k} \left( \nabla_{e_j} \mathbf{v} \right) \ \stackrel{\mathrm{def}}{=}\ {v^s}_{;jk}\mathbf{e}_s </math>

In some older texts (notably Adler, Bazin & Schiffer, ''Introduction to General Relativity''), the covariant derivative is denoted by a double pipe and the partial derivative by single pipe:
<math display="block">\nabla_{e_j} \mathbf{v} \ \stackrel{\mathrm{def}}{=}\ {v^i}_{||j}\mathbf{e}_i \;\;\;\;\;\; {v^i}_{||j} = {v^i}_{|j} + v^k {\Gamma^i}_{k j}</math>

==Covariant derivative by field type==
For a scalar field <math> \phi\,</math>, covariant differentiation is simply partial differentiation:
<math display="block"> \phi_{;a} \equiv \partial_a \phi</math>

For a contravariant vector field <math>\lambda^a</math>, we have:
<math display="block">{\lambda^a}_{;b} \equiv \partial_b \lambda^a + {\Gamma^a}_{bc}\lambda^c</math>

For a covariant vector field <math>\lambda_a</math>, we have:
<math display="block">\lambda_{a;c} \equiv \partial_c \lambda_a - {\Gamma^b}_{c a}\lambda_b</math>

For a type (2,0) tensor field <math>\tau^{a b}</math>, we have:
<math display="block">{\tau^{a b}}_{;c} \equiv \partial_c \tau^{a b} + {\Gamma^a}_{c d}\tau^{d b} + {\Gamma^b}_{c d}\tau^{a d} </math>

For a type (0,2) tensor field <math>\tau_{a b}</math>, we have:
<math display="block">\tau_{a b ;c} \equiv \partial_c \tau_{a b} - {\Gamma^d}_{c a}\tau_{d b} - {\Gamma^d}_{c b}\tau_{a d}</math>

For a type (1,1) tensor field <math>{\tau^{a}}_{b}</math>, we have:
<math display="block">{\tau^a}_{b;c}\equiv \partial_c {\tau^a}_b + {\Gamma^a}_{c d}{\tau^d}_b - {\Gamma^d}_{c b} {\tau^a}_d </math>

The notation above is meant in the sense
<math display="block">{\tau^{a b}}_{;c} \equiv \left(\nabla_{\mathbf{e}_c}\tau\right)^{a b}</math>
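
As a worked instance of the (0,2) rule, take the metric tensor <math>g_{a b}</math> and assume the Levi-Civita connection, for which <math display="inline">g_{d b}\,{\Gamma^d}_{c a} = \tfrac{1}{2}\left(\partial_c g_{b a} + \partial_a g_{b c} - \partial_b g_{c a}\right)</math>. Adding this to the same expression with <math>a</math> and <math>b</math> interchanged gives <math>\partial_c g_{a b}</math>, so that
<math display="block">g_{a b ;c} = \partial_c g_{a b} - {\Gamma^d}_{c a}\, g_{d b} - {\Gamma^d}_{c b}\, g_{a d} = \partial_c g_{a b} - \partial_c g_{a b} = 0</math>
which is the statement that this connection is compatible with the metric.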

== Properties ==
In general, covariant derivatives do not commute. For example, the second covariant derivatives of a covector field <math>\lambda_a</math> generally satisfy <math>\lambda_{a;bc} \neq \lambda_{a;cb}</math>. The [[Riemann tensor]] <math>{R^d}_{abc} </math> is defined such that:
<math display="block"> \lambda_{a;bc} - \lambda_{a;cb} = {R^d}_{abc}\lambda_d</math>

or, equivalently,
<math display="block"> {\lambda^a}_{;bc} - {\lambda^a}_{;cb} = -{R^a}_{dbc}\lambda^d</math>

The covariant derivative of a (2,0)-tensor field fulfills:
<math display="block"> {\tau^{ab}}_{;cd} - {\tau^{ab}}_{;dc} = -{R^a}_{ecd}\tau^{eb} - {R^b}_{ecd}\tau^{ae}</math>

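The latter can be shown by taking (without loss of generality) <math>\tau^{ab} = \lambda^a \mu^b </math>, since every (2,0)-tensor field is locally a sum of such products and both sides of the identity are linear in <math>\tau</math>.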
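
The steps of that argument can be sketched as follows. Applying the product rule twice to <math>\lambda^a \mu^b</math> gives
<math display="block">\left(\lambda^a \mu^b\right)_{;cd} = {\lambda^a}_{;cd}\,\mu^b + {\lambda^a}_{;c}\,{\mu^b}_{;d} + {\lambda^a}_{;d}\,{\mu^b}_{;c} + \lambda^a\,{\mu^b}_{;cd}</math>
The two middle terms together are symmetric in <math>c</math> and <math>d</math>, so they cancel in the commutator, leaving
<math display="block">\left(\lambda^a \mu^b\right)_{;cd} - \left(\lambda^a \mu^b\right)_{;dc} = \left({\lambda^a}_{;cd} - {\lambda^a}_{;dc}\right)\mu^b + \lambda^a\left({\mu^b}_{;cd} - {\mu^b}_{;dc}\right) = -{R^a}_{ecd}\,\lambda^e \mu^b - {R^b}_{ecd}\,\lambda^a \mu^e</math>
where the last step uses the vector identity twice.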

==Derivative along a curve==
Since the covariant derivative <math>\nabla_X T</math> of a tensor field {{mvar|T}} at a point {{mvar|p}} depends only on the value of the vector field {{mvar|X}} at {{mvar|p}}, one can define the covariant derivative along a smooth curve <math>\gamma(t)</math> in a manifold:
<math display="block">D_tT=\nabla_{\dot\gamma(t)}T.</math>
Note that the tensor field {{mvar|T}} only needs to be defined on the curve <math>\gamma(t)</math> for this definition to make sense.

In particular, <math>\dot{\gamma}(t)</math> is a vector field along the curve <math>\gamma</math> itself. If <math>\nabla_{\dot\gamma(t)}\dot\gamma(t)</math> vanishes, then the curve is called a geodesic of the covariant derivative. If the covariant derivative is the [[Levi-Civita connection]] of a [[Metric tensor|positive-definite metric]], then the geodesics for the connection are precisely the [[geodesics]] of the metric that are parametrized proportionally to [[Arc length#Generalization to (pseudo-)Riemannian manifolds|arc length]].
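
In local coordinates the geodesic condition reads <math display="inline">\ddot\gamma^a + {\Gamma^a}_{b c}\,\dot\gamma^b \dot\gamma^c = 0</math>. The following minimal sketch evaluates the left-hand side on the unit sphere, assuming spherical coordinates <math>(\theta, \varphi)</math> with the round metric, whose nonzero Christoffel symbols are <math>{\Gamma^\theta}_{\varphi\varphi} = -\sin\theta\cos\theta</math> and <math>{\Gamma^\varphi}_{\theta\varphi} = {\Gamma^\varphi}_{\varphi\theta} = \cot\theta</math>. The function name and the two sample curves are hypothetical choices made for the illustration.

```python
import math

def covariant_acceleration(theta, dtheta, dphi, ddtheta=0.0, ddphi=0.0):
    # Components of D_t(gamma') on the unit sphere in (theta, phi) coordinates,
    # given the first and second coordinate derivatives of the curve at one point.
    gamma_theta_phiphi = -math.sin(theta) * math.cos(theta)
    gamma_phi_thetaphi = math.cos(theta) / math.sin(theta)
    a_theta = ddtheta + gamma_theta_phiphi * dphi * dphi
    # The factor 2 accounts for the two equal symbols Gamma^phi_{theta phi}
    # and Gamma^phi_{phi theta}.
    a_phi = ddphi + 2.0 * gamma_phi_thetaphi * dtheta * dphi
    return a_theta, a_phi

# Circles of constant latitude traversed at unit coordinate speed: theta fixed, phi = t.
equator = covariant_acceleration(math.pi / 2, 0.0, 1.0)
latitude = covariant_acceleration(math.pi / 4, 0.0, 1.0)
```

Under these assumptions the equator has vanishing covariant acceleration (up to rounding) and is a geodesic, while the circle at <math>\theta = \pi/4</math> has a nonzero <math>\theta</math>-component and is not.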

The derivative along a curve is also used to define the [[parallel transport]] along the curve.

Sometimes the covariant derivative along a curve is called the '''absolute derivative''' or '''intrinsic derivative'''.

==Relation to Lie derivative==
A covariant derivative introduces an extra geometric structure on a manifold that allows vectors in neighboring tangent spaces to be compared: there is no canonical way to compare vectors from different tangent spaces because there is no canonical coordinate system.

There is, however, another generalization of directional derivatives which ''is'' canonical: the [[Lie derivative]], which evaluates the change of one vector field along the flow of another vector field. Thus, one must know both vector fields in a neighborhood, not merely at a single point. The covariant derivative, on the other hand, introduces its own notion of change for vectors in a given direction, and it depends only on the direction vector at a single point, rather than on a vector field in a neighborhood of that point. In other words, the covariant derivative is linear over {{math|''C''{{isup|∞}}(''M'')}} in the direction argument, while the Lie derivative is {{math|''C''{{isup|∞}}(''M'')}}-linear in neither argument (it is linear only over the real numbers).

Note that the antisymmetrized covariant derivative {{math|∇<sub>''u''</sub>''v'' − ∇<sub>''v''</sub>''u''}} and the Lie derivative {{math|''L''<sub>''u''</sub>''v''}} differ by the [[torsion of connection|torsion of the connection]], so that if a connection is torsion-free, then its antisymmetrization ''is'' the Lie derivative.
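
In a coordinate basis this can be made explicit. Assuming the index convention <math display="inline">\left(\nabla_u v\right)^a = u^b\,\partial_b v^a + {\Gamma^a}_{b c}\,u^b v^c</math>, in which the first lower index of the Christoffel symbol is the differentiation index, one finds
<math display="block">\left(\nabla_u v - \nabla_v u\right)^a = \left(u^b\,\partial_b v^a - v^b\,\partial_b u^a\right) + \left({\Gamma^a}_{b c} - {\Gamma^a}_{c b}\right) u^b v^c = \left(L_u v\right)^a + {T^a}_{b c}\,u^b v^c</math>
where <math display="inline">{T^a}_{b c} = {\Gamma^a}_{b c} - {\Gamma^a}_{c b}</math> are the components of the torsion tensor. The second term vanishes exactly when the Christoffel symbols are symmetric in their lower indices.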

==See also==
{{div col|colwidth=22em}}
* [[Affine connection]]
* [[Christoffel symbols]]
* [[Connection (algebraic framework)]]
* [[Connection (mathematics)]]
* [[Connection (vector bundle)]]
* [[Connection form]]
* [[Exterior covariant derivative]]
* [[Gauge covariant derivative]]
* [[Introduction to the mathematics of general relativity]]
* [[Levi-Civita connection]]
* [[Parallel transport]]
* [[Ricci calculus]]
* [[Tensor derivative (continuum mechanics)]]
* [[List of formulas in Riemannian geometry]]
{{div col end}}

==Notes==
{{Reflist}}

==References==

* {{Cite book|author1=Kobayashi, Shoshichi |author-link1=Shoshichi Kobayashi |author2=Nomizu, Katsumi |author-link2=Katsumi Nomizu| title = [[Foundations of Differential Geometry|Foundations of Differential Geometry, Vol. 1]] | publisher=[[Wiley Interscience]] | year=1996|edition=New |isbn = 0-471-15733-3}}
* {{springer|id=c/c026870|title=Covariant differentiation|author=I.Kh. Sabitov}}
* {{Cite book|first=Shlomo|last=Sternberg|author-link=Shlomo Sternberg|title=Lectures on Differential Geometry|year=1964|publisher=Prentice-Hall}}
* {{Cite book|first=Michael|last=Spivak|author-link=Michael Spivak|title=A Comprehensive Introduction to Differential Geometry (Volume Two)|publisher=Publish or Perish, Inc.|year=1999}}

{{Manifolds}}
{{Riemannian geometry}}
{{tensors}}

{{DEFAULTSORT:Covariant Derivative}}

[[Category:Connection (mathematics)]]
[[Category:Differential geometry]]
[[Category:Mathematical methods in general relativity]]
[[Category:Riemannian geometry]]
[[Category:Solid mechanics]]