
==Motivation: the unsuitability of coordinates==
[[File:Connection-on-sphere.png|frame|Parallel transport (of the black arrow) on a sphere. Blue and red arrows represent parallel transports in different directions but ending at the same lower right point. The fact that they end up pointing in different directions is a result of the curvature of the sphere.]]
Consider the following problem.  Suppose that a tangent vector to the sphere ''S'' is given at the north pole, and we are to define a manner of consistently moving this vector to other points of the sphere: a means for ''parallel transport''.  Naively, this could be done using a particular [[coordinate system]].  However, unless proper care is applied, the parallel transport defined in one system of coordinates will in general not agree with that of another coordinate system.  A more appropriate notion of parallel transport exploits the symmetry of the sphere under rotation.  Given a vector at the north pole, one can transport this vector along a curve by rotating the sphere in such a way that the north pole moves along the curve without axial rolling.  This latter means of parallel transport is the [[Levi-Civita connection]] on the sphere.  If two different curves are given with the same initial and terminal points, and a vector ''v'' is rigidly moved along the first curve by a rotation, the resulting vector at the terminal point will in general be ''different from'' the vector resulting from rigidly moving ''v'' along the second curve.  This phenomenon reflects the [[curvature]] of the sphere. A simple mechanical device that can be used to visualize parallel transport is the [[south-pointing chariot]].
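
The rotation-based transport described above can be sketched numerically. The following is a minimal illustration, assuming the curves are made of great-circle arcs, so that moving along an arc from ''p'' to ''q'' without axial rolling amounts to rotating about the axis ''p'' × ''q''. The function names <code>rotate</code> and <code>transport_along_arc</code>, and the choice of an octant of the unit sphere, are hypothetical and serve only this example.

```python
import math

def rotate(v, axis, angle):
    """Rotate v about the unit vector `axis` by `angle` (Rodrigues' formula)."""
    c, s = math.cos(angle), math.sin(angle)
    dot = sum(a * b for a, b in zip(axis, v))
    cross = (
        axis[1] * v[2] - axis[2] * v[1],
        axis[2] * v[0] - axis[0] * v[2],
        axis[0] * v[1] - axis[1] * v[0],
    )
    return tuple(v[i] * c + cross[i] * s + axis[i] * dot * (1 - c) for i in range(3))

def transport_along_arc(p, v, q):
    """Carry the tangent vector v from p to q along the great-circle arc p -> q.

    Assumes p and q are unit vectors that are neither equal nor antipodal.
    The rotation axis p x q is perpendicular to the arc, so there is no
    rolling about the direction of motion.
    """
    cross = (
        p[1] * q[2] - p[2] * q[1],
        p[2] * q[0] - p[0] * q[2],
        p[0] * q[1] - p[1] * q[0],
    )
    norm = math.sqrt(sum(c * c for c in cross))
    axis = tuple(c / norm for c in cross)
    angle = math.atan2(norm, sum(a * b for a, b in zip(p, q)))
    return rotate(v, axis, angle)

north = (0.0, 0.0, 1.0)
a = (1.0, 0.0, 0.0)      # a point on the equator
b = (0.0, 1.0, 0.0)      # another point on the equator
v = (1.0, 0.0, 0.0)      # tangent vector at the north pole

# Route 1: north pole -> a -> b.  Route 2: north pole -> b directly.
two_leg = transport_along_arc(a, transport_along_arc(north, v, a), b)
direct = transport_along_arc(north, v, b)

cos_angle = sum(x * y for x, y in zip(two_leg, direct))
angle_deg = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
print(two_leg, direct, angle_deg)
```

For this particular pair of routes the two transported vectors are perpendicular at the common end point, which is the kind of discrepancy attributed above to curvature.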

For instance, suppose that ''S'' is a sphere given coordinates by the [[stereographic projection]].  Regard ''S'' as consisting of unit vectors in '''R'''<sup>3</sup>.  Then ''S'' carries a pair of [[Atlas (topology)#Charts|coordinate patches]] corresponding to the projections from the two poles. The mappings
:<math>
\begin{align}
\varphi_0(x,y) & = \left(\frac{2x}{1+x^2+y^2}, \frac{2y}{1+x^2+y^2}, \frac{1-x^2-y^2}{1+x^2+y^2}\right)\\[8pt]
\varphi_1(x,y) & = \left(\frac{2x}{1+x^2+y^2}, \frac{2y}{1+x^2+y^2}, \frac{x^2+y^2-1}{1+x^2+y^2}\right)
\end{align}
</math>
cover a neighborhood ''U''<sub>0</sub> of the north pole and ''U''<sub>1</sub> of the south pole, respectively.  Let ''X'', ''Y'', ''Z'' be the ambient coordinates in '''R'''<sup>3</sup>.  Then φ<sub>0</sub> and φ<sub>1</sub> have inverses
:<math>
\begin{align}
\varphi_0^{-1}(X,Y,Z) &= \left(\frac{X}{Z+1}, \frac{Y}{Z+1}\right), \\[8pt]
\varphi_1^{-1}(X,Y,Z) &= \left(\frac{-X}{Z-1}, \frac{-Y}{Z-1}\right),
\end{align}
</math>
so that the coordinate transition function is [[inversion in a circle|inversion in the circle]]:

:<math>\varphi_{01}(x,y) = \varphi_0^{-1}\circ\varphi_1(x,y) = \left(\frac{x}{x^2+y^2},\frac{y}{x^2+y^2}\right)</math>
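
As a sanity check, the closed form of the transition function can be compared numerically with the composition of the chart maps. This is only a sketch: the function names mirror the symbols above, and the sample points are arbitrary choices, not part of the original text.

```python
def phi0(x, y):
    d = 1 + x * x + y * y
    return (2 * x / d, 2 * y / d, (1 - x * x - y * y) / d)

def phi1(x, y):
    d = 1 + x * x + y * y
    return (2 * x / d, 2 * y / d, (x * x + y * y - 1) / d)

def phi0_inv(X, Y, Z):
    return (X / (Z + 1), Y / (Z + 1))

def phi1_inv(X, Y, Z):
    return (-X / (Z - 1), -Y / (Z - 1))

def phi01(x, y):
    # Inversion in the unit circle; undefined at the origin.
    r2 = x * x + y * y
    return (x / r2, y / r2)

# Arbitrary sample points away from the origin.
samples = [(0.5, -1.25), (2.0, 0.3), (-0.7, 0.1)]

max_err = 0.0
for x, y in samples:
    composed = phi0_inv(*phi1(x, y))   # phi0^{-1} after phi1
    closed = phi01(x, y)               # claimed closed form
    round_trip = phi1_inv(*phi1(x, y)) # should return (x, y)
    max_err = max(
        max_err,
        abs(composed[0] - closed[0]), abs(composed[1] - closed[1]),
        abs(round_trip[0] - x), abs(round_trip[1] - y),
    )

print(max_err)  # expected to be at the level of floating-point rounding
```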

Let us now represent a [[vector field]] <math>v</math> on ''S'' (an assignment of a tangent vector to each point in ''S'') in local coordinates.  If ''P'' is a point of ''U''<sub>0</sub> ⊂ ''S'', then a vector field may be represented by the [[Pushforward (differential)|pushforward]] of a vector field '''v'''<sub>0</sub> on '''R'''<sup>2</sup> by <math>\varphi_0</math>:

{{NumBlk|:|<math>v(P) = J_{\varphi_0}\left(\varphi_0^{-1}(P)\right) \cdot {\mathbf v}_0\left(\varphi_0^{-1}(P)\right) </math>|{{EquationRef|1}}}}

where <math>J_{\varphi_0}</math> denotes the [[Jacobian matrix]] of φ<sub>0</sub> (<math>d{\varphi_0}_x({\mathbf u}) =  J_{\varphi_0}(x)\cdot {\mathbf u}</math>), and '''v'''<sub>0</sub>&nbsp;=&nbsp;'''v'''<sub>0</sub>(''x'',&nbsp;''y'') is a vector field on '''R'''<sup>2</sup> uniquely determined by ''v'' (since the pushforward of a [[local diffeomorphism]] at any point is invertible).  Furthermore, on the overlap between the coordinate charts ''U''<sub>0</sub> ∩ ''U''<sub>1</sub>, it is possible to represent the same vector field with respect to the φ<sub>1</sub> coordinates:

{{NumBlk|:|<math>v(P) = J_{\varphi_1}\left(\varphi_1^{-1}(P)\right) \cdot {\mathbf v}_1\left(\varphi_1^{-1}(P)\right). </math>|{{EquationRef|2}}}}

To relate the components '''v'''<sub>0</sub> and '''v'''<sub>1</sub>, apply the [[chain rule]] to the identity φ<sub>1</sub> = φ<sub>0</sub> ∘ φ<sub>01</sub>:

:<math>J_{\varphi_1}\left(\varphi_1^{-1}(P)\right) = J_{\varphi_0}\left(\varphi_0^{-1}(P)\right) \cdot J_{\varphi_{01}}\left(\varphi_1^{-1}(P)\right). </math>

Applying both sides of this matrix equation to the component vector '''v'''<sub>1</sub>(φ<sub>1</sub><sup>−1</sup>(''P'')) and invoking ({{EquationNote|1}}) and ({{EquationNote|2}}) yields
{{NumBlk|:|<math>{\mathbf v}_0\left(\varphi_0^{-1}(P)\right) = J_{\varphi_{01}}\left(\varphi_1^{-1}(P)\right) \cdot {\mathbf v}_1 \left(\varphi_1^{-1}(P)\right).</math>|{{EquationRef|3}}}}

We come now to the main question of defining how to transport a vector field parallelly along a curve.  Suppose that ''P''(''t'') is a curve in ''S''.  Naïvely, one may consider a vector field parallel if the coordinate components of the vector field are constant along the curve.  However, an immediate ambiguity arises: in ''which'' coordinate system should these components be constant?

For instance, suppose that ''v''(''P''(''t'')) has constant components in the ''U''<sub>1</sub> coordinate system.  That is, the functions '''v'''<sub>1</sub>(''φ''<sub>1</sub><sup>&minus;1</sup>(''P''(''t''))) are constant.  However, applying the [[product rule]] to ({{EquationNote|3}}) and using the fact that ''d'''''v'''<sub>1</sub>/''dt'' = 0 gives

:<math>\frac{d}{dt}{\mathbf v}_0\left(\varphi_0^{-1}(P(t))\right) = \left(\frac{d}{dt}J_{\varphi_{01}}\left(\varphi_1^{-1}(P(t))\right)\right) \cdot {\mathbf v}_1\left(\varphi_1^{-1}\left(P(t)\right)\right).</math>

But <math>\left(\frac{d}{dt}J_{\varphi_{01}}\left(\varphi_1^{-1}(P(t))\right)\right)</math> is always a non-singular matrix (provided that the curve ''P''(''t'') is not stationary), so a nonzero '''v'''<sub>1</sub> that is constant along the curve forces '''v'''<sub>0</sub> to vary: '''v'''<sub>1</sub> and '''v'''<sub>0</sub> ''cannot ever be'' simultaneously constant along the curve.
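
The failure can be made concrete. Differentiating the inversion φ<sub>01</sub> directly gives the Jacobian

:<math>J_{\varphi_{01}}(x,y) = \frac{1}{(x^2+y^2)^2}\begin{pmatrix} y^2-x^2 & -2xy \\ -2xy & x^2-y^2 \end{pmatrix}.</math>

The sketch below applies ({{EquationNote|3}}) along a sample curve. The curve (the unit circle in the φ<sub>1</sub> coordinates), the constant components <code>v1</code> and the function names are assumptions made for illustration only.

```python
import math

def jacobian_phi01(x, y):
    """Jacobian matrix of the inversion (x, y) -> (x, y) / (x^2 + y^2)."""
    r4 = (x * x + y * y) ** 2
    return (
        ((y * y - x * x) / r4, -2 * x * y / r4),
        (-2 * x * y / r4, (x * x - y * y) / r4),
    )

def components_in_chart0(x, y, v1):
    """Equation (3): v0 = J_phi01 * v1, evaluated at the phi1-coordinates (x, y)."""
    J = jacobian_phi01(x, y)
    return (
        J[0][0] * v1[0] + J[0][1] * v1[1],
        J[1][0] * v1[0] + J[1][1] * v1[1],
    )

def curve(t):
    # Hypothetical curve, given in the phi1 coordinates: the unit circle.
    return (math.cos(t), math.sin(t))

v1 = (1.0, 0.0)  # components held constant in the U_1 chart

v0_samples = [
    components_in_chart0(*curve(t), v1)
    for t in (0.0, math.pi / 4, math.pi / 2)
]
for v0 in v0_samples:
    print(v0)
```

On this curve the φ<sub>0</sub> components work out to (−cos 2''t'', −sin 2''t''), so they rotate as ''t'' increases even though the φ<sub>1</sub> components never change.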

===Resolution===
The problem observed above is that the usual [[directional derivative]] of [[vector calculus]] does not behave well under changes in the coordinate system when applied to the components of vector fields.  This makes it quite difficult to describe how to translate vector fields in a parallel manner, if indeed such a notion makes any sense at all.  There are two fundamentally different ways of resolving this problem.

The first approach is to examine what is required for a generalization of the directional derivative to "behave well" under coordinate transitions.  This is the tactic taken by the [[covariant derivative]] approach to connections: good behavior is equated with [[covariance and contravariance of vectors|covariance]].  Here one considers a modification of the directional derivative by a certain [[linear operator]], whose components are called the [[Christoffel symbols]], which involves no derivatives of the vector field itself.  The directional derivative ''D''<sub>'''u'''</sub>'''v''' of the components of a vector '''v''' in a coordinate system ''φ'' in the direction '''u''' is replaced by a ''covariant derivative'':

:<math>\nabla_{\mathbf u} {\mathbf v} = D_{\mathbf u} {\mathbf v} + \Gamma(\varphi)\{{\mathbf u},{\mathbf v}\}</math>

where Γ depends on the coordinate system ''φ'' and is [[Bilinear form|bilinear]] in '''u''' and '''v'''.  In particular, Γ does not involve any derivatives of '''u''' or '''v'''.  In this approach, Γ must transform in a prescribed manner when the coordinate system ''φ'' is changed to a different coordinate system.  This transformation is not [[tensor]]ial, since it involves not only the ''first derivative'' of the coordinate transition, but also its ''second derivative''.  Specifying the transformation law of Γ is not sufficient to determine Γ uniquely.  Some other normalization conditions must be imposed, usually depending on the type of geometry under consideration.  In [[Riemannian geometry]], the [[Levi-Civita connection]] requires compatibility of the [[Christoffel symbols]] with the [[Riemannian metric|metric]] (as well as a certain symmetry condition).  With these normalizations, the connection is uniquely defined.
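
For reference, the transformation law of Γ is usually written in index notation as follows. The notation is an assumption of this illustration and is not used elsewhere in this section: ''x''<sup>''i''</sup> and ''y''<sup>''i''</sup> denote the old and new coordinates, Γ and <math>\tilde\Gamma</math> the corresponding components, and repeated indices are summed.

:<math>\tilde\Gamma^k_{ij} = \frac{\partial y^k}{\partial x^r}\,\frac{\partial x^p}{\partial y^i}\,\frac{\partial x^q}{\partial y^j}\,\Gamma^r_{pq} + \frac{\partial y^k}{\partial x^m}\,\frac{\partial^2 x^m}{\partial y^i\,\partial y^j}</math>

The first term is what a tensor of this type would produce; the second term, which contains the second derivative of the coordinate transition, is the non-tensorial part.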

The second approach is to use [[Lie group]]s to attempt to capture some vestige of symmetry on the space.  This is the approach of [[Cartan connection]]s.  The example above using rotations to specify the parallel transport of vectors on the sphere is very much in this vein.