==Derivation as a direct method==

{{Main|Derivation of the conjugate gradient method}}

The conjugate gradient method can be derived from several different perspectives, including as a specialization of the conjugate direction method for optimization and as a variation of the [[Arnoldi iteration|Arnoldi]]/[[Lanczos iteration|Lanczos]] iteration for [[eigenvalue]] problems. Despite the differences in their approaches, these derivations share a common theme: proving the orthogonality of the residuals and the conjugacy of the search directions. These two properties are crucial to developing the well-known succinct formulation of the method.

We say that two non-zero vectors <math>\mathbf{u}</math> and <math>\mathbf{v}</math> are conjugate (with respect to <math>\mathbf{A}</math>) if

:<math> \mathbf{u}^\mathsf{T} \mathbf{A} \mathbf{v} = 0. </math>

Since <math>\mathbf{A}</math> is symmetric and positive-definite, the left-hand side defines an [[inner product space|inner product]]

:<math>
 \mathbf{u}^\mathsf{T} \mathbf{A} \mathbf{v} =
 \langle \mathbf{u}, \mathbf{v} \rangle_\mathbf{A} :=
 \langle \mathbf{A} \mathbf{u}, \mathbf{v}\rangle =
 \langle \mathbf{u}, \mathbf{A}^\mathsf{T} \mathbf{v}\rangle =
 \langle \mathbf{u}, \mathbf{A}\mathbf{v}\rangle.
</math>

Two vectors are conjugate if and only if they are orthogonal with respect to this inner product. Being conjugate is a symmetric relation: if <math>\mathbf{u}</math> is conjugate to <math>\mathbf{v}</math>, then <math>\mathbf{v}</math> is conjugate to <math>\mathbf{u}</math>. Suppose that

:<math>P = \{ \mathbf{p}_1, \dots, \mathbf{p}_n \}</math>

is a set of <math>n</math> mutually conjugate non-zero vectors with respect to <math>\mathbf{A}</math>, i.e. <math>\mathbf{p}_i^\mathsf{T} \mathbf{A} \mathbf{p}_j = 0</math> for all <math>i \neq j</math>.
Because vectors that are mutually orthogonal with respect to an inner product are linearly independent, <math>P</math> forms a [[basis (linear algebra)|basis]] for <math>\mathbb{R}^n</math>, and we may express the solution <math>\mathbf{x}_*</math> of <math>\mathbf{Ax} = \mathbf{b}</math> in this basis:

:<math>\mathbf{x}_* = \sum^{n}_{i=1} \alpha_i \mathbf{p}_i \Rightarrow \mathbf{A} \mathbf{x}_* = \sum^{n}_{i=1} \alpha_i \mathbf{A} \mathbf{p}_i.</math>

Left-multiplying the equation <math>\mathbf{A}\mathbf{x}_* = \mathbf{b}</math> by the vector <math>\mathbf{p}_k^\mathsf{T}</math> yields

:<math>
\mathbf{p}_k^\mathsf{T} \mathbf{b} 
= \mathbf{p}_k^\mathsf{T} \mathbf{A} \mathbf{x}_* 
= \sum^{n}_{i=1} \alpha_i \mathbf{p}_k^\mathsf{T} \mathbf{A} \mathbf{p}_i 
= \sum^{n}_{i=1} \alpha_i \left \langle \mathbf{p}_k, \mathbf{p}_i \right \rangle_{\mathbf{A}} 
= \alpha_k \left \langle \mathbf{p}_k, \mathbf{p}_k \right \rangle_{\mathbf{A}} </math>
and so
:<math>\alpha_k = \frac{\langle \mathbf{p}_k, \mathbf{b} \rangle}{\langle \mathbf{p}_k, \mathbf{p}_k \rangle_\mathbf{A}}.</math>

This gives the following method<ref name="BP" /> for solving the equation <math>\mathbf{Ax} = \mathbf{b}</math>: find a sequence of <math>n</math> conjugate directions, and then compute the coefficients <math>\alpha_k</math>.
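
The two steps above can be illustrated with a minimal sketch in plain Python. The source does not say how the conjugate directions are found. This sketch assumes they are built by Gram–Schmidt orthogonalization of the standard basis in the <math>\mathbf{A}</math>-inner product, which is only one possible choice. The helper names <code>matvec</code>, <code>dot</code>, <code>conjugate_directions</code> and <code>solve_direct</code> are hypothetical and chosen for illustration.

```python
def matvec(A, v):
    """Matrix-vector product A v for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    """Euclidean inner product <u, v>."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_directions(A):
    """Return n mutually A-conjugate vectors.

    Assumption (not from the article): the directions are obtained by
    Gram-Schmidt on the standard basis, using <u, v>_A = u^T A v.
    A must be symmetric positive-definite.
    """
    n = len(A)
    P = []
    for i in range(n):
        e = [1.0 if j == i else 0.0 for j in range(n)]
        Ae = matvec(A, e)
        p = e[:]
        for q in P:
            # Remove the A-component of e along q: <q, e>_A / <q, q>_A.
            coeff = dot(q, Ae) / dot(q, matvec(A, q))
            p = [pj - coeff * qj for pj, qj in zip(p, q)]
        P.append(p)
    return P


def solve_direct(A, b):
    """Solve A x = b as x = sum_k alpha_k p_k,
    with alpha_k = <p_k, b> / <p_k, p_k>_A."""
    x = [0.0] * len(b)
    for p in conjugate_directions(A):
        alpha = dot(p, b) / dot(p, matvec(A, p))
        x = [xj + alpha * pj for xj, pj in zip(x, p)]
    return x


A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = solve_direct(A, b)  # exact solution is (1/11, 7/11)
```

For this small symmetric positive-definite system the sketch produces the directions <math>(1, 0)</math> and <math>(-1/4, 1)</math> and recovers <math>\mathbf{x}_* = (1/11, 7/11)</math> up to rounding. Building all <math>n</math> directions this way costs as much as a direct factorization, so the sketch illustrates the expansion formula rather than a practical solver.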