
{{Short description|Orthonormalization of a set of vectors}}
[[File:Gram–Schmidt process.svg|right|frame|The first two steps of the Gram–Schmidt process]]
In [[mathematics]], particularly [[linear algebra]] and [[numerical analysis]], the '''Gram–Schmidt process''' or '''Gram–Schmidt algorithm''' is a way of turning a set of two or more linearly independent vectors into a set of mutually perpendicular vectors that spans the same subspace.

More precisely, it is a method of constructing an [[orthonormal basis]] from a set of [[vector (geometry)|vectors]] in an [[inner product space]], most commonly the [[Euclidean space]] <math>\mathbb{R}^n</math> equipped with the [[standard inner product]]. The Gram–Schmidt process takes a [[finite set|finite]], [[linearly independent]] set of vectors <math>S = \{ \mathbf{v}_1, \ldots , \mathbf{v}_k \}</math> for {{math|''k'' ≤ ''n''}} and generates an [[orthogonal set]] <math>S' = \{ \mathbf{u}_1 , \ldots , \mathbf{u}_k \}</math> that spans the same <math>k</math>-dimensional subspace of <math>\mathbb{R}^n</math> as <math>S</math>; normalizing the vectors of <math>S'</math> then gives an orthonormal set.

The method is named after [[Jørgen Pedersen Gram]] and [[Erhard Schmidt]], but [[Pierre-Simon Laplace]] had been familiar with it before Gram and Schmidt.<ref>{{cite book |last1=Cheney |first1=Ward |last2=Kincaid |first2=David |title=Linear Algebra: Theory and Applications |location=Sudbury, Ma |publisher=Jones and Bartlett |year=2009 |url={{Google books |plainurl=yes |id=Gg3Uj1GkHK8C |page=544 }} |isbn=978-0-7637-5020-6 |pages=544, 558 }}</ref> In the theory of [[Lie group decompositions]], it is generalized by the [[Iwasawa decomposition]].

Applying the Gram–Schmidt process to the column vectors of a full column [[rank (linear algebra)|rank]] [[matrix (mathematics)|matrix]] yields its [[QR decomposition]], in which the matrix is factored into an [[orthogonal matrix|orthogonal]] matrix and a [[triangular matrix]].

== The Gram–Schmidt process ==
[[File:Gram-Schmidt orthonormalization process.gif|thumb|400px|The modified Gram-Schmidt process being executed on three linearly independent, non-orthogonal vectors of a basis for <math>\mathbb{R}^3</math>. Click on image for details. Modification is explained in the Numerical Stability section of this article.]]

The [[vector projection]] of a vector <math>\mathbf v</math> on a nonzero vector <math>\mathbf u</math> is defined as<ref group="note">In the complex case, this assumes that the inner product is linear in the first argument and conjugate-linear in the second. In physics a more common convention is linearity in the second argument, in which case we define <math display="block">\operatorname{proj}_{\mathbf u} (\mathbf{v}) = \frac{\langle \mathbf{u}, \mathbf{v}\rangle}{\langle \mathbf{u}, \mathbf{u}\rangle} \,\mathbf{u}. </math></ref>
<math display="block">\operatorname{proj}_{\mathbf u} (\mathbf{v}) = \frac{\langle \mathbf{v}, \mathbf{u}\rangle}{\langle \mathbf{u}, \mathbf{u}\rangle} \,\mathbf{u} , </math>
where <math>\langle \mathbf{v}, \mathbf{u}\rangle</math> denotes the [[inner product]] of the vectors <math>\mathbf u</math> and <math>\mathbf v</math>. This means that <math>\operatorname{proj}_{\mathbf u} (\mathbf{v})</math> is the [[orthogonal projection]] of <math>\mathbf v</math> onto the line spanned by <math>\mathbf u</math>. If <math>\mathbf u</math> is the zero vector, then <math>\operatorname{proj}_{\mathbf u} (\mathbf{v})</math> is defined as the zero vector.

Given <math>k</math> linearly independent vectors <math>\mathbf{v}_1, \ldots, \mathbf{v}_k</math>, the Gram–Schmidt process defines the vectors <math>\mathbf{u}_1, \ldots, \mathbf{u}_k</math> as follows:
<math display="block">\begin{align}
\mathbf{u}_1 & = \mathbf{v}_1, & \!\mathbf{e}_1 & = \frac{\mathbf{u}_1}{\|\mathbf{u}_1\|} \\
\mathbf{u}_2 & = \mathbf{v}_2-\operatorname{proj}_{\mathbf{u}_1} (\mathbf{v}_2),
& \!\mathbf{e}_2 & = \frac{\mathbf{u}_2}{\|\mathbf{u}_2\|} \\
\mathbf{u}_3 & = \mathbf{v}_3-\operatorname{proj}_{\mathbf{u}_1} (\mathbf{v}_3) - \operatorname{proj}_{\mathbf{u}_2} (\mathbf{v}_3),
& \!\mathbf{e}_3 & = \frac{\mathbf{u}_3 }{\|\mathbf{u}_3\|} \\
\mathbf{u}_4 & = \mathbf{v}_4-\operatorname{proj}_{\mathbf{u}_1} (\mathbf{v}_4)-\operatorname{proj}_{\mathbf{u}_2} (\mathbf{v}_4)-\operatorname{proj}_{\mathbf{u}_3} (\mathbf{v}_4),
& \!\mathbf{e}_4 & = {\mathbf{u}_4 \over \|\mathbf{u}_4\|} \\
& {}\ \  \vdots & & {}\ \  \vdots \\
\mathbf{u}_k & = \mathbf{v}_k - \sum_{j=1}^{k-1}\operatorname{proj}_{\mathbf{u}_j} (\mathbf{v}_k),
& \!\mathbf{e}_k & = \frac{\mathbf{u}_k}{\|\mathbf{u}_k\|}.
\end{align}</math>

The sequence <math>\mathbf{u}_1, \ldots, \mathbf{u}_k</math> is the required system of orthogonal vectors, and the normalized vectors <math>\mathbf{e}_1, \ldots, \mathbf{e}_k</math> form an [[orthonormal set]]. The calculation of the sequence <math>\mathbf{u}_1, \ldots, \mathbf{u}_k</math> is known as ''Gram–Schmidt [[orthogonalization]]'', and the calculation of the sequence <math>\mathbf{e}_1, \ldots, \mathbf{e}_k</math> is known as ''Gram–Schmidt [[orthonormalization]]''.

To check that these formulas yield an orthogonal sequence, first compute <math>\langle \mathbf{u}_1, \mathbf{u}_2 \rangle</math> by substituting the above formula for <math>\mathbf{u}_2</math>: we get zero. Then use this to compute <math>\langle \mathbf{u}_1, \mathbf{u}_3 \rangle</math> again by substituting the formula for <math>\mathbf{u}_3</math>: we get zero. For arbitrary <math>k</math> the proof is accomplished by [[mathematical induction]].
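
For illustration, the first of these computations can be written out as follows, using only the definition of <math>\operatorname{proj}</math> given above and linearity of the inner product in its first argument:
<math display="block">\langle \mathbf{u}_2, \mathbf{u}_1 \rangle = \left\langle \mathbf{v}_2 - \frac{\langle \mathbf{v}_2, \mathbf{u}_1\rangle}{\langle \mathbf{u}_1, \mathbf{u}_1\rangle}\,\mathbf{u}_1,\ \mathbf{u}_1 \right\rangle = \langle \mathbf{v}_2, \mathbf{u}_1\rangle - \frac{\langle \mathbf{v}_2, \mathbf{u}_1\rangle}{\langle \mathbf{u}_1, \mathbf{u}_1\rangle}\,\langle \mathbf{u}_1, \mathbf{u}_1\rangle = 0.</math>
Since <math>\langle \mathbf{u}_1, \mathbf{u}_2 \rangle</math> is the complex conjugate of <math>\langle \mathbf{u}_2, \mathbf{u}_1 \rangle</math>, it vanishes as well.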

Geometrically, this method proceeds as follows: to compute <math>\mathbf{u}_i</math>, it projects <math>\mathbf{v}_i</math> orthogonally onto the subspace <math>U</math> generated by <math>\mathbf{u}_1, \ldots, \mathbf{u}_{i-1}</math>, which is the same as the subspace generated by <math>\mathbf{v}_1, \ldots, \mathbf{v}_{i-1}</math>. The vector <math>\mathbf{u}_i</math> is then defined to be the difference between <math>\mathbf{v}_i</math> and this projection, guaranteed to be orthogonal to all of the vectors in the subspace <math>U</math>.

The Gram–Schmidt process also applies to a linearly independent [[countably infinite]] sequence {{math|{'''v'''<sub>''i''</sub>}<sub>''i''</sub>}}. The result is an orthogonal (or orthonormal) sequence {{math|{'''u'''<sub>''i''</sub>}<sub>''i''</sub>}} such that for every natural number {{mvar|n}}, the algebraic span of <math>\mathbf{v}_1, \ldots, \mathbf{v}_{n}</math> is the same as that of <math>\mathbf{u}_1, \ldots, \mathbf{u}_{n}</math>.

If the Gram–Schmidt process is applied to a linearly dependent sequence, it outputs the {{math|'''0'''}} vector on the <math>i</math>th step, assuming that <math>\mathbf{v}_i</math> is a linear combination of <math>\mathbf{v}_1, \ldots, \mathbf{v}_{i-1}</math>. If an orthonormal basis is to be produced, then the algorithm should test for zero vectors in the output and discard them because no multiple of a zero vector can have a length of 1. The number of vectors output by the algorithm will then be the dimension of the space spanned by the original inputs.
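
The zero-vector test described above can be sketched in MATLAB/Octave as follows. This is only a minimal illustration under stated assumptions: the vectors are real, the example matrix <code>V</code> and the threshold <code>tol</code> are hypothetical choices (in floating-point arithmetic the output is rarely exactly zero, so "zero" has to mean "small relative to the input"), and the projections are subtracted from the running remainder.

<syntaxhighlight lang="matlab">
% Columns of V are the inputs; the third column is the sum of the first two.
V = [1 0 1 0;
     1 1 2 0;
     0 1 1 1];
tol = 1e-10;                 % assumed threshold for "numerically zero"
Q = zeros(size(V, 1), 0);    % orthonormal vectors kept so far
for i = 1:size(V, 2)
    u = V(:, i);
    for j = 1:size(Q, 2)
        u = u - (Q(:, j)' * u) * Q(:, j);   % remove the component along Q(:,j)
    end
    if norm(u) > tol * norm(V(:, i))
        Q = [Q, u / norm(u)];               % nonzero remainder: normalize and keep
    end                                     % otherwise discard it
end
r = size(Q, 2);              % number of vectors kept
</syntaxhighlight>

For this input the dependent third column is discarded, so <code>r</code> equals the dimension of the span of the columns of <code>V</code>.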

A variant of the Gram–Schmidt process using [[transfinite recursion]] applied to a (possibly uncountably) infinite sequence of vectors <math>(v_\alpha)_{\alpha<\lambda}</math> yields a set of orthonormal vectors <math>(u_\alpha)_{\alpha<\kappa}</math> with <math>\kappa\leq\lambda</math> such that for any <math>\alpha\leq\lambda</math>, the [[Complete space#Completion|completion]] of the span of <math>\{ u_\beta : \beta<\min(\alpha,\kappa) \}</math> is the same as that of {{nowrap|<math>\{ v_\beta : \beta < \alpha\}</math>.}} In particular, when applied to an (algebraic) basis of a [[Hilbert space]] (or, more generally, a basis of any dense subspace), it yields a (functional-analytic) orthonormal basis. Note that in the general case the strict inequality <math>\kappa < \lambda</math> often holds, even if the starting set was linearly independent, and the span of <math>(u_\alpha)_{\alpha<\kappa}</math> need not be a subspace of the span of <math>(v_\alpha)_{\alpha<\lambda}</math> (rather, it is a subspace of its completion).

== Example ==

===Euclidean space===
Consider the following set of vectors in <math>\mathbb{R}^2</math> (with the conventional [[Inner product space#Euclidean vector space|inner product]])
<math display="block">S = \left\{\mathbf{v}_1=\begin{bmatrix} 3 \\ 1\end{bmatrix}, \mathbf{v}_2=\begin{bmatrix}2 \\2\end{bmatrix}\right\}.</math>

Now perform the Gram–Schmidt process to obtain an orthogonal set of vectors:
<math display="block">\mathbf{u}_1=\mathbf{v}_1=\begin{bmatrix}3\\1\end{bmatrix}</math>
<math display="block"> \mathbf{u}_2 = \mathbf{v}_2 - \operatorname{proj}_{\mathbf{u}_1} (\mathbf{v}_2)
= \begin{bmatrix}2\\2\end{bmatrix} - \operatorname{proj}_{\left[\begin{smallmatrix}3 \\ 1\end{smallmatrix}\right]} {\begin{bmatrix}2\\2\end{bmatrix}}
= \begin{bmatrix}2\\2\end{bmatrix} - \frac{8}{10} \begin{bmatrix} 3 \\1 \end{bmatrix}
= \begin{bmatrix} -2/5 \\6/5 \end{bmatrix}. </math>

We check that the vectors <math>\mathbf{u}_1</math> and <math>\mathbf{u}_2</math> are indeed orthogonal:
<math display="block">\langle\mathbf{u}_1,\mathbf{u}_2\rangle = \left\langle \begin{bmatrix}3\\1\end{bmatrix}, \begin{bmatrix} -2/5 \\ 6/5 \end{bmatrix} \right\rangle = -\frac{6}{5} + \frac{6}{5} = 0,</math>
noting that if the [[dot product]] of two vectors is 0 then they are orthogonal.

Since both vectors are nonzero, we can normalize them by dividing each by its norm, as shown above:
<math display="block">\mathbf{e}_1 = \frac{1}{\sqrt {10}}\begin{bmatrix}3\\1\end{bmatrix}</math>
<math display="block">\mathbf{e}_2 = \frac{1}{\sqrt{40 \over 25}} \begin{bmatrix}-2/5\\6/5\end{bmatrix}
= \frac{1}{\sqrt{10}} \begin{bmatrix}-1\\3\end{bmatrix}. </math>
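
The arithmetic of this example can be checked numerically. The following MATLAB/Octave lines are a minimal sketch that simply transcribes the formulas above for real column vectors; the variable names are chosen to mirror the notation of this section.

<syntaxhighlight lang="matlab">
v1 = [3; 1];
v2 = [2; 2];
u1 = v1;
u2 = v2 - (v2' * u1) / (u1' * u1) * u1;   % subtract proj_{u1}(v2) = (8/10) * u1
e1 = u1 / norm(u1);                       % norm(u1) = sqrt(10)
e2 = u2 / norm(u2);                       % norm(u2) = sqrt(40/25)
d  = u1' * u2;                            % inner product, zero up to rounding
</syntaxhighlight>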

== Properties ==

Denote by <math> \operatorname{GS}(\mathbf{v}_1, \dots, \mathbf{v}_k) </math> the result of applying the Gram–Schmidt process to a collection of vectors <math> \mathbf{v}_1, \dots, \mathbf{v}_k </math>. This yields a map <math> \operatorname{GS} \colon (\R^n)^{k} \to (\R^n)^{k} </math>.

It has the following properties:

* It is continuous.
* It is [[orientation (vector space)|orientation]] preserving in the sense that <math> \operatorname{or}(\mathbf{v}_1,\dots,\mathbf{v}_k) = \operatorname{or}(\operatorname{GS}(\mathbf{v}_1,\dots,\mathbf{v}_k)) </math>.
* It commutes with orthogonal maps:

Let <math> g \colon \R^n \to \R^n </math> be orthogonal (with respect to the given inner product). Then we have
<math display="block"> \operatorname{GS}(g(\mathbf{v}_1),\dots,g(\mathbf{v}_k)) = \left( g(\operatorname{GS}(\mathbf{v}_1,\dots,\mathbf{v}_k)_1),\dots,g(\operatorname{GS}(\mathbf{v}_1,\dots,\mathbf{v}_k)_k) \right). </math>
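
The commutation property can be illustrated numerically. The MATLAB/Octave sketch below is restricted to <math>k = 2</math> vectors in <math>\R^3</math> for brevity; the helper <code>gs2</code>, the rotation angle and the test vectors are all illustrative choices, not part of any standard library.

<syntaxhighlight lang="matlab">
% Gram-Schmidt orthogonalization of two real columns (k = 2 only).
gs2 = @(V) [V(:,1), V(:,2) - (V(:,1)' * V(:,2)) / (V(:,1)' * V(:,1)) * V(:,1)];

t = 0.7;                      % arbitrary rotation angle
G = [cos(t) -sin(t) 0;
     sin(t)  cos(t) 0;
     0       0      1];       % rotation about the z-axis, so G' * G = I
V = [1 2;
     0 1;
     1 1];                    % columns are v1 and v2

lhs = gs2(G * V);             % orthogonalize the rotated vectors
rhs = G * gs2(V);             % rotate the orthogonalized vectors
gap = norm(lhs - rhs);        % zero up to rounding
</syntaxhighlight>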

Further, a parametrized version of the Gram–Schmidt process yields a (strong) [[Retraction (topology)#Deformation retract and strong deformation retract|deformation retraction]] of the general linear group <math> \mathrm{GL}(\R^n)</math> onto the orthogonal group <math> O(\R^n)</math>.

== Numerical stability ==
When this process is implemented on a computer, the vectors <math>\mathbf{u}_k</math> are often not quite orthogonal, due to [[round-off error|rounding errors]]. For the Gram–Schmidt process as described above (sometimes referred to as "classical Gram–Schmidt") this loss of orthogonality is particularly bad; therefore, it is said that the (classical) Gram–Schmidt process is [[numerical stability|numerically unstable]].

The Gram–Schmidt process can be stabilized by a small modification; this version is sometimes referred to as '''modified Gram–Schmidt''' or MGS. It gives the same result as the original formula in exact arithmetic but introduces smaller errors in finite-precision arithmetic.

Instead of computing the vector {{math|'''u'''<sub>''k''</sub>}} as
<math display="block"> \mathbf{u}_k = \mathbf{v}_k - \operatorname{proj}_{\mathbf{u}_1} (\mathbf{v}_k) - \operatorname{proj}_{\mathbf{u}_2} (\mathbf{v}_k) - \cdots - \operatorname{proj}_{\mathbf{u}_{k-1}} (\mathbf{v}_k), </math>
it is computed as
<math display="block"> \begin{align}
\mathbf{u}_k^{(1)} &= \mathbf{v}_k - \operatorname{proj}_{\mathbf{u}_1} (\mathbf{v}_k), \\
\mathbf{u}_k^{(2)} &= \mathbf{u}_k^{(1)} - \operatorname{proj}_{\mathbf{u}_2} \left(\mathbf{u}_k^{(1)}\right), \\
& \;\; \vdots \\
\mathbf{u}_k^{(k-2)} &= \mathbf{u}_k^{(k-3)} - \operatorname{proj}_{\mathbf{u}_{k-2}} \left(\mathbf{u}_k^{(k-3)}\right), \\
\mathbf{u}_k^{(k-1)} &= \mathbf{u}_k^{(k-2)} - \operatorname{proj}_{\mathbf{u}_{k-1}} \left(\mathbf{u}_k^{(k-2)}\right), \\
\mathbf{e}_k &=  \frac{\mathbf{u}_k^{(k-1)}}{\left\|\mathbf{u}_k^{(k-1)}\right\|}.
\end{align} </math>

This method is used in the animation above: the intermediate vector <math>\mathbf{v}'_3</math> is used when orthogonalizing the blue vector <math>\mathbf{v}_3</math>.
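
The difference between the two variants shows up on nearly dependent inputs. The following MATLAB/Octave sketch runs both on the same matrix and measures how far each result is from being orthonormal. The test matrix (three almost parallel columns with a small parameter <code>e</code>) and all variable names are illustrative assumptions, and real vectors are assumed.

<syntaxhighlight lang="matlab">
e = 1e-8;                      % small enough that 1 + e^2 rounds to 1 in double precision
V = [1 1 1;
     e 0 0;
     0 e 0;
     0 0 e];                   % columns are nearly parallel
k = size(V, 2);
Qc = zeros(size(V));           % result of classical Gram-Schmidt
Qm = zeros(size(V));           % result of modified Gram-Schmidt
for i = 1:k
    uc = V(:, i);
    um = V(:, i);
    for j = 1:i-1
        % classical: every coefficient is computed from the original vector
        uc = uc - (Qc(:, j)' * V(:, i)) * Qc(:, j);
        % modified: every coefficient is computed from the running remainder
        um = um - (Qm(:, j)' * um) * Qm(:, j);
    end
    Qc(:, i) = uc / norm(uc);
    Qm(:, i) = um / norm(um);
end
errC = norm(Qc' * Qc - eye(k));   % loss of orthogonality, classical
errM = norm(Qm' * Qm - eye(k));   % loss of orthogonality, modified
</syntaxhighlight>

On this input the classical variant loses orthogonality almost completely (<code>errC</code> is of order one), whereas for the modified variant <code>errM</code> stays of the order of <code>e</code>.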

Here is another description of the modified algorithm. Given the vectors <math>\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n</math>, in our first step we produce vectors <math>\mathbf{v}_1, \mathbf{v}_2^{(1)}, \dots, \mathbf{v}_n^{(1)}</math> by removing components along the direction of <math>\mathbf{v}_1</math>. In formulas, <math>\mathbf{v}_k^{(1)} := \mathbf{v}_k - \frac{\langle \mathbf{v}_k, \mathbf{v}_1 \rangle}{\langle \mathbf{v}_1, \mathbf{v}_1 \rangle} \mathbf{v}_1</math>. After this step we already have two of our desired orthogonal vectors <math>\mathbf{u}_1, \dots, \mathbf{u}_n</math>, namely <math>\mathbf{u}_1 = \mathbf{v}_1, \mathbf{u}_2 = \mathbf{v}_2^{(1)}</math>, and we have also made <math>\mathbf{v}_3^{(1)}, \dots, \mathbf{v}_n^{(1)}</math> orthogonal to <math>\mathbf{u}_1</math>. Next, we orthogonalize those remaining vectors against <math>\mathbf{u}_2 = \mathbf{v}_2^{(1)}</math>. This means we compute <math>\mathbf{v}_3^{(2)}, \mathbf{v}_4^{(2)}, \dots, \mathbf{v}_n^{(2)}</math> by the subtraction <math>\mathbf{v}_k^{(2)} := \mathbf{v}_k^{(1)} - \frac{\langle \mathbf{v}_k^{(1)}, \mathbf{u}_2 \rangle}{\langle \mathbf{u}_2, \mathbf{u}_2 \rangle} \mathbf{u}_2</math>. Now we have stored the vectors <math>\mathbf{v}_1, \mathbf{v}_2^{(1)}, \mathbf{v}_3^{(2)}, \mathbf{v}_4^{(2)}, \dots, \mathbf{v}_n^{(2)}</math>, where the first three vectors are already <math>\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3</math> and the remaining vectors are already orthogonal to <math>\mathbf{u}_1, \mathbf{u}_2</math>. The next step then orthogonalizes <math>\mathbf{v}_4^{(2)}, \dots, \mathbf{v}_n^{(2)}</math> against <math>\mathbf{u}_3 = \mathbf{v}_3^{(2)}</math>. Proceeding in this manner we find the full set of orthogonal vectors <math>\mathbf{u}_1, \dots, \mathbf{u}_n</math>. If orthonormal vectors are desired, then we normalize as we go, so that the denominators in the subtraction formulas become one.
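
This second description translates almost directly into code. The following MATLAB/Octave lines are a minimal sketch of it with normalization "as we go"; the input matrix is an arbitrary example with linearly independent real columns, and <code>U</code> is a working copy that is overwritten in place.

<syntaxhighlight lang="matlab">
V = [1 1 0;
     1 0 1;
     0 1 1];                   % columns are the input vectors
U = V;                         % working copy, overwritten step by step
n = size(U, 2);
for j = 1:n
    U(:, j) = U(:, j) / norm(U(:, j));   % column j is final: normalize it
    for k = j+1:n
        % remove from every remaining column its component along U(:,j)
        U(:, k) = U(:, k) - (U(:, j)' * U(:, k)) * U(:, j);
    end
end
R = U' * V;                    % coefficients of the inputs in the new basis
</syntaxhighlight>

After the loop the columns of <code>U</code> are orthonormal, and <code>R</code> is upper triangular up to rounding, which reflects the fact that each input vector lies in the span of the output vectors with equal or smaller index.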

== Algorithm ==
The following [[MATLAB]] algorithm implements classical Gram–Schmidt orthonormalization. The vectors {{math|'''v'''<sub>1</sub>, ..., '''v'''<sub>''k''</sub>}} (columns of matrix <code>V</code>, so that <code>V(:,j)</code> is the <math>j</math>th vector) are replaced by orthonormal vectors (columns of <code>U</code>) which span the same subspace.

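<syntaxhighlight lang="matlab" line="1">
function U = gramschmidt(V)
    % Classical Gram-Schmidt: orthonormalize the columns of V.
    [n, k] = size(V);
    U = zeros(n,k);
    U(:,1) = V(:,1) / norm(V(:,1));
    for i = 2:k
        U(:,i) = V(:,i);
        for j = 1:i-1
            % Project the original vector V(:,i), not the running
            % remainder U(:,i), onto U(:,j); using U(:,i) here instead
            % would give the modified variant.
            U(:,i) = U(:,i) - (U(:,j)'*V(:,i)) * U(:,j);
        end
        U(:,i) = U(:,i) / norm(U(:,i));
    end
end
</syntaxhighlight>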

The cost of this algorithm is asymptotically {{math|O(''nk''<sup>2</sup>)}} floating point operations, where {{mvar|n}} is the dimensionality of the vectors.{{sfn|Golub|Van Loan|1996|loc=§5.2.8}}

== Via Gaussian elimination ==

If the vectors {{math|{'''v'''<sub>1</sub>, ..., '''v'''<sub>''k''</sub>}<nowiki/>}} are written as the rows of a matrix <math>A</math>, then applying [[Gaussian elimination]] to the augmented matrix <math>\left[A A^\mathsf{T} | A \right]</math> will produce the orthogonalized vectors in place of <math>A</math>. However, the matrix <math>A A^\mathsf{T}</math> must be brought to [[row echelon form]] using only the [[Elementary matrix|row operation]] of adding a scalar multiple of one row to another.<ref>{{cite journal|last1=Pursell|first1=Lyle|last2=Trimble|first2=S. Y.|title=Gram-Schmidt Orthogonalization by Gauss Elimination |journal=The American Mathematical Monthly|date=1 January 1991|volume=98|issue=6|pages=544–549| doi=10.2307/2324877 |jstor=2324877}}</ref> For example, taking <math>\mathbf{v}_1 = \begin{bmatrix} 3 & 1\end{bmatrix}, \mathbf{v}_2=\begin{bmatrix}2 & 2\end{bmatrix}</math> as above, we have
<math display="block">\left[A A^\mathsf{T} | A \right] = \left[\begin{array}{rr|rr} 10 & 8 & 3 & 1 \\ 8 & 8 & 2 & 2\end{array}\right]</math>

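Reducing this to [[row echelon form]], and then dividing each row by its pivot (which rescales the orthogonalized vectors without changing their directions), produces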
<math display="block">\left[\begin{array}{rr|rr} 1 & .8 & .3 & .1 \\ 0 & 1 & -.25 & .75\end{array}\right]</math>

The normalized vectors are then
<math display="block">\mathbf{e}_1 = \frac{1}{\sqrt {.3^2+.1^2}}\begin{bmatrix}.3 & .1\end{bmatrix} = \frac{1}{\sqrt{10}} \begin{bmatrix}3 & 1\end{bmatrix}</math>
<math display="block">\mathbf{e}_2 = \frac{1}{\sqrt{.25^2+.75^2}} \begin{bmatrix}-.25 & .75\end{bmatrix} = \frac{1}{\sqrt{10}} \begin{bmatrix}-1 & 3\end{bmatrix}, </math>
as in the example above.
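
The elimination can be sketched in MATLAB/Octave as follows. This is a minimal illustration for real vectors that assumes the rows of <code>A</code> are linearly independent, so that no pivot is zero and no row exchanges are needed; the variable names are illustrative. Unlike the displayed matrix above, the rows are not divided by their pivots, so the right-hand block contains the vectors <math>\mathbf{u}_1, \mathbf{u}_2</math> themselves.

<syntaxhighlight lang="matlab">
A = [3 1;
     2 2];                     % rows are v1 and v2
M = [A * A', A];               % augmented matrix [A A^T | A]
k = size(A, 1);
for i = 1:k-1
    for r = i+1:k
        % the only operation used: add a multiple of row i to row r
        M(r, :) = M(r, :) - (M(r, i) / M(i, i)) * M(i, :);
    end
end
Uorth = M(:, k+1:end);                            % rows are u1 and u2
E = diag(1 ./ sqrt(sum(Uorth.^2, 2))) * Uorth;    % normalize each row
</syntaxhighlight>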

== Determinant formula ==
The result of the Gram–Schmidt process may be expressed in a non-recursive formula using [[determinant]]s.

<math display="block"> \mathbf{e}_j = \frac{1}{\sqrt{D_{j-1} D_j}} \begin{vmatrix}
\langle \mathbf{v}_1, \mathbf{v}_1 \rangle     & \langle \mathbf{v}_2, \mathbf{v}_1 \rangle     & \cdots & \langle \mathbf{v}_j, \mathbf{v}_1 \rangle \\
\langle \mathbf{v}_1, \mathbf{v}_2 \rangle     & \langle \mathbf{v}_2, \mathbf{v}_2 \rangle     & \cdots & \langle \mathbf{v}_j, \mathbf{v}_2 \rangle \\
\vdots                                         & \vdots                                         & \ddots & \vdots \\
\langle \mathbf{v}_1, \mathbf{v}_{j-1} \rangle & \langle \mathbf{v}_2, \mathbf{v}_{j-1} \rangle & \cdots & \langle \mathbf{v}_j, \mathbf{v}_{j-1} \rangle \\
\mathbf{v}_1                                   & \mathbf{v}_2                                   & \cdots & \mathbf{v}_j
\end{vmatrix} </math>

<math display="block"> \mathbf{u}_j = \frac{1}{D_{j-1} } \begin{vmatrix}
\langle \mathbf{v}_1, \mathbf{v}_1 \rangle     & \langle \mathbf{v}_2, \mathbf{v}_1 \rangle     & \cdots & \langle \mathbf{v}_j, \mathbf{v}_1 \rangle \\
\langle \mathbf{v}_1, \mathbf{v}_2 \rangle     & \langle \mathbf{v}_2, \mathbf{v}_2 \rangle     & \cdots & \langle \mathbf{v}_j, \mathbf{v}_2 \rangle \\
\vdots                                         & \vdots                                         & \ddots & \vdots \\
\langle \mathbf{v}_1, \mathbf{v}_{j-1} \rangle & \langle \mathbf{v}_2, \mathbf{v}_{j-1} \rangle & \cdots & \langle \mathbf{v}_j, \mathbf{v}_{j-1} \rangle \\
\mathbf{v}_1                                   & \mathbf{v}_2                                   & \cdots & \mathbf{v}_j
\end{vmatrix} </math>

where <math>D_0 = 1</math> and, for <math>j \ge 1</math>, <math>D_j</math> is the [[Gram determinant]]

<math display="block"> D_j = \begin{vmatrix}
\langle \mathbf{v}_1, \mathbf{v}_1 \rangle & \langle \mathbf{v}_2, \mathbf{v}_1 \rangle & \cdots & \langle \mathbf{v}_j, \mathbf{v}_1 \rangle \\
\langle \mathbf{v}_1, \mathbf{v}_2 \rangle & \langle \mathbf{v}_2, \mathbf{v}_2 \rangle & \cdots & \langle \mathbf{v}_j, \mathbf{v}_2 \rangle \\
\vdots & \vdots & \ddots & \vdots \\
\langle \mathbf{v}_1, \mathbf{v}_j \rangle & \langle \mathbf{v}_2, \mathbf{v}_j \rangle & \cdots & \langle \mathbf{v}_j, \mathbf{v}_j \rangle
\end{vmatrix}.  </math>

Note that the expression for <math>\mathbf{u}_j</math> is a "formal" determinant, i.e. the matrix contains both scalars and vectors; the meaning of this expression is defined to be the result of a [[Laplace expansion|cofactor expansion]] along the row of vectors.

The determinant formula for the Gram–Schmidt process is computationally (exponentially) slower than the recursive algorithms described above; it is mainly of theoretical interest.
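
As a small worked instance, for <math>j = 2</math> the cofactor expansion along the row of vectors gives
<math display="block">\mathbf{u}_2 = \frac{1}{D_1} \begin{vmatrix} \langle \mathbf{v}_1, \mathbf{v}_1 \rangle & \langle \mathbf{v}_2, \mathbf{v}_1 \rangle \\ \mathbf{v}_1 & \mathbf{v}_2 \end{vmatrix} = \frac{\langle \mathbf{v}_1, \mathbf{v}_1 \rangle\, \mathbf{v}_2 - \langle \mathbf{v}_2, \mathbf{v}_1 \rangle\, \mathbf{v}_1}{\langle \mathbf{v}_1, \mathbf{v}_1 \rangle} = \mathbf{v}_2 - \operatorname{proj}_{\mathbf{v}_1} (\mathbf{v}_2),</math>
which agrees with the recursive definition because <math>\mathbf{u}_1 = \mathbf{v}_1</math>. With the vectors of the Euclidean example, <math>\mathbf{v}_1 = (3, 1)</math> and <math>\mathbf{v}_2 = (2, 2)</math>, this reads <math>\mathbf{u}_2 = \tfrac{1}{10}\bigl(10\,(2,2) - 8\,(3,1)\bigr) = (-2/5,\ 6/5)</math>.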

== Expressed using geometric algebra ==
Expressed using notation used in [[geometric algebra]], the unnormalized results of the Gram–Schmidt process can be expressed as
<math display="block">\mathbf{u}_k = \mathbf{v}_k - \sum_{j=1}^{k-1} (\mathbf{v}_k \cdot \mathbf{u}_j)\mathbf{u}_j^{-1}\ ,</math>
which is equivalent to the expression using the <math>\operatorname{proj}</math> operator defined above. The results can equivalently be expressed as<ref>{{cite book |last1=Doran |first1=Chris |last2=Lasenby |first2=Anthony |title=Geometric Algebra for Physicists |publisher=Cambridge University Press |year=2007 |isbn=978-0-521-71595-9 |page=124 }}</ref>
<math display="block">\mathbf{u}_k = \mathbf{v}_{k}\wedge\mathbf{v}_{k-1}\wedge\cdots\wedge\mathbf{v}_{1}(\mathbf{v}_{k-1}\wedge\cdots\wedge\mathbf{v}_{1})^{-1},</math>
which is closely related to the expression using determinants above.

== Alternatives ==
Other [[orthogonalization]] algorithms use [[Householder transformation]]s or [[Givens rotation]]s. The algorithms using Householder transformations are more stable than the stabilized Gram–Schmidt process. On the other hand, the Gram–Schmidt process produces the <math>j</math>th orthogonalized vector after the <math>j</math>th iteration, while orthogonalization using [[Householder reflection]]s produces all the vectors only at the end. This makes only the Gram–Schmidt process applicable for [[iterative method]]s like the [[Arnoldi iteration]].

Yet another alternative is motivated by the use of [[Cholesky decomposition]] for [[Ordinary least squares|inverting the matrix of the normal equations in linear least squares]]. Let <math>V</math> be a [[full rank|full column rank]] matrix, whose columns need to be orthogonalized. The matrix <math>V^* V </math> is [[Hermitian matrix|Hermitian]] and [[Positive definite matrix|positive definite]], so it can be written as <math> V^* V = L L^*, </math> using the [[Cholesky decomposition]]. The lower triangular matrix <math>L </math> with strictly positive diagonal entries is [[invertible]]. Then the columns of the matrix <math>U = V\left(L^{-1}\right)^*</math> are [[orthonormal]] and [[linear span|span]] the same subspace as the columns of the original matrix <math>V</math>. The explicit use of the product <math>V^* V </math> makes the algorithm unstable, especially if the product's [[condition number]] is large. Nevertheless, this algorithm is used in practice and implemented in some software packages because of its high efficiency and simplicity.
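
The Cholesky-based construction can be sketched in a few lines of MATLAB/Octave using the built-in <code>chol</code> function. The example matrix is an arbitrary well-conditioned choice with full column rank, so the instability mentioned above does not show up here; for ill-conditioned inputs the result would be expected to be less accurate.

<syntaxhighlight lang="matlab">
V = [1 1 0;
     1 0 1;
     0 1 1;
     1 1 1];                   % full column rank; columns are to be orthogonalized
G = V' * V;                    % the product V^* V (Hermitian positive definite)
L = chol(G, 'lower');          % lower triangular factor with G = L * L'
U = V / L';                    % U = V * (L^{-1})^*, computed by a triangular solve
R = L';                        % upper triangular factor, so that V = U * R
</syntaxhighlight>

Writing <code>V / L'</code> instead of forming the inverse of <code>L</code> explicitly is a design choice of this sketch: the right division solves the triangular system directly.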

In [[quantum mechanics]] there are several orthogonalization schemes with characteristics better suited for certain applications than the original Gram–Schmidt process. Nevertheless, Gram–Schmidt remains a popular and effective algorithm for even the largest electronic structure calculations.<ref>{{cite book|last1=Hasegawa|first1=Yukihiro|title=Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis |chapter=First-principles calculations of electron states of a silicon nanowire with 100,000 atoms on the K computer | display-authors=etal|date=2011|pages=1:1–1:11| doi=10.1145/2063384.2063386 | isbn=9781450307710|s2cid=14316074}}</ref>

== Run-time complexity ==
Gram-Schmidt orthogonalization can be done in [[strongly-polynomial time]]. The run-time analysis is similar to that of [[Gaussian elimination]].<ref name=":0">{{Cite Geometric Algorithms and Combinatorial Optimization}}</ref>{{Rp|page=40}}

==See also==
* [[Linear algebra]]
* [[Recursion]]
* [[Orthogonality (mathematics)]]

== References ==
<references />
== Notes ==
<references group="note"/>

== Sources ==
* {{Citation | last1=Bau III | first1=David | last2=Trefethen | first2=Lloyd N. | author2-link=Lloyd N. Trefethen | title=Numerical linear algebra | publisher=Society for Industrial and Applied Mathematics | location=Philadelphia | isbn=978-0-89871-361-9 | year=1997}}.
* {{Citation | last1=Golub | first1=Gene H. | author1-link=Gene H. Golub | last2=Van Loan | first2=Charles F. | author2-link=Charles F. Van Loan | title=Matrix Computations | publisher=Johns Hopkins | edition=3rd | isbn=978-0-8018-5414-9 | year=1996}}.
* {{Citation | last1=Greub | first1=Werner | title=Linear Algebra | publisher = Springer | edition=4th |year = 1975}}.
* {{Citation | last1=Soliverez | first1=C. E. | last2=Gagliano | first2=E. | url=http://rmf.smf.mx/pdf/rmf/31/4/31_4_743.pdf | title=Orthonormalization on the plane: a geometric approach | journal=Mex. J. Phys. | volume=31 | number=4 | pages=743–758 | year=1985 | access-date=2013-06-22 | archive-date=2014-03-07 | archive-url=https://web.archive.org/web/20140307095009/http://rmf.smf.mx/pdf/rmf/31/4/31_4_743.pdf | url-status=dead }}.

== External links ==
{{Portal|Mathematics}}
* {{springer|title=Orthogonalization|id=p/o070420}}
* [https://web.archive.org/web/20160402140129/https://www.math.hmc.edu/calculus/tutorials/gramschmidt/gramschmidt.pdf Harvey Mudd College Math Tutorial on the Gram-Schmidt algorithm]
* [http://jeff560.tripod.com/g.html Earliest known uses of some of the words of mathematics: G] The entry "Gram-Schmidt orthogonalization" has some information and references on the origins of the method.
* Demos: [https://bigsigma.com/linear-algebra/gram-schmidt-process/#plain Gram Schmidt process in plane] and [https://bigsigma.com/linear-algebra/gram-schmidt-process/#space Gram Schmidt process in space]
* [https://www.math.ucla.edu/~tao/resource/general/115a.3.02f/GramSchmidt.html Gram-Schmidt orthogonalization applet]
* [http://www.nag.co.uk/numeric/fl/nagdoc_fl24/html/F05/f05conts.html NAG Gram–Schmidt orthogonalization of n vectors of order m routine]
* Proof: [http://planetmath.org/ProofOfGramSchmidtOrthogonalizationProcedure Raymond Puzio, Keenan Kidwell. "proof of Gram-Schmidt orthogonalization algorithm" (version 8). PlanetMath.org.]

{{linear algebra}}

{{DEFAULTSORT:Gram-Schmidt Process}}
[[Category:Linear algebra]]
[[Category:Functional analysis]]
[[Category:Articles with example MATLAB/Octave code]]