{{short description|Matrix of inner products of a set of vectors}}
In [[linear algebra]], the '''Gram matrix''' (or '''Gramian matrix''', '''Gramian''') of a set of vectors <math>v_1,\dots, v_n</math> in an [[inner product space]] is the [[Hermitian matrix]] of [[inner product]]s, whose entries are given by <math>G_{ij} = \left\langle v_i, v_j \right\rangle</math>.<ref name="HJ-7.2.10">{{harvnb|Horn|Johnson|2013|p=441}}, Theorem 7.2.10</ref> If the vectors <math>v_1,\dots, v_n</math> are the columns of a matrix <math>X</math>, then the Gram matrix is <math>X^\dagger X</math> in the general case that the vector coordinates are complex numbers, which simplifies to <math>X^\top X</math> when the vector coordinates are real numbers.

An important application is to compute [[linear independence]]: a set of vectors is linearly independent if and only if the [[#Gram determinant|Gram determinant]] (the [[determinant]] of the Gram matrix) is non-zero.

It is named after [[Jørgen Pedersen Gram]].

==Examples==
For finite-dimensional real vectors in <math>\mathbb{R}^n</math> with the usual Euclidean [[dot product]], the Gram matrix is <math>G = V^\top V</math>, where <math>V</math> is a matrix whose columns are the vectors <math>v_k</math> and <math>V^\top</math> is its [[transpose]] whose rows are the vectors <math>v_k^\top</math>. For [[complex number|complex]] vectors in <math>\mathbb{C}^n</math>, <math>G = V^\dagger V</math>, where <math>V^\dagger</math> is the [[conjugate transpose]] of <math>V</math>.

Given [[square-integrable function]]s <math>\{\ell_i(\cdot),\, i = 1,\dots,n\}</math> on the interval <math>\left[t_0, t_f\right]</math>, the Gram matrix <math>G = \left[G_{ij}\right]</math> is:
: <math>G_{ij} = \int_{t_0}^{t_f} \ell_i^*(\tau)\ell_j(\tau)\, d\tau,</math>
where <math>\ell_i^*(\tau)</math> is the [[complex conjugate]] of <math>\ell_i(\tau)</math>.

For any [[bilinear form]] <math>B</math> on a [[finite-dimensional]] [[vector space]] over any [[Field (mathematics)|field]] we can define a Gram matrix <math>G</math> attached to a set of vectors <math>v_1, \dots, v_n</math> by <math>G_{ij} = B\left(v_i, v_j\right)</math>. The matrix will be symmetric if the bilinear form <math>B</math> is symmetric.
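For illustration, these definitions translate directly into numerical code. The following is a minimal sketch in Python with NumPy (the example vectors are chosen arbitrarily here, not drawn from the cited sources); it forms the Gram matrix of three real vectors and uses the Gram determinant to detect linear dependence:

<syntaxhighlight lang="python">
import numpy as np

# Columns of X are the vectors v_1, v_2, v_3 in R^4.
# Here v_3 = v_1 + v_2, so the set is linearly dependent.
X = np.array([[1.0, 0.0, 1.0],
              [2.0, 1.0, 3.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0]])

G = X.T @ X  # Gram matrix; use X.conj().T @ X for complex vectors

# The Gram determinant vanishes (up to rounding), confirming dependence.
print(np.linalg.det(G))
</syntaxhighlight>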
===Applications===
* In [[Riemannian geometry]], given an embedded <math>k</math>-dimensional [[Riemannian manifold]] <math>M\subset \mathbb{R}^n</math> and a parametrization <math>\phi: U\to M</math> for {{nowrap|<math>(x_1, \ldots, x_k)\in U\subset\mathbb{R}^k</math>,}} the volume form <math>\omega</math> on <math>M</math> induced by the embedding may be computed using the Gramian of the coordinate tangent vectors: <math display="block">\omega = \sqrt{\det G}\ dx_1 \cdots dx_k,\quad G = \left[\left\langle \frac{\partial\phi}{\partial x_i},\frac{\partial\phi}{\partial x_j}\right\rangle\right].</math> This generalizes the classical surface integral of a parametrized surface <math>\phi:U\to S\subset \mathbb{R}^3</math> for <math>(x, y)\in U\subset\mathbb{R}^2</math>: <math display="block">\int_S f\ dA = \iint_U f(\phi(x, y))\, \left|\frac{\partial\phi}{\partial x}\,{\times}\,\frac{\partial\phi}{\partial y}\right|\, dx\, dy.</math>
* If the vectors are centered [[random variable]]s, the Gramian is approximately proportional to the '''[[covariance matrix]]''', with the scaling factor determined by the number of elements in each vector.
* In [[quantum chemistry]], the Gram matrix of a set of [[basis vectors]] is the '''[[overlap matrix]]'''.
* In [[control theory]] (or more generally [[systems theory]]), the '''[[controllability Gramian]]''' and '''[[observability Gramian]]''' determine properties of a linear system.
* Gramian matrices arise in covariance structure model fitting (see e.g., Jamshidian and Bentler, 1993, Applied Psychological Measurement, Volume 18, pp. 79–94).
* In the [[finite element method]], the Gram matrix arises from approximating a function from a finite-dimensional space; the Gram matrix entries are then the inner products of the basis functions of the finite-dimensional subspace.
* In [[machine learning]], [[kernel function]]s are often represented as Gram matrices.<ref>{{cite journal |last1=Lanckriet |first1=G. R. G. |first2=N. |last2=Cristianini |first3=P. |last3=Bartlett |first4=L. E. |last4=Ghaoui |first5=M. I. |last5=Jordan |title=Learning the kernel matrix with semidefinite programming |journal=Journal of Machine Learning Research |volume=5 |year=2004 |pages=27–72 [p. 29] |url=https://dl.acm.org/citation.cfm?id=894170 }}</ref> (See also [[kernel principal component analysis|kernel PCA]].)
* Since the Gram matrix over the reals is a [[symmetric matrix]], it is [[diagonalizable]] and its [[eigenvalues]] are non-negative. The eigendecomposition of the Gram matrix <math>V^\top V</math> yields the squared singular values and the right-singular vectors of <math>V</math>, which connects it to the [[singular value decomposition]] of <math>V</math>.

==Properties==
===Positive-semidefiniteness===
The Gram matrix is [[symmetric matrix|symmetric]] in the case that the inner product is real-valued; it is [[Hermitian matrix|Hermitian]] in the general, complex case by definition of an [[inner product]].

The Gram matrix is [[Positive-semidefinite matrix|positive semidefinite]], and every positive semidefinite matrix is the Gramian matrix for some set of vectors. The fact that the Gramian matrix is positive-semidefinite can be seen from the following simple derivation:
: <math> x^\dagger \mathbf{G} x = \sum_{i,j}x_i^* x_j\left\langle v_i, v_j \right\rangle = \sum_{i,j}\left\langle x_i v_i, x_j v_j \right\rangle = \biggl\langle \sum_i x_i v_i, \sum_j x_j v_j \biggr\rangle = \biggl\| \sum_i x_i v_i \biggr\|^2 \geq 0 . </math>
The first equality follows from the definition of matrix multiplication, the second and third from the sesquilinearity of the [[inner product]] (bilinearity, in the real case), and the last from its positive definiteness. Note that this also shows that the Gramian matrix is positive definite if and only if the vectors <math> v_i </math> are linearly independent (that is, <math display="inline">\sum_i x_i v_i \neq 0</math> for all nonzero <math>x</math>).<ref name="HJ-7.2.10"/>
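This derivation is easy to verify numerically. The following minimal Python/NumPy sketch (with arbitrarily chosen vectors and coefficients) checks that the eigenvalues of a Gram matrix are non-negative and that <math display="inline">x^\dagger G x = \bigl\|\sum_i x_i v_i\bigr\|^2</math>:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))  # three real vectors in R^5, as columns
G = X.T @ X                      # their Gram matrix

print(np.linalg.eigvalsh(G))     # all eigenvalues >= 0 (up to rounding)

x = np.array([1.0, -2.0, 0.5])
print(x @ G @ x)                 # x^T G x ...
print(np.linalg.norm(X @ x)**2)  # ... equals the squared norm of sum_i x_i v_i
</syntaxhighlight>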
===Finding a vector realization===
{{See also|Positive definite matrix#Decomposition}}
Given any positive semidefinite matrix <math>M</math>, one can decompose it as:
: <math>M = B^\dagger B</math>,
where <math>B^\dagger</math> is the [[conjugate transpose]] of <math>B</math> (or <math>M = B^\textsf{T} B</math> in the real case). Here <math>B</math> is a <math>k \times n</math> matrix, where <math>k</math> is the [[matrix rank|rank]] of <math>M</math>. Various ways to obtain such a decomposition include computing the [[Cholesky decomposition]] or taking the [[square root of a matrix|non-negative square root]] of <math>M</math>.

The columns <math>b^{(1)}, \dots, b^{(n)}</math> of <math>B</math> can be seen as ''n'' vectors in <math>\mathbb{C}^k</math> (or ''k''-dimensional Euclidean space <math>\mathbb{R}^k</math>, in the real case). Then
: <math>M_{ij} = b^{(i)} \cdot b^{(j)},</math>
where the [[dot product]] <math display="inline">a \cdot b = \sum_{\ell=1}^k a_\ell^* b_\ell</math> is the usual inner product on <math>\mathbb{C}^k</math>.

Thus a [[Hermitian matrix]] <math>M</math> is positive semidefinite if and only if it is the Gram matrix of some vectors <math>b^{(1)}, \dots, b^{(n)}</math>. Such vectors are called a '''vector realization''' of {{nowrap|<math>M</math>.}} The infinite-dimensional analog of this statement is [[Mercer's theorem]].
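Concretely, a vector realization can be computed with the Cholesky decomposition. The sketch below (Python with NumPy; it assumes a strictly positive definite <math>M</math>, since a singular positive semidefinite matrix would instead call for the non-negative square root) recovers a matrix <math>B</math> with <math>M = B^\top B</math>:

<syntaxhighlight lang="python">
import numpy as np

# An arbitrarily chosen positive definite matrix to realize.
M = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Cholesky gives M = L @ L.T with L lower triangular; B = L.T then
# satisfies M = B.T @ B, so the columns of B are a vector realization.
L = np.linalg.cholesky(M)
B = L.T

print(np.allclose(B.T @ B, M))   # True: M is the Gram matrix of B's columns
</syntaxhighlight>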
===Uniqueness of vector realizations===
If <math>M</math> is the Gram matrix of vectors <math>v_1,\dots,v_n</math> in <math>\mathbb{R}^k</math>, then applying any rotation or reflection of <math>\mathbb{R}^k</math> (any [[orthogonal transformation]], that is, any [[Euclidean isometry]] preserving 0) to the sequence of vectors results in the same Gram matrix. That is, for any <math>k \times k</math> [[orthogonal matrix]] <math>Q</math>, the Gram matrix of <math>Q v_1,\dots, Q v_n</math> is also {{nowrap|<math>M</math>.}}

This is the only way in which two real vector realizations of <math>M</math> can differ: the vectors <math>v_1,\dots,v_n</math> are unique up to [[orthogonal transformation]]s. In other words, the dot products <math>v_i \cdot v_j</math> and <math>w_i \cdot w_j</math> are equal if and only if some rigid transformation of <math>\mathbb{R}^k</math> transforms the vectors <math>v_1,\dots,v_n</math> to <math>w_1, \dots, w_n</math> and 0 to 0.

The same holds in the complex case, with [[unitary transformation]]s in place of orthogonal ones. That is, if the Gram matrix of vectors <math>v_1, \dots, v_n</math> is equal to the Gram matrix of vectors <math>w_1, \dots, w_n</math> in <math>\mathbb{C}^k</math>, then there is a [[unitary matrix|unitary]] <math>k \times k</math> matrix <math>U</math> (meaning <math>U^\dagger U = I</math>) such that <math>v_i = U w_i</math> for <math>i = 1, \dots, n</math>.<ref>{{harvtxt|Horn|Johnson|2013}}, p. 452, Theorem 7.3.11</ref>

===Other properties===
* Because <math>G = G^\dagger</math>, it is necessarily the case that <math>G</math> and <math>G^\dagger</math> commute. That is, a real or complex Gram matrix <math>G</math> is also a [[normal matrix]].
* The Gram matrix of any [[orthonormal basis]] is the identity matrix. Equivalently, the Gram matrix of the rows or the columns of a real [[rotation matrix]] is the identity matrix. Likewise, the Gram matrix of the rows or columns of a [[unitary matrix]] is the identity matrix.
* The rank of the Gram matrix of vectors in <math>\mathbb{R}^k</math> or <math>\mathbb{C}^k</math> equals the dimension of the space [[Linear span|spanned]] by these vectors.<ref name="HJ-7.2.10"/>

==Gram determinant==
The '''Gram determinant''' or '''Gramian''' is the determinant of the Gram matrix:
<math display=block>\bigl|G(v_1, \dots, v_n)\bigr| = \begin{vmatrix} \langle v_1,v_1\rangle & \langle v_1,v_2\rangle &\dots & \langle v_1,v_n\rangle \\ \langle v_2,v_1\rangle & \langle v_2,v_2\rangle &\dots & \langle v_2,v_n\rangle \\ \vdots & \vdots & \ddots & \vdots \\ \langle v_n,v_1\rangle & \langle v_n,v_2\rangle &\dots & \langle v_n,v_n\rangle \end{vmatrix}.</math>
If <math>v_1, \dots, v_n</math> are vectors in <math>\mathbb{R}^m</math>, then it is the square of the ''n''-dimensional volume of the [[Parallelepiped#Parallelotope|parallelotope]] formed by the vectors. In particular, the vectors are [[Linear independence|linearly independent]] [[if and only if]] the parallelotope has nonzero ''n''-dimensional volume, if and only if the Gram determinant is nonzero, if and only if the Gram matrix is [[Non-singular matrix|nonsingular]]. When {{nowrap|''n'' > ''m''}} the determinant and volume are zero. When {{nowrap|1=''n'' = ''m''}}, this reduces to the standard theorem that the absolute value of the determinant of ''n'' ''n''-dimensional vectors is the ''n''-dimensional volume. The volume of the [[simplex]] formed by the vectors is {{math|Volume(parallelotope) / ''n''!}}.
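The volume interpretation can be checked directly in a small case. The following minimal Python/NumPy sketch (vectors chosen arbitrarily) compares the Gram determinant of two vectors in <math>\mathbb{R}^3</math> with the squared area of the parallelogram they span, computed independently via the cross product:

<syntaxhighlight lang="python">
import numpy as np

v1 = np.array([1.0, 2.0, 2.0])
v2 = np.array([2.0, 0.0, 1.0])

X = np.column_stack([v1, v2])
G = X.T @ X                                 # 2x2 Gram matrix

print(np.linalg.det(G))                     # 29.0: the squared area
print(np.linalg.norm(np.cross(v1, v2))**2)  # 29.0 as well
</syntaxhighlight>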
When <math>v_1, \dots, v_n</math> are linearly independent, the distance between a point <math>x</math> and the linear span of <math>v_1, \dots, v_n</math> is <math>\sqrt{\frac{|G(x,v_1, \dots, v_n)|}{|G(v_1, \dots, v_n)|}}</math>.

Consider the moment problem: given <math>c_1, \dots, c_n \in \mathbb C</math>, find a vector <math>v</math> such that <math display="inline">\left\langle v, v_i\right\rangle=c_i</math> for all <math display="inline">1 \leqslant i \leqslant n</math>. There exists a unique solution with minimal norm:<ref>{{Cite book |last1=Garcia |first1=Stephan Ramon |last2=Mashreghi |first2=Javad |last3=Ross |first3=William T. |date=2023-01-30 |title=Operator Theory by Example |url=https://academic.oup.com/book/45766 |publisher=Oxford University Press |language=en |doi=10.1093/o |doi-broken-date=13 April 2025 }}</ref>{{Pg|page=38}}
<math display="block">v=-\frac{1}{G\left(v_1, v_2, \ldots, v_n\right)} \det \begin{bmatrix} 0 & c_1 & c_2 & \cdots & c_n \\ v_1 & \left\langle v_1, v_1\right\rangle & \left\langle v_1, v_2\right\rangle & \cdots & \left\langle v_1, v_n\right\rangle \\ v_2 & \left\langle v_2, v_1\right\rangle & \left\langle v_2, v_2\right\rangle & \cdots & \left\langle v_2, v_n\right\rangle \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ v_n & \left\langle v_n, v_1\right\rangle & \left\langle v_n, v_2\right\rangle & \cdots & \left\langle v_n, v_n\right\rangle \end{bmatrix}</math>

The Gram determinant can also be expressed in terms of the [[exterior product]] of vectors by
:<math>\bigl|G(v_1, \dots, v_n)\bigr| = \| v_1 \wedge \cdots \wedge v_n\|^2.</math>
The Gram determinant therefore supplies an [[exterior product#inner product|inner product]] for the space {{tmath|{\textstyle\bigwedge}^{\!n}(V)}}. If an [[orthonormal basis]] ''e''<sub>''i''</sub>, {{nowrap|1=''i'' = 1, 2, ..., ''n''}} on {{tmath|V}} is given, the vectors
: <math> e_{i_1} \wedge \cdots \wedge e_{i_n},\quad i_1 < \cdots < i_n, </math>
will constitute an orthonormal basis of ''n''-dimensional volumes on the space {{tmath|{\textstyle\bigwedge}^{\!n}(V)}}. Then the Gram determinant <math>\bigl|G(v_1, \dots, v_n)\bigr|</math> amounts to an ''n''-dimensional [[Pythagorean Theorem#Sets_of_m-dimensional_objects_in_n-dimensional_space|Pythagorean theorem]] for the volume of the parallelotope formed by the vectors <math>v_1, \dots, v_n</math> in terms of its projections onto the basis volumes <math>e_{i_1} \wedge \cdots \wedge e_{i_n}</math>.

When the vectors <math>v_1, \ldots, v_n \in \mathbb{R}^m</math> are defined from the positions of points <math>p_1, \ldots, p_n</math> relative to some reference point <math>p_{n+1}</math>,
<math display="block">(v_1, v_2, \ldots, v_n) = (p_1 - p_{n+1}, p_2 - p_{n+1}, \ldots, p_n - p_{n+1})\,,</math>
then the Gram determinant can be written as the difference of two Gram determinants,
<math display=block> \bigl|G(v_1, \dots, v_n)\bigr| = \bigl|G((p_1, 1), \dots, (p_{n+1}, 1))\bigr| - \bigl|G(p_1, \dots, p_{n+1})\bigr|\,, </math>
where each <math>(p_j, 1)</math> is the corresponding point <math>p_j</math> supplemented with the coordinate value of 1 for an <math>(m+1)</math>-st dimension.{{Citation needed|reason=This relation between Gram matrices is apparently true but needs a citation to support its [[WP:N|notability]].|date=February 2022}} Note that in the common case that {{math|1=''n'' = ''m''}}, the second term on the right-hand side will be zero.

==Constructing an orthonormal basis==
Given a set of linearly independent vectors <math>\{v_i\}</math> with Gram matrix <math>G</math> defined by <math>G_{ij}:= \langle v_i,v_j\rangle</math>, one can construct an orthonormal basis
:<math>u_i := \sum_j \bigl(G^{-1/2}\bigr)_{ji} v_j.</math>
In matrix notation, <math>U = V G^{-1/2}</math>, where <math>U</math> has the orthonormal basis vectors <math>\{u_i\}</math> as columns and the matrix <math>V</math> is composed of the given column vectors <math>\{v_i\}</math>.

The matrix <math>G^{-1/2}</math> is guaranteed to exist. Indeed, <math>G</math> is Hermitian, and so can be decomposed as <math>G=WDW^\dagger</math> with <math>W</math> a unitary matrix and <math>D</math> a real diagonal matrix. Additionally, the <math>v_i</math> are linearly independent if and only if <math>G</math> is positive definite, which implies that the diagonal entries of <math>D</math> are positive. <math>G^{-1/2}</math> is therefore uniquely defined by <math>G^{-1/2}:=WD^{-1/2}W^\dagger</math>. One can check that these new vectors are orthonormal:
:<math>\begin{align} \langle u_i,u_j \rangle &= \sum_{i'} \sum_{j'} \Bigl\langle \bigl(G^{-1/2}\bigr)_{i'i} v_{i'},\bigl(G^{-1/2}\bigr)_{j'j} v_{j'} \Bigr\rangle \\[10mu] &= \sum_{i'} \sum_{j'} \bigl(G^{-1/2}\bigr)_{ii'} G_{i'j'} \bigl(G^{-1/2}\bigr)_{j'j} \\[8mu] &= \bigl(G^{-1/2} G G^{-1/2}\bigr)_{ij} = \delta_{ij} \end{align}</math>
where we used <math>\bigl(G^{-1/2}\bigr)^\dagger=G^{-1/2}</math>.

==See also==
* [[Controllability Gramian]]
* [[Observability Gramian]]

==References==
{{reflist}}
* {{Cite book | last1=Horn | first1=Roger A. | last2=Johnson | first2=Charles R. | title=Matrix Analysis | publisher=[[Cambridge University Press]] | isbn=978-0-521-54823-6 | year=2013 | edition=2nd }}

==External links==
* {{springer|title=Gram matrix|id=p/g044750}}
* ''[https://mathweb.rice.edu/honors-calculus Volumes of parallelograms]'' by Frank Jones

{{Matrix classes}}

[[Category:Systems theory]]
[[Category:Matrices (mathematics)]]
[[Category:Determinants]]
[[Category:Analytic geometry]]
[[Category:Kernel methods for machine learning]]

[[fr:Matrice de Gram]]