
=== Based on the spectral theorem ===
Let <math>\mathbf{M}</math> be an {{tmath|m \times n}} complex matrix. Since <math>\mathbf{M}^* \mathbf{M}</math> is positive semi-definite and Hermitian, by the [[spectral theorem]], there exists an {{tmath|n \times n}} unitary matrix <math>\mathbf{V}</math> such that

<math display=block>
\mathbf V^* \mathbf M^* \mathbf M \mathbf V
= \bar{\mathbf{D}}
= \begin{bmatrix} \mathbf{D} & 0 \\ 0 & 0\end{bmatrix},
</math>

where <math>\mathbf{D}</math> is diagonal and positive definite, of dimension <math>\ell\times \ell</math>, with <math>\ell</math> the number of non-zero eigenvalues of <math>\mathbf{M}^* \mathbf{M}</math> (which can be shown to satisfy <math>\ell\le\min(n,m)</math>). Here <math>\mathbf{V}</math> is, by definition, a matrix whose <math>i</math>-th column is the <math>i</math>-th eigenvector of <math>\mathbf{M}^* \mathbf{M}</math>, corresponding to the eigenvalue <math>\bar{\mathbf{D}}_{ii}</math>. Moreover, the <math>j</math>-th column of <math>\mathbf{V}</math>, for <math>j>\ell</math>, is an eigenvector of <math>\mathbf{M}^* \mathbf{M}</math> with eigenvalue <math>\bar{\mathbf{D}}_{jj}=0</math>. This can be expressed by writing <math>\mathbf{V}</math> as <math>\mathbf{V}=\begin{bmatrix}\mathbf{V}_1 &\mathbf{V}_2\end{bmatrix}</math>, where the columns of <math>\mathbf{V}_1</math> and <math>\mathbf{V}_2</math> contain the eigenvectors of <math>\mathbf{M}^* \mathbf{M}</math> corresponding to non-zero and zero eigenvalues, respectively. With this partition of <math>\mathbf{V}</math>, the equation becomes:

<math display=block>
\begin{bmatrix} \mathbf{V}_1^* \\ \mathbf{V}_2^* \end{bmatrix}
\mathbf{M}^* \mathbf{M}\, \begin{bmatrix} \mathbf{V}_1 & \!\! \mathbf{V}_2 \end{bmatrix}
= \begin{bmatrix}
  \mathbf{V}_1^* \mathbf{M}^* \mathbf{M} \mathbf{V}_1 & \mathbf{V}_1^* \mathbf{M}^* \mathbf{M} \mathbf{V}_2 \\
  \mathbf{V}_2^* \mathbf{M}^* \mathbf{M} \mathbf{V}_1 & \mathbf{V}_2^* \mathbf{M}^* \mathbf{M} \mathbf{V}_2
\end{bmatrix}
= \begin{bmatrix} \mathbf{D} & 0 \\ 0 & 0 \end{bmatrix}.</math>

This implies that

<math display=block>
\mathbf{V}_1^* \mathbf{M}^* \mathbf{M} \mathbf{V}_1
= \mathbf{D}, \quad \mathbf{V}_2^* \mathbf{M}^* \mathbf{M} \mathbf{V}_2
= \mathbf{0}.
</math>

Moreover, the second equation implies <math>\mathbf{M}\mathbf{V}_2 = \mathbf{0}</math>.<ref>To see this, note that <math>\operatorname{Tr}(\mathbf{V}_2^* \mathbf{M}^* \mathbf{M} \mathbf{V}_2) = \|\mathbf{M} \mathbf{V}_2\|^2</math> (the squared Frobenius norm), and recall that <math>\|A\| = 0 \Leftrightarrow A = 0</math>.</ref> Finally, the unitarity of <math>\mathbf{V}</math> translates, in terms of <math>\mathbf{V}_1</math> and <math>\mathbf{V}_2</math>, into the following conditions:

<math display=block>\begin{align} 
\mathbf{V}_1^* \mathbf{V}_1 &= \mathbf{I}_1, \\
\mathbf{V}_2^* \mathbf{V}_2 &= \mathbf{I}_2, \\
\mathbf{V}_1 \mathbf{V}_1^* + \mathbf{V}_2 \mathbf{V}_2^* &= \mathbf{I}_{12},
\end{align}</math>

where the subscripts on the identity matrices indicate that they are of different dimensions.
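
The conditions derived so far can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the test matrix, the random seed, and the cutoff <code>tol</code> used to decide which eigenvalues count as zero are illustrative assumptions and not part of the proof.

```python
import numpy as np

# Illustrative data (an assumption, not from the text): a 5x4 complex matrix of rank 2.
rng = np.random.default_rng(0)
m, n, r = 5, 4, 2
A = rng.standard_normal((m, r)) + 1j * rng.standard_normal((m, r))
B = rng.standard_normal((r, n)) + 1j * rng.standard_normal((r, n))
M = A @ B

G = M.conj().T @ M                  # M* M, Hermitian and positive semi-definite
lam, V = np.linalg.eigh(G)          # eigh returns eigenvalues in ascending order
order = np.argsort(lam)[::-1]       # reorder so the non-zero eigenvalues come first
lam, V = lam[order], V[:, order]

tol = 1e-10 * lam[0]                # hypothetical cutoff for "numerically zero"
ell = int(np.sum(lam > tol))        # number of non-zero eigenvalues
V1, V2 = V[:, :ell], V[:, ell:]
D = np.diag(lam[:ell])

# Diagonal blocks of V* M* M V, the consequence M V2 = 0, and the unitarity conditions.
block_11 = V1.conj().T @ G @ V1
block_22 = V2.conj().T @ G @ V2
MV2 = M @ V2
completeness = V1 @ V1.conj().T + V2 @ V2.conj().T

print("ell =", ell)
print("V1* M* M V1 = D:", np.allclose(block_11, D))
print("M V2 = 0:", np.allclose(MV2, 0))
print("V1 V1* + V2 V2* = I:", np.allclose(completeness, np.eye(n)))
```

In floating-point arithmetic the zero eigenvalues come out as tiny non-zero numbers, which is why a cutoff is needed; the exact argument has no such threshold.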

Let us now define

<math display=block>
\mathbf{U}_1 = \mathbf{M} \mathbf{V}_1 \mathbf{D}^{-\frac{1}{2}}.
</math>

Then,

<math display=block>
\mathbf{U}_1 \mathbf{D}^\frac{1}{2} \mathbf{V}_1^* = \mathbf{M} \mathbf{V}_1 \mathbf{D}^{-\frac{1}{2}} \mathbf{D}^\frac{1}{2} \mathbf{V}_1^* = \mathbf{M} (\mathbf{I} - \mathbf{V}_2\mathbf{V}_2^*) = \mathbf{M} - (\mathbf{M}\mathbf{V}_2)\mathbf{V}_2^* = \mathbf{M},
</math>

since <math>\mathbf{M}\mathbf{V}_2 = \mathbf{0}. </math> This can also be seen as an immediate consequence of the fact that <math>\mathbf{M}\mathbf{V}_1\mathbf{V}_1^* = \mathbf{M}</math>. This is equivalent to the observation that if <math>\{\boldsymbol v_i\}_{i=1}^\ell</math> is the set of eigenvectors of <math>\mathbf{M}^* \mathbf{M}</math> corresponding to non-vanishing eigenvalues <math>\{\lambda_i\}_{i=1}^\ell</math>, then <math>\{\mathbf M \boldsymbol v_i\}_{i=1}^\ell</math> is a set of orthogonal vectors, and <math>\bigl\{\lambda_i^{-1/2}\mathbf M \boldsymbol v_i\bigr\}\vphantom|_{i=1}^\ell</math> is a (generally not complete) set of ''orthonormal'' vectors. This matches the matrix formalism used above, with <math>\mathbf{V}_1</math> denoting the matrix whose columns are <math>\{\boldsymbol v_i\}_{i=1}^\ell</math>, <math>\mathbf{V}_2</math> the matrix whose columns are the eigenvectors of <math>\mathbf{M}^* \mathbf{M}</math> with vanishing eigenvalue, and <math>\mathbf{U}_1</math> the matrix whose columns are the vectors <math>\bigl\{\lambda_i^{-1/2}\mathbf M \boldsymbol v_i\bigr\}\vphantom|_{i=1}^\ell</math>.
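
The definition of <math>\mathbf{U}_1</math> and the resulting factorization can be illustrated with a short numerical sketch. It assumes NumPy; the rank-2 test matrix and the relative cutoff <code>1e-10</code> are arbitrary illustrative choices.

```python
import numpy as np

# Illustrative rank-2 complex matrix (an assumption, not from the text).
rng = np.random.default_rng(0)
m, n, r = 5, 4, 2
M = (rng.standard_normal((m, r)) + 1j * rng.standard_normal((m, r))) @ (
    rng.standard_normal((r, n)) + 1j * rng.standard_normal((r, n))
)

# Eigen-decomposition of M* M, sorted so non-zero eigenvalues come first.
lam, V = np.linalg.eigh(M.conj().T @ M)
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]
ell = int(np.sum(lam > 1e-10 * lam[0]))   # hypothetical cutoff for non-zero eigenvalues
V1 = V[:, :ell]

sqrt_D = np.diag(np.sqrt(lam[:ell]))            # D^{1/2}
inv_sqrt_D = np.diag(1.0 / np.sqrt(lam[:ell]))  # D^{-1/2}

U1 = M @ V1 @ inv_sqrt_D                  # columns are lambda_i^{-1/2} M v_i
reconstructed = U1 @ sqrt_D @ V1.conj().T # should reproduce M
gram_MV1 = (M @ V1).conj().T @ (M @ V1)   # diagonal if the M v_i are orthogonal

print("U1 D^(1/2) V1* = M:", np.allclose(reconstructed, M))
print("M V1 V1* = M:", np.allclose(M @ V1 @ V1.conj().T, M))
```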

We see that this is almost the desired result, except that <math>\mathbf{U}_1</math> and <math>\mathbf{V}_1</math> are in general not unitary, since they might not be square. However, we do know that the number of rows of <math>\mathbf{U}_1</math> is no smaller than the number of columns, since the dimension of <math>\mathbf{D}</math> is no greater than <math>m</math> and <math>n</math>. Also, since

<math display=block>
\mathbf{U}_1^*\mathbf{U}_1 = \mathbf{D}^{-\frac{1}{2}}\mathbf{V}_1^*\mathbf{M}^*\mathbf{M} \mathbf{V}_1 \mathbf{D}^{-\frac{1}{2}}=\mathbf{D}^{-\frac{1}{2}}\mathbf{D}\mathbf{D}^{-\frac{1}{2}} = \mathbf{I}_1,
</math>

the columns of <math>\mathbf{U}_1</math> are orthonormal and can be extended to an orthonormal basis. This means that we can choose <math>\mathbf{U}_2</math> such that <math>\mathbf{U} = \begin{bmatrix} \mathbf{U}_1 & \mathbf{U}_2 \end{bmatrix}</math> is unitary.
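
The proof only asserts that a suitable <math>\mathbf{U}_2</math> exists. One possible way to construct it numerically, sketched below under the assumption that NumPy is available, is to take the trailing columns of a complete QR factorization of <math>\mathbf{U}_1</math>; this choice is an illustration, and any orthonormal basis of the orthogonal complement of the column space of <math>\mathbf{U}_1</math> would serve equally well.

```python
import numpy as np

# Illustrative rank-2 complex matrix (an assumption, not from the text).
rng = np.random.default_rng(0)
m, n, r = 5, 4, 2
M = (rng.standard_normal((m, r)) + 1j * rng.standard_normal((m, r))) @ (
    rng.standard_normal((r, n)) + 1j * rng.standard_normal((r, n))
)

lam, V = np.linalg.eigh(M.conj().T @ M)
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]
ell = int(np.sum(lam > 1e-10 * lam[0]))   # hypothetical cutoff for non-zero eigenvalues
V1 = V[:, :ell]
U1 = M @ V1 @ np.diag(1.0 / np.sqrt(lam[:ell]))

gram = U1.conj().T @ U1                   # should be the ell x ell identity

# Complete QR: the last m - ell columns of Q are orthonormal and
# orthogonal to the column space of U1.
Q, _ = np.linalg.qr(U1, mode="complete")
U2 = Q[:, ell:]
U = np.hstack([U1, U2])

print("U1* U1 = I:", np.allclose(gram, np.eye(ell)))
print("U is unitary:", np.allclose(U.conj().T @ U, np.eye(m)))
```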

For {{tmath|\mathbf V_1}}, the matrix {{tmath|\mathbf V_2}} already completes it to the unitary matrix {{tmath|\mathbf V}}. Now, define

<math display=block>
\mathbf \Sigma =
\begin{bmatrix}
  \begin{bmatrix} \mathbf{D}^\frac{1}{2} & 0 \\ 0 & 0 \end{bmatrix} \\
  0
\end{bmatrix},
</math>

where extra zero rows are added '''or removed''' to make the number of zero rows equal the number of columns of {{tmath|\mathbf U_2,}} and hence the overall dimensions of <math>\mathbf \Sigma</math> equal to <math>m\times n</math>. Then

<math display=block>
\begin{bmatrix} \mathbf{U}_1 & \mathbf{U}_2 \end{bmatrix}
\begin{bmatrix}
  \begin{bmatrix} \mathbf{D}^\frac{1}{2} & 0 \\ 0 & 0 \end{bmatrix} \\
  0 \end{bmatrix}
\begin{bmatrix} \mathbf{V}_1 & \mathbf{V}_2 \end{bmatrix}^*
= \begin{bmatrix} \mathbf{U}_1 & \mathbf{U}_2 \end{bmatrix}
\begin{bmatrix} \mathbf{D}^\frac{1}{2} \mathbf{V}_1^* \\ 0 \end{bmatrix}
= \mathbf{U}_1 \mathbf{D}^\frac{1}{2} \mathbf{V}_1^* = \mathbf{M},
</math>

which is the desired result:

<math display=block>
\mathbf{M} = \mathbf{U} \mathbf \Sigma \mathbf{V}^*.
</math>
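
The whole construction can be assembled into a short numerical sketch. It assumes NumPy, an arbitrary rank-2 test matrix, a hypothetical cutoff for zero eigenvalues, and a QR-based completion of <math>\mathbf{U}_1</math> (one choice among many). It is meant to mirror the proof, not to serve as a practical algorithm: forming <math>\mathbf{M}^* \mathbf{M}</math> explicitly squares the condition number, which is one reason library routines such as <code>numpy.linalg.svd</code> work differently.

```python
import numpy as np

# Illustrative rank-2 complex matrix (an assumption, not from the text).
rng = np.random.default_rng(0)
m, n, r = 5, 4, 2
M = (rng.standard_normal((m, r)) + 1j * rng.standard_normal((m, r))) @ (
    rng.standard_normal((r, n)) + 1j * rng.standard_normal((r, n))
)

# Step 1: diagonalize M* M and put the non-zero eigenvalues first.
lam, V = np.linalg.eigh(M.conj().T @ M)
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]
ell = int(np.sum(lam > 1e-10 * lam[0]))   # hypothetical cutoff for non-zero eigenvalues
V1 = V[:, :ell]

# Step 2: U1 = M V1 D^{-1/2}, then complete it to a unitary U.
U1 = M @ V1 @ np.diag(1.0 / np.sqrt(lam[:ell]))
Q, _ = np.linalg.qr(U1, mode="complete")
U = np.hstack([U1, Q[:, ell:]])

# Step 3: Sigma is m x n with D^{1/2} in the top-left block and zeros elsewhere.
Sigma = np.zeros((m, n))
Sigma[:ell, :ell] = np.diag(np.sqrt(lam[:ell]))

recon = U @ Sigma @ V.conj().T
sv = np.linalg.svd(M, compute_uv=False)   # reference singular values, descending

print("M = U Sigma V*:", np.allclose(recon, M))
print("singular values agree:", np.allclose(sv[:ell], np.sqrt(lam[:ell])))
```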

Note that the argument could equally begin by diagonalizing {{tmath|\mathbf M \mathbf M^*}} rather than {{tmath|\mathbf M^* \mathbf M}}; this shows directly that {{tmath|\mathbf M \mathbf M^*}} and {{tmath|\mathbf M^* \mathbf M}} have the same non-zero eigenvalues.
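
The statement about shared non-zero eigenvalues can be observed on a small example. The sketch below assumes NumPy and a randomly generated 5×3 complex matrix, which has full column rank with probability one; the sizes and seed are arbitrary.

```python
import numpy as np

# Illustrative 5x3 complex matrix (an assumption, not from the text).
rng = np.random.default_rng(0)
m, n = 5, 3
M = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

# eigvalsh returns real eigenvalues in ascending order; reverse to descending.
eig_left = np.linalg.eigvalsh(M @ M.conj().T)[::-1]    # m eigenvalues of M M*
eig_right = np.linalg.eigvalsh(M.conj().T @ M)[::-1]   # n eigenvalues of M* M

k = min(m, n)
print("leading eigenvalues agree:", np.allclose(eig_left[:k], eig_right[:k]))
print("remaining eigenvalues of M M* vanish:", np.allclose(eig_left[k:], 0))
```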