=====Formulas=====
A simple case occurs when the orthogonal projection is onto a line. If <math>\mathbf u</math> is a [[unit vector]] on the line, then the projection is given by the [[outer product]]
<math display="block"> P_\mathbf{u} = \mathbf u \mathbf u^\mathsf{T}.</math>
(If <math>\mathbf u</math> is complex-valued, the transpose in the above equation is replaced by a Hermitian transpose.) This operator leaves <math>\mathbf u</math> invariant and annihilates all vectors orthogonal to <math>\mathbf u</math>, proving that it is indeed the orthogonal projection onto the line containing <math>\mathbf u</math>.<ref>Meyer, p. 431</ref> A simple way to see this is to write an arbitrary vector <math>\mathbf x</math> as the sum of a component on the line (i.e. the projected vector we seek) and another perpendicular to it, <math>\mathbf x = \mathbf x_\parallel + \mathbf x_\perp</math>. Applying the projection, we get
<math display="block">
  P_{\mathbf u} \mathbf x =
  \mathbf u \mathbf u^\mathsf{T} \mathbf x_\parallel + \mathbf u \mathbf u^\mathsf{T} \mathbf x_\perp =
  \mathbf u \left( \sgn\left(\mathbf u^\mathsf{T} \mathbf x_\parallel\right) \left \| \mathbf x_\parallel \right \| \right) + \mathbf u \cdot \mathbf 0 = \mathbf x_\parallel
</math>
by the properties of the [[dot product]] of parallel and perpendicular vectors.
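
The rank-one formula can be checked numerically. The following is a minimal sketch assuming NumPy; the vectors <code>direction</code> and <code>x</code> are illustrative values chosen here, not taken from the cited sources.

```python
import numpy as np

# Illustrative data (assumed): a direction in R^3 and an arbitrary vector.
direction = np.array([1.0, 2.0, 2.0])
u = direction / np.linalg.norm(direction)   # unit vector on the line

P_u = np.outer(u, u)                        # P_u = u u^T

x = np.array([3.0, -1.0, 4.0])
x_par = P_u @ x                             # component of x on the line
x_perp = x - x_par                          # component perpendicular to the line

print(np.allclose(P_u @ u, u))              # P_u leaves u invariant
print(np.isclose(u @ x_perp, 0.0))          # the remainder is orthogonal to u
print(np.allclose(P_u @ P_u, P_u))          # P_u is idempotent
```

For these values <code>x_par</code> is a multiple of <code>u</code> and <code>x_perp</code> is annihilated by <code>P_u</code>, matching the decomposition <math>\mathbf x = \mathbf x_\parallel + \mathbf x_\perp</math>.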

This formula can be generalized to orthogonal projections on a subspace of arbitrary [[dimension (vector space)|dimension]]. Let <math>\mathbf u_1, \ldots, \mathbf u_k</math> be an [[orthonormal basis]] of the subspace <math>U</math>, with the assumption that the integer <math>k \geq 1</math>, and let <math>A</math> denote the <math>n \times k</math> matrix whose columns are <math>\mathbf u_1, \ldots, \mathbf u_k</math>, i.e., <math>A = \begin{bmatrix} \mathbf u_1 & \cdots & \mathbf u_k \end{bmatrix}</math>. Then the projection is given by:<ref>Meyer, equation (5.13.4)</ref>
<math display="block">P_A = A A^\mathsf{T}</math>
which can be rewritten as
<math display="block">P_A = \sum_i \langle \mathbf u_i, \cdot \rangle \mathbf u_i.</math>

The matrix <math>A^\mathsf{T}</math> is the [[partial isometry]] that vanishes on the [[orthogonal complement]] of <math>U</math>, and <math>A</math> is the isometry that embeds <math>U</math> into the underlying vector space. The range of <math>P_A</math> is therefore the ''final space'' of <math>A</math>. It is also clear that <math>A A^{\mathsf T}</math> acts as the identity operator on <math>U</math>.
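
As an illustration, the matrix form <math>A A^\mathsf{T}</math> and the sum form can be compared numerically. This is a sketch assuming NumPy; the orthonormal vectors <code>u1</code>, <code>u2</code> and the test vector <code>x</code> are hypothetical example values.

```python
import numpy as np

# Illustrative orthonormal basis (assumed) of a 2-dimensional subspace U of R^3.
u1 = np.array([1.0, 0.0, 0.0])
u2 = np.array([0.0, 1.0, 1.0]) / np.sqrt(2.0)
A = np.column_stack([u1, u2])                    # n x k matrix with n = 3, k = 2

P_A = A @ A.T                                    # P_A = A A^T

x = np.array([2.0, 3.0, 5.0])
sum_form = sum((u @ x) * u for u in (u1, u2))    # sum_i <u_i, x> u_i

print(np.allclose(A.T @ A, np.eye(2)))           # the columns are orthonormal
print(np.allclose(P_A @ x, sum_form))            # both forms agree on x
print(np.allclose(P_A @ A, A))                   # P_A fixes every basis vector of U
```

The last check reflects that <math>A A^\mathsf{T}</math> restricts to the identity on <math>U</math>, even though it is not the identity matrix on the whole space.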

The orthonormality condition can also be dropped. If <math>\mathbf u_1, \ldots, \mathbf u_k</math> is a (not necessarily orthonormal) [[basis (linear algebra)|basis]] with <math>k \geq 1</math>, and <math>A</math> is the matrix with these vectors as columns, then the projection is:<ref>{{Citation | last1 = Banerjee | first1 = Sudipto | last2 = Roy | first2 = Anindya | date = 2014 | title = Linear Algebra and Matrix Analysis for Statistics | series = Texts in Statistical Science | publisher = Chapman and Hall/CRC | edition =  1st | isbn =  978-1420095388 | url=https://books.google.com/books?id=iIOhAwAAQBAJ&q=projection}}</ref><ref>Meyer, equation (5.13.3)</ref>
<math display="block">P_A = A \left(A^\mathsf{T} A\right)^{-1} A^\mathsf{T}.</math>

The matrix <math>A</math> still embeds <math>U</math> into the underlying vector space but is no longer an isometry in general. The matrix <math>\left(A^\mathsf{T}A\right)^{-1}</math> is a "normalizing factor" that recovers the norm. For example, the [[rank of a linear operator|rank]]-1 operator <math>\mathbf u \mathbf u^\mathsf{T}</math> is not a projection if <math>\left\|\mathbf u \right\| \neq 1.</math> After dividing by <math>\mathbf u^\mathsf{T} \mathbf u = \left\| \mathbf u \right\|^2,</math> we obtain the projection <math>\mathbf u \left(\mathbf u^\mathsf{T} \mathbf u \right)^{-1} \mathbf u^\mathsf{T}</math> onto the subspace spanned by <math>\mathbf u</math>.
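
A short numerical sketch of the general-basis formula follows, assuming NumPy and real matrices. The helper name <code>projection_matrix</code> and the example data are introduced here for illustration only. The linear system <math>\left(A^\mathsf{T}A\right)X = A^\mathsf{T}</math> is solved instead of forming the inverse explicitly, which is a common numerical choice rather than part of the formula.

```python
import numpy as np

def projection_matrix(A):
    """Return A (A^T A)^{-1} A^T for a real matrix A with linearly independent columns."""
    # Solve (A^T A) X = A^T instead of computing the inverse explicitly.
    return A @ np.linalg.solve(A.T @ A, A.T)

# Illustrative basis (assumed): independent but not orthonormal columns.
A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [0.0, 1.0]])
P = projection_matrix(A)

print(np.allclose(P @ P, P))       # idempotent
print(np.allclose(P, P.T))         # symmetric, hence an orthogonal projection
print(np.allclose(P @ A, A))       # fixes the column space of A

# Rank-1 case: u u^T alone is not a projection unless ||u|| = 1.
u = np.array([[3.0], [4.0]])       # column vector with norm 5
raw = u @ u.T
print(np.allclose(raw @ raw, raw))                      # False: not idempotent
print(np.allclose(projection_matrix(u), raw / 25.0))    # dividing by ||u||^2 fixes this
```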

In the general case, we can have an arbitrary [[positive definite]] matrix <math>D</math> defining an inner product <math>\langle x, y \rangle_D = y^\dagger Dx</math>, and the projection <math>P_A</math> is given by <math display="inline">P_A x = \operatorname{argmin}_{y \in \operatorname{range}(A)} \left\|x - y\right\|^2_D</math>. Then
<math display="block">P_A = A \left(A^\mathsf{T} D A\right)^{-1} A^\mathsf{T} D.</math>
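
The weighted formula can be illustrated in the real case as follows. This is a sketch assuming NumPy; the function name <code>weighted_projection</code> and the matrices <code>D</code>, <code>A</code> and vector <code>x</code> are hypothetical examples, with <code>D</code> chosen symmetric positive definite.

```python
import numpy as np

def weighted_projection(A, D):
    """Return A (A^T D A)^{-1} A^T D for real A with independent columns
    and a symmetric positive-definite D."""
    return A @ np.linalg.solve(A.T @ D @ A, A.T @ D)

# Illustrative data (assumed, not from the cited sources).
D = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 1.0]])
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
x = np.array([1.0, -2.0, 3.0])

P = weighted_projection(A, D)
residual = x - P @ x

print(np.allclose(P @ P, P))                   # idempotent
print(np.allclose(P @ A, A))                   # fixes range(A)
print(np.allclose(A.T @ D @ residual, 0.0))    # residual is D-orthogonal to range(A)
print(np.allclose(D @ P, (D @ P).T))           # P is self-adjoint in the D inner product
```

The third check is the normal-equation characterization of the minimizer of <math>\left\|x - y\right\|^2_D</math> over <math>y \in \operatorname{range}(A)</math>.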

When the range space of the projection is generated by a [[Frame of a vector space|frame]] (i.e. the number of generators is greater than its dimension), the formula for the projection takes the form: <math>P_A = A A^+</math>. Here <math>A^+</math> stands for the [[Moore–Penrose pseudoinverse]]. This is just one of many ways to construct the projection operator.
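
A minimal sketch of the pseudoinverse form, assuming NumPy and its <code>numpy.linalg.pinv</code> routine. The generator matrix below is an illustrative example with three columns spanning a two-dimensional subspace, and the <code>rcond</code> cutoff is an arbitrary choice made for this example.

```python
import numpy as np

# Illustrative generators (assumed): three columns spanning a 2-dimensional
# subspace of R^3; the third column is the sum of the first two.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 1.0, 1.0]])

# A^T A is singular here, so the formula with (A^T A)^{-1} does not apply.
# rcond is an explicit, illustrative cutoff below which singular values are treated as zero.
P = A @ np.linalg.pinv(A, rcond=1e-10)

print(np.linalg.matrix_rank(A))    # number of independent generators
print(np.allclose(P @ P, P))       # idempotent
print(np.allclose(P, P.T))         # symmetric
print(np.allclose(P @ A, A))       # fixes every generator
```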

If <math>\begin{bmatrix} A & B \end{bmatrix}</math> is a non-singular matrix and <math>A^\mathsf{T}B = 0</math> (i.e., the columns of <math>B</math> span the [[null space]] of <math>A^\mathsf{T}</math>),<ref>See also [[Linear least squares (mathematics)#Properties of the least-squares estimators|Linear least squares (mathematics) § Properties of the least-squares estimators]].</ref> the following holds:
<math display="block">\begin{align}
I &= \begin{bmatrix} A & B \end{bmatrix}
\begin{bmatrix} A & B \end{bmatrix}^{-1}\begin{bmatrix} A^\mathsf{T} \\ B^\mathsf{T} \end{bmatrix}^{-1}
\begin{bmatrix} A^\mathsf{T} \\ B^\mathsf{T} \end{bmatrix} \\
  &= \begin{bmatrix} A & B \end{bmatrix}
\left(
\begin{bmatrix} A^\mathsf{T} \\ B^\mathsf{T} \end{bmatrix}
\begin{bmatrix} A & B \end{bmatrix}
\right )^{-1}
\begin{bmatrix} A^\mathsf{T} \\B^\mathsf{T} \end{bmatrix} \\
  &= \begin{bmatrix} A & B \end{bmatrix} \begin{bmatrix}A^\mathsf{T}A&O\\O&B^\mathsf{T}B\end{bmatrix}^{-1}
\begin{bmatrix} A^\mathsf{T} \\ B^\mathsf{T} \end{bmatrix}\\[4pt]
  &= A \left(A^\mathsf{T}A\right)^{-1} A^\mathsf{T} + B \left(B^\mathsf{T}B\right)^{-1} B^\mathsf{T}
\end{align}</math>
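
The decomposition of the identity into two complementary orthogonal projections can be checked on a small example. This is a sketch assuming NumPy; the helper <code>projection_matrix</code> and the matrices <code>A</code> and <code>B</code> are illustrative choices satisfying <math>A^\mathsf{T}B = 0</math>.

```python
import numpy as np

def projection_matrix(M):
    """Return M (M^T M)^{-1} M^T for a real matrix M with independent columns."""
    return M @ np.linalg.solve(M.T @ M, M.T)

# Illustrative data (assumed): [A B] is non-singular and A^T B = 0.
A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [0.0, 1.0]])
B = np.array([[0.0],
              [1.0],
              [-1.0]])

P_A = projection_matrix(A)
P_B = projection_matrix(B)

print(np.allclose(A.T @ B, 0.0))              # orthogonality condition
print(np.allclose(P_A + P_B, np.eye(3)))      # the two projections sum to the identity
print(np.allclose(P_A @ P_B, 0.0))            # their ranges are mutually orthogonal
```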

If the orthogonality condition is generalized to <math>A^\mathsf{T}W B = A^\mathsf{T}W^\mathsf{T}B = 0</math> with <math>W</math> non-singular<!-- and symmetric-->, the following holds:
<math display="block">I = \begin{bmatrix}A & B\end{bmatrix} \begin{bmatrix}\left(A^\mathsf{T} W A\right)^{-1} A^\mathsf{T} \\ \left(B^\mathsf{T} W B\right)^{-1} B^\mathsf{T} \end{bmatrix} W.</math>
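
The weighted identity can also be verified numerically. The sketch below assumes NumPy and, for simplicity, a symmetric <code>W</code>, so that the two orthogonality conditions coincide; the matrices are illustrative values chosen to satisfy <math>A^\mathsf{T}W B = 0</math>.

```python
import numpy as np

# Illustrative data (assumed): W is symmetric and non-singular,
# and B is chosen so that A^T W B = 0.
W = np.diag([1.0, 2.0, 4.0])
A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [0.0, 1.0]])
B = np.array([[0.0],
              [2.0],
              [-1.0]])

print(np.allclose(A.T @ W @ B, 0.0))          # first orthogonality condition
print(np.allclose(A.T @ W.T @ B, 0.0))        # second condition (same here, W symmetric)

upper = np.linalg.solve(A.T @ W @ A, A.T)     # (A^T W A)^{-1} A^T
lower = np.linalg.solve(B.T @ W @ B, B.T)     # (B^T W B)^{-1} B^T
total = np.hstack([A, B]) @ np.vstack([upper, lower]) @ W

print(np.allclose(total, np.eye(3)))          # the product equals the identity
```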

All these formulas also hold for complex inner product spaces, provided that the [[conjugate transpose]] is used instead of the transpose. Further details on sums of projectors can be found in Banerjee and Roy (2014).<ref>{{Citation | last1 = Banerjee | first1 = Sudipto | last2 = Roy | first2 = Anindya | date = 2014 | title = Linear Algebra and Matrix Analysis for Statistics | series = Texts in Statistical Science | publisher = Chapman and Hall/CRC | edition =  1st | isbn =  978-1420095388 | url=https://books.google.com/books?id=iIOhAwAAQBAJ&q=projection}}</ref> Also see Banerjee (2004)<ref>{{Citation | last = Banerjee | first = Sudipto | date = 2004 | title = Revisiting Spherical Trigonometry with Orthogonal Projectors | journal = The College Mathematics Journal | volume = 35 | issue = 5 | pages =  375–381 | doi=10.1080/07468342.2004.11922099 | s2cid = 122277398 }}</ref> for an application of sums of projectors in basic [[spherical trigonometry]].