==== A matrix representation formula for a nonzero projection operator ====
Let <math>P \colon V \to V</math> be a linear operator such that <math>P^2 = P</math>, and assume that <math>P</math> is not the zero operator. Let the vectors <math>\mathbf u_1, \ldots, \mathbf u_k</math> form a basis for the range of <math>P</math>, and assemble these vectors in the <math>n \times k</math> matrix <math>A</math>. Note that <math>k \geq 1</math>; otherwise <math>k = 0</math> and <math>P</math> would be the zero operator. The range and the kernel are complementary spaces, so the kernel has dimension <math>n - k</math>. It follows that the [[orthogonal complement]] of the kernel has dimension <math>k</math>. Let <math>\mathbf v_1, \ldots, \mathbf v_k</math> form a basis for the orthogonal complement of the kernel of the projection, and assemble these vectors in the matrix <math>B</math>. Then the projection <math>P</math> (with the condition <math>k \geq 1</math>) is given by
<math display="block"> P = A \left(B^\mathsf{T} A\right)^{-1} B^\mathsf{T}. </math>
This expression generalizes the formula for orthogonal projections given above.<ref>{{Citation | last1 = Banerjee | first1 = Sudipto | last2 = Roy | first2 = Anindya | date = 2014 | title = Linear Algebra and Matrix Analysis for Statistics | series = Texts in Statistical Science | publisher = Chapman and Hall/CRC | edition = 1st | isbn = 978-1420095388 | url=https://books.google.com/books?id=iIOhAwAAQBAJ&q=projection}}</ref><ref>Meyer, equation (7.10.39)</ref>

A standard proof of this expression is the following. For any vector <math>\mathbf x</math> in the vector space <math>V</math>, we can decompose <math>\mathbf{x} = \mathbf{x}_1 + \mathbf{x}_2</math>, where the vector <math>\mathbf{x}_1 = P(\mathbf{x})</math> is in the image of <math>P</math> and <math>\mathbf{x}_2 = \mathbf{x} - P(\mathbf{x})</math>. Then <math>P(\mathbf{x}_2) = P(\mathbf{x}) - P^2(\mathbf{x}) = \mathbf{0}</math>, so <math>\mathbf{x}_2</math> is in the kernel of <math>P</math>, which by the construction of <math>B</math> is the null space of <math>B^\mathsf{T}</math>. The vector <math>\mathbf{x}_1</math> is in the column space of <math>A</math>, so <math>\mathbf{x}_1 = A \mathbf{w}</math> for some <math>k</math>-dimensional vector <math>\mathbf{w}</math>, and the vector <math>\mathbf{x}_2</math> satisfies <math>B^\mathsf{T} \mathbf{x}_2 = \mathbf{0}</math>. Putting these conditions together, we find a vector <math>\mathbf{w}</math> such that <math>B^\mathsf{T} (\mathbf{x} - A\mathbf{w}) = \mathbf{0}</math>. Since the matrices <math>A</math> and <math>B</math> have full rank <math>k</math> by their construction, the <math>k \times k</math> matrix <math>B^\mathsf{T} A</math> is invertible. So the equation <math>B^\mathsf{T} (\mathbf{x} - A\mathbf{w}) = \mathbf{0}</math> gives <math>\mathbf{w} = (B^{\mathsf{T}}A)^{-1} B^{\mathsf{T}} \mathbf{x}.</math> In this way, <math>P\mathbf{x} = \mathbf{x}_1 = A\mathbf{w} = A(B^{\mathsf{T}}A)^{-1} B^{\mathsf{T}} \mathbf{x}</math> for every vector <math>\mathbf{x} \in V</math>, and hence <math>P = A(B^{\mathsf{T}}A)^{-1} B^{\mathsf{T}}</math>.

In the case that <math>P</math> is an orthogonal projection, we can take <math>A = B</math>, and it follows that <math>P = A \left(A^\mathsf{T} A\right)^{-1} A^\mathsf{T}</math>. Using this formula, one can easily check that <math>P = P^\mathsf{T}</math>.
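The formula can also be verified numerically. The following is a minimal sketch assuming the [[NumPy]] library; the matrices <math>A</math> and <math>B</math> are arbitrary illustrative choices, not taken from the cited references.

<syntaxhighlight lang="python">
import numpy as np

# Oblique projection in R^3 with k = 2.
# Columns of A: a basis for the range of P.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
# Columns of B: a basis for the orthogonal complement of the kernel of P.
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])

P = A @ np.linalg.inv(B.T @ A) @ B.T   # P = A (B^T A)^{-1} B^T

print(np.allclose(P @ P, P))   # True: P is idempotent
print(np.allclose(P @ A, A))   # True: P fixes its range
print(np.allclose(P, P.T))     # False: this oblique P is not symmetric

# Orthogonal case: take B = A, so P = A (A^T A)^{-1} A^T.
Q = A @ np.linalg.inv(A.T @ A) @ A.T
print(np.allclose(Q @ Q, Q), np.allclose(Q, Q.T))  # True True
</syntaxhighlight>

The last line illustrates the closing claim above: an oblique projection need not be symmetric, while taking <math>B = A</math> yields a symmetric, hence orthogonal, projection.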
In general, if the vector space is over the complex number field, one instead uses the [[Hermitian transpose]] <math>A^*</math> and has the formula <math>P = A \left(A^* A\right)^{-1} A^*</math>. Recall that one can express the [[Moore–Penrose inverse]] of the matrix <math>A</math> as <math>A^{+} = (A^*A)^{-1}A^*</math> since <math>A</math> has full column rank, so <math>P = A A^{+}</math>.
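The complex case admits the same kind of numerical sketch (again assuming NumPy; <code>numpy.linalg.pinv</code> computes the Moore–Penrose inverse, so the identity <math>P = A A^{+}</math> can be checked directly):

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(seed=0)
# A random complex 5x2 matrix has full column rank with probability 1.
A = rng.standard_normal((5, 2)) + 1j * rng.standard_normal((5, 2))

# Orthogonal projection onto the column space of A: P = A (A* A)^{-1} A*.
P = A @ np.linalg.inv(A.conj().T @ A) @ A.conj().T

print(np.allclose(P @ P, P))                  # True: idempotent
print(np.allclose(P, P.conj().T))             # True: Hermitian, P = P*
print(np.allclose(P, A @ np.linalg.pinv(A)))  # True: agrees with P = A A^+
</syntaxhighlight>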