==Properties==

===Existence and uniqueness===
As discussed above, for any matrix {{tmath| A }} there is one and only one pseudoinverse {{tmath| A^+ }}.<ref name="GvL1996"/>

A matrix satisfying only the first of the conditions given above, namely <math display="inline">A A^+ A = A</math>, is known as a generalized inverse. If the matrix also satisfies the second condition, namely <math display="inline">A^+ A A^+ = A^+</math>, it is called a [[generalized inverse#Types of generalized inverses|generalized ''reflexive'' inverse]]. Generalized inverses always exist but are not in general unique. Uniqueness is a consequence of the last two conditions.

===Basic properties===
Proofs for the properties below can be found at [[b:Topics in Abstract Algebra/Linear algebra]].

* If {{tmath| A }} has real entries, then so does {{tmath| A^+ }}.
* If {{tmath| A }} is [[invertible matrix|invertible]], its pseudoinverse is its inverse. That is, <math>A^+ = A^{-1}</math>.<ref name="SB2002">{{Cite book | last1=Stoer | first1=Josef | last2=Bulirsch | first2=Roland | title=Introduction to Numerical Analysis | publisher=[[Springer-Verlag]] | location=Berlin, New York | edition=3rd | isbn=978-0-387-95452-3 | year=2002}}.</ref>{{rp|243}}
* The pseudoinverse of the pseudoinverse is the original matrix: <math>\left(A^+\right)^+ = A</math>.<ref name="SB2002" />{{rp|245}}
* Pseudoinversion commutes with transposition, complex conjugation, and taking the conjugate transpose:<ref name="SB2002" />{{rp|245}} <!-- reference only mentions the last bit --> <math display="block">\left(A^\mathsf{T}\right)^+ = \left(A^+\right)^\mathsf{T}, \quad \left(\overline{A}\right)^+ = \overline{A^+}, \quad \left(A^*\right)^+ = \left(A^+\right)^* .</math>
* The pseudoinverse of a scalar multiple of {{tmath| A }} is the reciprocal multiple of {{tmath| A^+ }}: <math display="block">\left(\alpha A\right)^+ = \alpha^{-1} A^+</math> for {{tmath| \alpha \neq 0 }}; otherwise, <math>\left(0 A\right)^+ = 0 A^+ = 0 A^\mathsf{T}</math>, that is, <math>0^+ = 0^\mathsf{T}</math>.
* The kernel and image of the pseudoinverse coincide with those of the conjugate transpose: <math>\ker\left(A^+\right) = \ker\left(A^*\right)</math> and <math>\operatorname{ran}\left(A^+\right) = \operatorname{ran}\left(A^*\right)</math>.

====Identities====
The following identity can be used to cancel or expand certain subexpressions involving pseudoinverses:
<math display="block">A = A A^* A^{+*} = A^{+*} A^* A.</math>
Substituting <math>A^+</math> for <math>A</math> gives
<math display="block">A^+ = A^+ A^{+*} A^* = A^* A^{+*} A^+,</math>
while substituting <math>A^*</math> for <math>A</math> gives
<math display="block">A^* = A^* A A^+ = A^+ A A^*.</math>

===Reduction to Hermitian case===
The computation of the pseudoinverse is reducible to its construction in the Hermitian case. This is possible through the equivalences
<math display="block">A^+ = \left(A^*A\right)^+ A^*,</math>
<math display="block">A^+ = A^* \left(A A^*\right)^+,</math>
as {{tmath| A^*A }} and {{tmath| A A^* }} are Hermitian.
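These identities lend themselves to a direct numerical check. The following sketch is illustrative only (the sample matrix is an arbitrary choice); it uses NumPy's <code>numpy.linalg.pinv</code>, which computes the Moore–Penrose inverse via the [[singular value decomposition]]:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
A_pinv = np.linalg.pinv(A)
A_star = A.conj().T                      # conjugate transpose A*

# Reduction to the Hermitian case: A^+ = (A* A)^+ A* = A* (A A*)^+
assert np.allclose(A_pinv, np.linalg.pinv(A_star @ A) @ A_star)
assert np.allclose(A_pinv, A_star @ np.linalg.pinv(A @ A_star))

# Basic properties: (A^+)^+ = A and (A*)^+ = (A^+)*
assert np.allclose(np.linalg.pinv(A_pinv), A)
assert np.allclose(np.linalg.pinv(A_star), A_pinv.conj().T)
</syntaxhighlight>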
===Pseudoinverse of products===
The equality {{tmath|1= (AB)^+ = B^+ A^+ }} does not hold in general. Rather, suppose {{tmath| A \in \mathbb{K}^{m\times n},\ B \in \mathbb{K}^{n\times p} }}. Then the following are equivalent:<ref>{{Cite journal |last=Greville |first=T. N. E. |date=1966-10-01 |title=Note on the Generalized Inverse of a Matrix Product |url=https://epubs.siam.org/doi/10.1137/1008107 |journal=SIAM Review |volume=8 |issue=4 |pages=518–521 |doi=10.1137/1008107 |bibcode=1966SIAMR...8..518G |issn=0036-1445}}</ref>
# <math display="inline">(AB)^+ = B^+ A^+</math>
# <math>A^+ A BB^* A^* = BB^* A^*</math> and <math>BB^+ A^* A B = A^* A B</math>
# <math display="inline">\left(A^+ A BB^*\right)^* = A^+ A BB^*</math> and <math>\left(A^* A BB^+\right)^* = A^* A BB^+</math>
# <math display="inline">A^+ A BB^* A^* A BB^+ = BB^* A^* A</math>
# <math display="inline">A^+ A B = B (AB)^+ AB</math> and <math>BB^+ A^* = A^* A B (AB)^+</math>.

The following are sufficient conditions for {{tmath|1= (AB)^+ = B^+ A^+ }}:
# {{tmath| A }} has orthonormal columns (then <math>A^*A = A^+ A = I_n</math>), or
# {{tmath| B }} has orthonormal rows (then <math>BB^* = BB^+ = I_n</math>), or
# {{tmath| A }} has linearly independent columns (then <math>A^+ A = I</math>) and {{tmath| B }} has linearly independent rows (then <math>BB^+ = I</math>), or
# <math>B = A^*</math>, or
# <math>B = A^+</math>.

The following is a necessary condition for {{tmath|1= (AB)^+ = B^+ A^+ }}:
# <math>(A^+ A) (BB^+) = (BB^+) (A^+ A)</math>

The fourth sufficient condition yields the equalities
<math display="block">\begin{align}
\left(A A^*\right)^+ &= A^{+*} A^+, \\
\left(A^* A\right)^+ &= A^+ A^{+*}.
\end{align}</math>

Here is a counterexample where {{tmath|1= (AB)^+ \neq B^+ A^+ }}:
<math display="block">\Biggl( \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix} \Biggr)^+ = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}^+ = \begin{pmatrix} \tfrac12 & 0 \\ \tfrac12 & 0 \end{pmatrix} \quad \neq \quad \begin{pmatrix} \tfrac14 & 0 \\ \tfrac14 & 0 \end{pmatrix} = \begin{pmatrix} 0 & \tfrac12 \\ 0 & \tfrac12 \end{pmatrix} \begin{pmatrix} \tfrac12 & 0 \\ \tfrac12 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix}^+ \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}^+</math>
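This counterexample is easy to reproduce numerically. The sketch below (illustrative only) also checks the fourth sufficient condition, <math>B = A^*</math>, which here reduces to the transpose since the entries are real:

<syntaxhighlight lang="python">
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 1.0]])

lhs = np.linalg.pinv(A @ B)                  # (AB)^+ = [[1/2, 0], [1/2, 0]]
rhs = np.linalg.pinv(B) @ np.linalg.pinv(A)  # B^+ A^+ = [[1/4, 0], [1/4, 0]]
assert not np.allclose(lhs, rhs)             # the product rule fails here

# With B = A* (= A.T for real A), the product rule does hold:
# (A A*)^+ = A^{+*} A^+
assert np.allclose(np.linalg.pinv(A @ A.T),
                   np.linalg.pinv(A.T) @ np.linalg.pinv(A))
</syntaxhighlight>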
===Projectors===
<math>P = A A^+</math> and <math>Q = A^+A</math> are [[projection (linear algebra)|orthogonal projection operators]], that is, they are Hermitian (<math>P = P^*</math>, <math>Q = Q^*</math>) and idempotent (<math>P^2 = P</math> and <math>Q^2 = Q</math>). The following hold:
* <math>PA = AQ = A</math> and <math>A^+ P = QA^+ = A^+</math>.
* {{tmath| P }} is the [[orthogonal projector]] onto the [[range of a function|range]] of {{tmath| A }} (which equals the [[orthogonal complement]] of the kernel of {{tmath| A^* }}).
* {{tmath| Q }} is the orthogonal projector onto the range of {{tmath| A^* }} (which equals the orthogonal complement of the kernel of {{tmath| A }}).
* <math>I - Q = I - A^+A</math> is the orthogonal projector onto the kernel of {{tmath| A }}.
* <math>I - P = I - A A^+</math> is the orthogonal projector onto the kernel of {{tmath| A^* }}.<ref name="GvL1996"/>

The last two properties imply the following identities:
* <math>A\left(I - A^+ A\right) = \left(I - A A^+\right)A = 0</math>
* <math>A^*\left(I - A A^+\right) = \left(I - A^+A\right)A^* = 0</math>

Another property is the following: if {{tmath| A \in \mathbb{K}^{n\times n} }} is Hermitian and idempotent (true if and only if it represents an orthogonal projection), then, for any matrix {{tmath| B \in \mathbb{K}^{m\times n} }}, the following equation holds:<ref>{{cite journal|first1=Anthony A.|last1=Maciejewski|first2=Charles A.|last2=Klein|title=Obstacle Avoidance for Kinematically Redundant Manipulators in Dynamically Varying Environments|journal=International Journal of Robotics Research|volume=4|issue=3|pages=109–117|year=1985|doi=10.1177/027836498500400308|hdl=10217/536|s2cid=17660144|hdl-access=free}}</ref>
<math display="block">A(BA)^+ = (BA)^+</math>
This can be proven by defining the matrices <math>C = BA</math> and <math>D = A(BA)^+</math>, and checking that {{tmath| D }} is indeed a pseudoinverse for {{tmath| C }} by verifying that the defining properties of the pseudoinverse hold when {{tmath| A }} is Hermitian and idempotent.

From the last property it follows that, if {{tmath| A \in \mathbb{K}^{n\times n} }} is Hermitian and idempotent, then for any matrix {{tmath| B \in \mathbb{K}^{n\times m} }}
<math display="block">(AB)^+A = (AB)^+.</math>

Finally, if {{tmath| A }} is an orthogonal projection matrix, then its pseudoinverse trivially coincides with the matrix itself: <math>A^+ = A</math>.

===Geometric construction===
If we view the matrix as a linear map {{tmath| A: \mathbb{K}^n \to \mathbb{K}^m }} over the field {{tmath| \mathbb{K} }}, then {{tmath| A^+: \mathbb{K}^m \to \mathbb{K}^n }} can be decomposed as follows. We write {{tmath| \oplus }} for the [[direct sum of modules|direct sum]], {{tmath| \perp }} for the [[orthogonal complement]], {{tmath| \ker }} for the [[kernel (linear algebra)|kernel]] of a map, and {{tmath| \operatorname{ran} }} for the image of a map. Notice that <math>\mathbb{K}^n = \left(\ker A\right)^\perp \oplus \ker A</math> and <math>\mathbb{K}^m = \operatorname{ran} A \oplus \left(\operatorname{ran} A\right)^\perp</math>. The restriction <math>A: \left(\ker A\right)^\perp \to \operatorname{ran} A</math> is then an isomorphism. This implies that {{tmath| A^+ }} is the inverse of this isomorphism on {{tmath| \operatorname{ran} A }}, and is zero on <math>\left(\operatorname{ran} A\right)^\perp.</math>

In other words, to find {{tmath| A^+b }} for a given {{tmath| b }} in {{tmath| \mathbb{K}^m }}, first project {{tmath| b }} orthogonally onto the range of {{tmath| A }}, finding a point {{tmath| p(b) }} in the range. Then form {{tmath| A^{-1}(\{p(b)\}) }}, that is, find those vectors in {{tmath| \mathbb{K}^n }} that {{tmath| A }} sends to {{tmath| p(b) }}. This is an affine subspace of {{tmath| \mathbb{K}^n }} parallel to the kernel of {{tmath| A }}. The element of this subspace that has the smallest length (that is, is closest to the origin) is the answer {{tmath| A^+b }} we are looking for. It can be found by taking an arbitrary member of {{tmath| A^{-1}(\{p(b)\}) }} and projecting it orthogonally onto the orthogonal complement of the kernel of {{tmath| A }}. A numerical sketch of this construction follows below.

This description is closely related to the [[#Minimum norm solution to a linear system|minimum-norm solution to a linear system]].
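The projector properties and the geometric construction can be verified numerically. In the following sketch (illustrative only; the real matrix and vector are arbitrary choices), <code>x = A_pinv @ b</code> is checked to be a preimage of the projection <math>p(b) = Pb</math> that lies in <math>\left(\ker A\right)^\perp</math>, the range of {{tmath| Q }}:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))
b = rng.standard_normal(5)

A_pinv = np.linalg.pinv(A)
P = A @ A_pinv        # orthogonal projector onto ran(A)
Q = A_pinv @ A        # orthogonal projector onto ran(A*)

assert np.allclose(P, P.T) and np.allclose(P @ P, P)  # Hermitian, idempotent
assert np.allclose(Q, Q.T) and np.allclose(Q @ Q, Q)
assert np.allclose(P @ A, A) and np.allclose(A @ Q, A)

# Geometric construction: A^+ b is a preimage of p(b) = P b that lies
# in (ker A)^perp = ran(Q), hence has minimum norm among all preimages.
x = A_pinv @ b
assert np.allclose(A @ x, P @ b)
assert np.allclose(Q @ x, x)
</syntaxhighlight>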
===Limit relations===
The pseudoinverse can be obtained as a limit:
<math display="block">A^+ = \lim_{\delta \searrow 0} \left(A^* A + \delta I\right)^{-1} A^* = \lim_{\delta \searrow 0} A^* \left(A A^* + \delta I\right)^{-1}</math>
(see [[Tikhonov regularization]]). These limits exist even if {{tmath| \left(A A^*\right)^{-1} }} or {{tmath| \left(A^*A\right)^{-1} }} do not exist.<ref name="GvL1996"/>{{rp|263}}<ref>{{cite journal | title = The Moore–Penrose Pseudoinverse: A Tutorial Review of the Theory | date = 2012 | doi = 10.1007/s13538-011-0052-z | arxiv = 1110.6882 | last1 = Barata | first1 = João Carlos Alves | last2 = Hussein | first2 = Mahir Saleh | journal = Brazilian Journal of Physics | volume = 42 | issue = 1–2 | pages = 146–165 | bibcode = 2012BrJPh..42..146B }}</ref>

===Continuity===
In contrast to ordinary matrix inversion, the process of taking pseudoinverses is not [[continuous function|continuous]]: if the sequence {{tmath| \left(A_n\right) }} converges to the matrix {{tmath| A }} (in the [[matrix norm|maximum norm or Frobenius norm]], say), then {{tmath| (A_n)^+ }} need not converge to {{tmath| A^+ }}. However, if all the matrices {{tmath| A_n }} have the same rank as {{tmath| A }}, then {{tmath| (A_n)^+ }} converges to {{tmath| A^+ }}.<ref name="rakocevic1997">{{cite journal | last=Rakočević | first=Vladimir | title=On continuity of the Moore–Penrose and Drazin inverses | journal=Matematički Vesnik | volume=49 | pages=163–72 | year=1997 | url=http://elib.mi.sanu.ac.rs/files/journals/mv/209/mv973404.pdf }}</ref>

===Derivative===
Let <math>x \mapsto A(x)</math> be a real-valued differentiable matrix function with constant rank in a neighborhood of a point {{tmath| x_0 }}. The derivative of <math>x \mapsto A^+(x)</math> at <math>x_0</math> may be calculated in terms of the derivative of <math>A</math> at <math>x_0</math>:<ref>{{cite journal|title=The Differentiation of Pseudo-Inverses and Nonlinear Least Squares Problems Whose Variables Separate|first1=G. H.|last1=Golub |author-link=Gene H. Golub |first2=V.|last2=Pereyra|journal=SIAM Journal on Numerical Analysis|volume=10|number=2|date=April 1973|pages=413–32|jstor=2156365|doi=10.1137/0710036|bibcode=1973SJNA...10..413G}}</ref>
<math display="block">\left.\frac{\mathrm d}{\mathrm d x}\right|_{x = x_0\!\!\!\!\!\!\!} A^+ = -A^+ \left( \frac{\mathrm{d} A}{\mathrm d x} \right) A^+ ~+~ A^+ A^{+\top} \left(\frac{\mathrm{d} A^\top}{\mathrm{d} x} \right) \left(I - A A^+\right) ~+~ \left(I - A^+ A\right) \left(\frac{\mathrm{d} A^\top}{\mathrm{d} x} \right) A^{+\top} A^+,</math>
where the functions <math>A</math>, <math>A^+</math> and the derivatives on the right side are evaluated at <math>x_0</math> (that is, <math>A := A(x_0)</math>, <math>A^+ := A^+(x_0)</math>, etc.). For a complex matrix, the transpose is replaced with the conjugate transpose.<ref>{{cite book |last1=Hjørungnes |first1=Are |title=Complex-valued matrix derivatives: with applications in signal processing and communications |date=2011 |publisher=Cambridge University Press |location=New York |isbn=9780521192644 |page=52}}</ref> For a real-valued symmetric matrix, the [[Magnus–Neudecker derivative]] is established.<ref>{{Cite journal| last1=Liu|first1=Shuangzhe| last2=Trenkler|first2=Götz| last3=Kollo|first3=Tõnu| last4=von Rosen|first4=Dietrich| last5=Baksalary|first5=Oskar Maria| date=2023| title=Professor Heinz Neudecker and matrix differential calculus| journal=Statistical Papers|volume=65 |issue=4 |pages=2605–2639 |language=en |doi=10.1007/s00362-023-01499-w}}</ref>
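The derivative formula above can be sanity-checked against a finite difference. The sketch below is illustrative only: the matrix path <math>A(x) = A_0 + xB</math>, the random matrices, and the step size are arbitrary choices, and {{tmath| A_0 }} has full column rank (almost surely), so the rank is constant near <math>x_0 = 0</math> as the formula requires:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(2)
A0 = rng.standard_normal((4, 3))  # full column rank, so locally constant rank
B = rng.standard_normal((4, 3))   # dA/dx for the path A(x) = A0 + x*B

Ap = np.linalg.pinv(A0)
I_m, I_n = np.eye(4), np.eye(3)

# Golub–Pereyra formula at x0 = 0 (real case, so plain transposes)
dAp = (-Ap @ B @ Ap
       + Ap @ Ap.T @ B.T @ (I_m - A0 @ Ap)
       + (I_n - Ap @ A0) @ B.T @ Ap.T @ Ap)

# Central finite-difference approximation of d(A^+)/dx at x0 = 0
h = 1e-6
dAp_fd = (np.linalg.pinv(A0 + h * B) - np.linalg.pinv(A0 - h * B)) / (2 * h)
print(np.max(np.abs(dAp - dAp_fd)))  # small: O(h^2) truncation plus roundoff
</syntaxhighlight>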