{{Use American English|date = January 2019}} {{Short description|Theorem of matrix ranks}} In [[mathematics]], specifically [[linear algebra]], the '''Woodbury matrix identity''' – named after [[Max A. Woodbury]]<ref>Max A. Woodbury, ''Inverting modified matrices'', Memorandum Rept. 42, Statistical Research Group, Princeton University, Princeton, NJ, 1950, 4pp {{MR|38136}}</ref><ref>Max A. Woodbury, ''The Stability of Out-Input Matrices''. Chicago, Ill., 1949. 5 pp. {{MR|32564}}</ref> – says that the inverse of a rank-''k'' correction of some [[matrix (mathematics)|matrix]] can be computed by doing a rank-''k'' correction to the inverse of the original matrix. Alternative names for this formula are the '''matrix inversion lemma''', '''Sherman–Morrison–Woodbury formula''' or just '''Woodbury formula'''. However, the identity appeared in several papers before the Woodbury report.<ref name="guttman">{{cite journal |first=Louis |last=Guttmann |title=Enlargement methods for computing the inverse matrix |journal=Ann. Math. Statist. |volume=17 |year=1946 |pages=336–343 |issue=3 |doi=10.1214/aoms/1177730946 |doi-access=free}}</ref><ref name="hager">{{cite journal |first=William W. 
|last=Hager |title=Updating the inverse of a matrix |journal=SIAM Review |volume=31 |year=1989 |pages=221–239 |issue=2 |doi=10.1137/1031049 |mr=997457 | jstor = 2030425 }}</ref> The Woodbury matrix identity is<ref name="higham">{{Cite book | last1=Higham | first1=Nicholas | author1-link=Nicholas Higham | title=Accuracy and Stability of Numerical Algorithms | url=https://archive.org/details/accuracystabilit00high_878 | url-access=limited | publisher=[[Society for Industrial and Applied Mathematics|SIAM]] | edition=2nd | isbn=978-0-89871-521-7 | year=2002 | page=[https://archive.org/details/accuracystabilit00high_878/page/n288 258] |mr=1927606 }} </ref> <math display="block"> \left(A + UCV \right)^{-1} = A^{-1} - A^{-1}U \left(C^{-1} + VA^{-1}U \right)^{-1} VA^{-1}, </math> where ''A'', ''U'', ''C'' and ''V'' are [[conformable matrix|conformable matrices]]: ''A'' is ''n''×''n'', ''C'' is ''k''×''k'', ''U'' is ''n''×''k'', and ''V'' is ''k''×''n''. This can be derived using [[invertible matrix#Blockwise inversion|blockwise matrix inversion]]. While the identity is primarily used on matrices, it holds in a general [[ring (mathematics)|ring]] or in an [[Ab-category]]. The Woodbury matrix identity allows cheap computation of inverses and solutions to linear equations. However, little is known about the numerical stability of the formula. There are no published results concerning its error bounds. Anecdotal evidence<ref> {{cite web | url = https://mathoverflow.net/questions/80340/special-considerations-when-using-the-woodbury-matrix-identity-numerically | title = MathOverflow discussion | website = MathOverflow }} </ref> suggests that it may diverge even for seemingly benign examples (when both the original and modified matrices are [[well-conditioned]]). == Discussion == To prove this result, we will start by proving a simpler one. 
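The identity stated above can be checked numerically. The following sketch (the sizes, seed, diagonal shifts, and the use of NumPy are our own illustrative choices, made only to keep ''A'' and ''C'' well-conditioned) verifies it for a small random example:

```python
import numpy as np

# Minimal check of the Woodbury identity with conformable random matrices
# (n = 5, k = 2): A is n-by-n, C is k-by-k, U is n-by-k, V is k-by-n.
rng = np.random.default_rng(42)
n, k = 5, 2
A = rng.standard_normal((n, n)) + n * np.eye(n)  # diagonal shift keeps A well-conditioned
C = rng.standard_normal((k, k)) + k * np.eye(k)
U = rng.standard_normal((n, k))
V = rng.standard_normal((k, n))

lhs = np.linalg.inv(A + U @ C @ V)
Ainv = np.linalg.inv(A)
rhs = Ainv - Ainv @ U @ np.linalg.inv(np.linalg.inv(C) + V @ Ainv @ U) @ V @ Ainv
assert np.allclose(lhs, rhs)
```

Note that the inner inverse on the right-hand side is only ''k''×''k'', which is the source of the computational savings discussed under Applications.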
Replacing ''A'' and ''C'' with the [[identity matrix]] ''I'', we obtain another identity which is a bit simpler: <math display="block"> \left(I + UV \right)^{-1} = I - U \left(I + VU \right)^{-1} V. </math> To recover the original equation from this ''reduced identity'', replace <math>U</math> by <math>A^{-1}U</math> and <math>V</math> by <math>CV</math>. This identity itself can be viewed as the combination of two simpler identities. We obtain the first identity from <math display="block"> I = (I + P)^{-1}(I + P) = (I + P)^{-1} + (I + P)^{-1}P,</math> thus, <math display="block"> (I + P)^{-1} = I-(I + P)^{-1}P,</math> and similarly <math display="block"> (I + P)^{-1} = I - P (I + P)^{-1}.</math> The second identity is the so-called '''push-through identity'''<ref name="HS"/> <math display="block"> (I + UV)^{-1} U = U (I + VU)^{-1} </math> that we obtain from <math display="block"> U(I + VU)=(I + UV)U</math> after multiplying by <math>(I + VU)^{-1}</math> on the right and by <math>(I + UV)^{-1}</math> on the left. Putting it all together, <math display="block"> \left(I + UV \right)^{-1} = I - UV \left(I + UV \right)^{-1} = I - U \left(I + VU \right)^{-1} V, </math> where the first and second equality come from the first and second identity, respectively. === Special cases === When <math>U, V</math> are vectors, the identity reduces to the [[Sherman–Morrison formula]]. In the scalar case, the reduced version is simply <math display="block">\frac{1}{1 + uv} = 1 - \frac{uv}{1 + vu}.</math> ==== Inverse of a sum ==== If ''n'' = ''k'' and ''U'' = ''V'' = ''I''<sub>''n''</sub> is the identity matrix, then <math display="block">\begin{align} \left(A + B\right)^{-1} &= A^{-1} - A^{-1} \left(B^{-1} + A^{-1}\right)^{-1} A^{-1} \\[1ex] &= A^{-1} - A^{-1} \left(A B^{-1} + {I}\right)^{-1}. 
\end{align}</math> Merging the terms on the far right-hand side of the above equation yields [[Hua's identity]] <math display="block">\left({A} + {B}\right)^{-1} = {A}^{-1} - \left({A} + {A}{B}^{-1}{A}\right)^{-1}.</math> Another useful form of the same identity is <math display="block">\left({A} - {B}\right)^{-1} = {A}^{-1} + {A}^{-1}{B}\left({A} - {B}\right)^{-1},</math> which, unlike those above, is valid even if <math>B</math> is [[singular matrix|singular]], and has a recursive structure that yields <math display="block">\left({A} - {B}\right)^{-1} = \sum_{k=0}^{\infty} \left({A}^{-1}{B}\right)^k{A}^{-1}</math> if the [[spectral radius]] of <math>A^{-1}B</math> is less than one. That is, if the above sum converges, then it equals <math>(A-B)^{-1}</math>. This form can be used in perturbative expansions where ''B'' is a perturbation of ''A''. === Variations === ==== Binomial inverse theorem ==== If ''A'', ''B'', ''U'', ''V'' are matrices of sizes ''n''×''n'', ''k''×''k'', ''n''×''k'', ''k''×''n'', respectively, then <math display="block"> \left(A + UBV\right)^{-1} = A^{-1} - A^{-1}UB\left(B+BVA^{-1}UB\right)^{-1} BVA^{-1} </math> provided ''A'' and ''B'' + ''BVA''<sup>−1</sup>''UB'' are nonsingular. Nonsingularity of the latter requires that ''B''<sup>−1</sup> exist, since it equals {{nowrap|''B''(''I'' + ''VA''<sup>−1</sup>''UB'')}} and the rank of the latter cannot exceed the rank of ''B''.<ref name=HS>{{cite journal | last1 = Henderson | first1 = H. V. | last2 = Searle | first2 = S. R. 
| year = 1981 | title = On deriving the inverse of a sum of matrices | url = http://ecommons.cornell.edu/bitstream/1813/32749/1/BU-647-M.pdf| journal = SIAM Review | volume = 23 | issue = 1 | pages = 53–60 | doi = 10.1137/1023004 | jstor = 2029838 | hdl = 1813/32749 | hdl-access = free }}</ref> Since ''B'' is invertible, the two ''B'' terms flanking the parenthetical quantity inverse in the right-hand side can be replaced with {{nowrap|(''B''<sup>−1</sup>)<sup>−1</sup>,}} which results in the original Woodbury identity. A variation for when ''B'' is singular and possibly even non-square:<ref name=HS/> <math display="block">(A + UBV)^{-1} = A^{-1} - A^{-1}U(I + BVA^{-1}U)^{-1}BVA^{-1}.</math> Formulas also exist for certain cases in which ''A'' is singular.<ref>Kurt S. Riedel, "A Sherman–Morrison–Woodbury Identity for Rank Augmenting Matrices with Application to Centering", ''SIAM Journal on Matrix Analysis and Applications'', 13 (1992)659-662, {{doi|10.1137/0613040}} [http://math.nyu.edu/mfdd/riedel/ranksiam.ps preprint] {{MR|1152773}}</ref> ==== Pseudoinverse with positive semidefinite matrices ==== In general Woodbury's identity is not valid if one or more inverses are replaced by [[Moore–Penrose inverse|(Moore–Penrose) pseudoinverses]]. However, if <math>A</math> and <math>C</math> are [[Positive semidefinite matrices|positive semidefinite]], and <math>V = U^\mathrm H</math> (implying that <math>A + UCV</math> is itself positive semidefinite), then the following formula provides a generalization:<ref>{{cite book |last1=Bernstein |first1=Dennis S. |title=Scalar, Vector, and Matrix Mathematics: Theory, Facts, and Formulas |date=2018 |publisher=Princeton University Press |location=Princeton |isbn=9780691151205 |page=638 |edition=Revised and expanded}}</ref><ref>{{cite book |last1=Schott |first1=James R. |title=Matrix analysis for statistics |date=2017 |publisher=John Wiley & Sons, Inc. 
|location=Hoboken, New Jersey |isbn=9781119092483 |page=219 |edition=Third}}</ref> <math display="block"> \begin{align} \left(XX^\mathrm H + YY^\mathrm H\right)^+ &= \left(ZZ^\mathrm H\right)^+ + \left(I - YZ^+\right)^\mathrm H X^{+\mathrm H} E X^+ \left(I - YZ^+\right), \\ Z &= \left(I - XX^+\right) Y, \\ E &= I - X^+Y \left(I - Z^+Z\right) F^{-1} \left(X^+Y\right)^\mathrm H, \\ F &= I + \left(I - Z^+Z\right) Y^\mathrm H \left(XX^\mathrm H\right)^+ Y \left(I - Z^+Z\right), \end{align} </math> where <math>A + UCU^\mathrm H</math> can be written as <math>XX^\mathrm H + YY^\mathrm H</math> because any positive semidefinite matrix is equal to <math>MM^\mathrm H</math> for some <math>M</math>. == Derivations == === Direct proof === The formula can be proven by checking that <math>(A + UCV)</math> times its alleged inverse on the right side of the Woodbury identity gives the identity matrix: <math display="block">\begin{align} & \left(A + UCV \right) \left[ A^{-1} - A^{-1}U \left(C^{-1} + VA^{-1}U \right)^{-1} VA^{-1} \right] \\ ={} & \left\{ I - U\left(C^{-1} + VA^{-1}U \right)^{-1}VA^{-1} \right\} + \left\{ UCVA^{-1} - UCVA^{-1}U \left(C^{-1} + VA^{-1}U \right)^{-1} VA^{-1} \right\} \\ ={} & \left\{ I + UCVA^{-1} \right\} - \left\{ U\left(C^{-1} + VA^{-1}U \right)^{-1}VA^{-1} + UCVA^{-1}U \left(C^{-1} + VA^{-1}U \right)^{-1} VA^{-1} \right\} \\ ={} & I + UCVA^{-1} - \left(U + UCVA^{-1}U\right) \left(C^{-1} + VA^{-1}U\right)^{-1}VA^{-1} \\ ={} & I + UCVA^{-1} - UC \left(C^{-1} + VA^{-1}U\right) \left(C^{-1} + VA^{-1}U\right)^{-1}VA^{-1} \\ ={} & I + UCVA^{-1} - UCVA^{-1} \\ ={} & I. 
\end{align}</math> === Alternative proofs === {{collapse top|title=Algebraic proof}} First consider these useful identities: <math display="block">\begin{align} U + UCV A^{-1} U &= UC \left(C^{-1} + V A^{-1} U\right) = \left(A + UCV\right) A^{-1} U \\ \left(A + UCV\right)^{-1} U C &= A^{-1}U \left(C^{-1} + VA^{-1} U\right) ^{-1} \end{align} </math> Now, <math display="block">\begin{align} A^{-1} &= \left(A + UCV\right)^{-1}\left(A + UCV\right) A^{-1}\\ &= \left(A + UCV\right)^{-1}\left(I + UCVA^{-1}\right) \\ &= \left(A + UCV\right)^{-1} + \left(A + UCV\right)^{-1} UCVA^{-1} \\ &= \left(A + UCV\right)^{-1} + A^{-1} U \left(C^{-1} + VA^{-1}U \right)^{-1} VA^{-1}. \end{align}</math> {{collapse bottom}} {{collapse top|title=Derivation via blockwise elimination}} The Woodbury matrix identity can also be derived by solving the following block matrix inversion problem <math display="block"> \begin{bmatrix} A & U \\ V & -C^{-1} \end{bmatrix}\begin{bmatrix} X \\ Y \end{bmatrix} = \begin{bmatrix} I \\ 0 \end{bmatrix}. </math> Expanding, we can see that the above reduces to <math display="block">\begin{cases} AX + UY = I \\ VX - C^{-1}Y = 0\end{cases}</math> which is equivalent to <math>(A + UCV)X = I</math>. Solving the first equation for <math>X</math>, we find that <math>X = A^{-1}(I - UY)</math>, which can be substituted into the second to find <math>VA^{-1}(I - UY) = C^{-1}Y</math>. Expanding and rearranging, we have <math>VA^{-1} = \left(C^{-1} + VA^{-1}U\right)Y</math>, or <math>\left(C^{-1} + VA^{-1}U\right)^{-1}VA^{-1} = Y</math>. Finally, substituting back into <math>AX + UY = I</math>, we have <math>AX + U\left(C^{-1} + VA^{-1}U\right)^{-1}VA^{-1} = I</math>. Thus, :<math>(A + UCV)^{-1} = X = A^{-1} - A^{-1}U\left(C^{-1} + VA^{-1}U\right)^{-1}VA^{-1}.</math> We have derived the Woodbury matrix identity. 
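The blockwise elimination above can be checked numerically. The sketch below (sizes, seed, and diagonal shifts are arbitrary choices of ours) solves the block system and confirms that the resulting <math>X</math> block is <math>(A + UCV)^{-1}</math>:

```python
import numpy as np

# Solve [[A, U], [V, -C^{-1}]] [X; Y] = [I; 0] and verify X = (A + UCV)^{-1}.
rng = np.random.default_rng(0)
n, k = 5, 2
A = rng.standard_normal((n, n)) + n * np.eye(n)  # diagonal shift keeps A invertible
C = rng.standard_normal((k, k)) + k * np.eye(k)
U = rng.standard_normal((n, k))
V = rng.standard_normal((k, n))

M = np.block([[A, U], [V, -np.linalg.inv(C)]])
rhs = np.vstack([np.eye(n), np.zeros((k, n))])   # the [I; 0] right-hand side
XY = np.linalg.solve(M, rhs)
X = XY[:n, :]
assert np.allclose(X, np.linalg.inv(A + U @ C @ V))
```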
{{collapse bottom}} {{collapse top|title=Derivation from LDU decomposition}} We start with the matrix <math display="block">\begin{bmatrix} A & U \\ V & C \end{bmatrix}</math> By eliminating the entry under the ''A'' (given that ''A'' is invertible) we get <math display="block">\begin{bmatrix} I & 0 \\ -VA^{-1} & I \end{bmatrix} \begin{bmatrix} A & U \\ V & C \end{bmatrix} = \begin{bmatrix} A & U \\ 0 & C - VA^{-1}U \end{bmatrix} </math> Likewise, eliminating the entry above ''C'' gives <math display="block">\begin{bmatrix} A & U \\ V & C \end{bmatrix} \begin{bmatrix} I & -A^{-1}U \\ 0 & I \end{bmatrix} = \begin{bmatrix} A & 0 \\ V & C-VA^{-1}U \end{bmatrix} </math> Now combining the above two, we get <math display="block"> \begin{bmatrix} I & 0 \\ -VA^{-1} & I \end{bmatrix} \begin{bmatrix} A & U \\ V & C \end{bmatrix}\begin{bmatrix} I & -A^{-1}U \\ 0 & I \end{bmatrix} = \begin{bmatrix} A & 0 \\ 0 & C - VA^{-1}U \end{bmatrix} </math> Moving the triangular factors to the right-hand side gives <math display="block">\begin{bmatrix} A & U \\ V & C \end{bmatrix} = \begin{bmatrix} I & 0 \\ VA^{-1} & I \end{bmatrix} \begin{bmatrix} A & 0 \\ 0 & C - VA^{-1}U \end{bmatrix} \begin{bmatrix} I & A^{-1}U \\ 0 & I \end{bmatrix}</math> which is the LDU decomposition of the block matrix into a lower triangular, a diagonal, and an upper triangular matrix. 
Now inverting both sides gives <math display="block">\begin{align} \begin{bmatrix} A & U \\ V & C \end{bmatrix}^{-1} &= \begin{bmatrix} I & A^{-1}U \\ 0 & I \end{bmatrix}^{-1} \begin{bmatrix} A & 0 \\ 0 & C - VA^{-1}U \end{bmatrix}^{-1} \begin{bmatrix} I & 0 \\ VA^{-1} & I \end{bmatrix}^{-1} \\[8pt] &= \begin{bmatrix} I & -A^{-1}U \\ 0 & I \end{bmatrix} \begin{bmatrix} A^{-1} & 0 \\ 0 & \left(C - VA^{-1}U\right)^{-1} \end{bmatrix} \begin{bmatrix} I & 0 \\ -VA^{-1} & I \end{bmatrix} \\[8pt] &= \begin{bmatrix} A^{-1} + A^{-1}U\left(C - VA^{-1}U\right)^{-1}VA^{-1} & -A^{-1}U\left(C - VA^{-1}U\right)^{-1} \\ -\left(C - VA^{-1}U\right)^{-1}VA^{-1} & \left(C - VA^{-1}U\right)^{-1} \end{bmatrix} \qquad\mathrm{(1)} \end{align}</math> We could equally well have done it the other way (provided that ''C'' is invertible) i.e. <math display="block">\begin{bmatrix} A & U \\ V & C \end{bmatrix} = \begin{bmatrix} I & UC^{-1} \\ 0 & I \end{bmatrix} \begin{bmatrix} A - UC^{-1}V & 0 \\ 0 & C \end{bmatrix} \begin{bmatrix} I & 0 \\ C^{-1}V & I\end{bmatrix}</math> Now again inverting both sides, <math display="block">\begin{align} \begin{bmatrix} A & U \\ V & C \end{bmatrix}^{-1} &= \begin{bmatrix} I & 0 \\ C^{-1}V & I\end{bmatrix}^{-1} \begin{bmatrix} A - UC^{-1}V & 0 \\ 0 & C \end{bmatrix}^{-1} \begin{bmatrix} I & UC^{-1} \\ 0 & I \end{bmatrix}^{-1} \\[8pt] &= \begin{bmatrix} I & 0 \\ -C^{-1}V & I\end{bmatrix} \begin{bmatrix} \left(A - UC^{-1}V\right)^{-1} & 0 \\ 0 & C^{-1} \end{bmatrix} \begin{bmatrix} I & -UC^{-1} \\ 0 & I \end{bmatrix} \\[8pt] &= \begin{bmatrix} \left(A - UC^{-1}V\right)^{-1} & -\left(A - UC^{-1}V\right)^{-1}UC^{-1} \\ -C^{-1}V\left(A - UC^{-1}V\right)^{-1} & C^{-1} + C^{-1}V\left(A - UC^{-1}V\right)^{-1}UC^{-1} \end{bmatrix} \qquad\mathrm{(2)} \end{align}</math> Now comparing elements (1, 1) of the RHS of (1) and (2) above gives the Woodbury formula <math display="block">\left(A - UC^{-1}V\right)^{-1} = A^{-1} + A^{-1}U\left(C - VA^{-1}U\right)^{-1}VA^{-1}.</math> 
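The (1, 1)-block comparison above can be confirmed numerically; in the sketch below (our own illustrative sizes and seed), both expressions agree with the top-left block of the full block inverse:

```python
import numpy as np

# Check that (A - U C^{-1} V)^{-1} equals A^{-1} + A^{-1} U (C - V A^{-1} U)^{-1} V A^{-1},
# and that both match block (1, 1) of [[A, U], [V, C]]^{-1}.
rng = np.random.default_rng(1)
n, k = 4, 2
A = rng.standard_normal((n, n)) + n * np.eye(n)
C = rng.standard_normal((k, k)) + k * np.eye(k)
U = rng.standard_normal((n, k))
V = rng.standard_normal((k, n))
Ainv, Cinv = np.linalg.inv(A), np.linalg.inv(C)

lhs = np.linalg.inv(A - U @ Cinv @ V)
rhs = Ainv + Ainv @ U @ np.linalg.inv(C - V @ Ainv @ U) @ V @ Ainv
assert np.allclose(lhs, rhs)
assert np.allclose(lhs, np.linalg.inv(np.block([[A, U], [V, C]]))[:n, :n])
```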
{{collapse bottom}} == Applications == This identity is useful in certain numerical computations where ''A''<sup>−1</sup> has already been computed and it is desired to compute (''A'' + ''UCV'')<sup>−1</sup>. With the inverse of ''A'' available, it is only necessary to find the inverse of ''C''<sup>−1</sup> + ''VA''<sup>−1</sup>''U'' in order to obtain the result using the right-hand side of the identity. If ''C'' has a much smaller dimension than ''A'', this is more efficient than inverting ''A'' + ''UCV'' directly. A common case is finding the inverse of a low-rank update ''A'' + ''UCV'' of ''A'' (where ''U'' only has a few columns and ''V'' only a few rows), or finding an approximation of the inverse of the matrix ''A'' + ''B'' where the matrix ''B'' can be approximated by a low-rank matrix ''UCV'', for example using the [[singular value decomposition]]. This is applied, e.g., in the [[Kalman filter]] and [[recursive least squares]] methods, to replace the [[parametric solution]], which requires inverting a matrix the size of the state vector, with a solution based on condition equations. In the case of the Kalman filter, this matrix has the dimensions of the vector of observations, i.e., as small as 1 if only one new observation is processed at a time. This significantly speeds up the often real-time calculations of the filter. 
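The low-rank-update use case can be sketched as follows (a sketch of ours, not from the article; sizes, seed, and the choice ''C'' = ''I'' are arbitrary). Once ''A''<sup>−1</sup> is available, updating the inverse after a rank-''k'' change requires inverting only a ''k''×''k'' matrix:

```python
import numpy as np

# Reuse a precomputed A^{-1} to invert the rank-k update A + UV (here C = I).
rng = np.random.default_rng(4)
n, k = 200, 2
A = rng.standard_normal((n, n)) + n * np.eye(n)
U = rng.standard_normal((n, k))
V = rng.standard_normal((k, n))
Ainv = np.linalg.inv(A)          # assumed already available

# Only this k-by-k matrix needs to be inverted for the update.
small = np.linalg.inv(np.eye(k) + V @ Ainv @ U)
updated = Ainv - Ainv @ U @ small @ V @ Ainv
assert np.allclose(updated, np.linalg.inv(A + U @ V))
```

Each update therefore costs a few matrix products plus a ''k''×''k'' inverse, instead of a fresh ''n''×''n'' inversion.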
In the case when ''C'' is the identity matrix ''I'', the matrix <math>I+VA^{-1}U</math> is known in [[numerical linear algebra]] and [[numerical partial differential equations]] as the '''capacitance matrix'''.<ref name="hager"/> ==See also== *[[Sherman–Morrison formula]] *[[Schur complement]] *[[Matrix determinant lemma]], formula for a rank-''k'' update to a [[determinant]] *[[Invertible matrix]] *{{slink|Moore–Penrose pseudoinverse#Updating the pseudoinverse}} == Notes == {{reflist}} *{{Citation| last1=Press|first1=WH| last2=Teukolsky|first2=SA| last3=Vetterling|first3=WT| last4=Flannery|first4=BP| year=2007 | title=Numerical Recipes: The Art of Scientific Computing|edition=3rd|publisher=Cambridge University Press| location=New York | isbn=978-0-521-88068-8|chapter=Section 2.7.3. Woodbury Formula|chapter-url=http://apps.nrbook.com/empanel/index.html?pg=80}} == External links == * [http://www.ee.ic.ac.uk/hp/staff/dmb/matrix/identity.html Some matrix identities] * {{MathWorld|title=Woodbury formula|urlname=WoodburyFormula}} [[Category:Lemmas in linear algebra]] [[Category:Matrices (mathematics)]] [[Category:Matrix theory]]