{{Short description|Kind of square matrix in linear algebra}}
In [[linear algebra]], a '''Hessenberg matrix''' is a special kind of [[square matrix]], one that is "almost" [[Triangular matrix|triangular]]. To be exact, an '''upper Hessenberg matrix''' has zero entries below the first [[diagonal#Matrices|subdiagonal]], and a '''lower Hessenberg matrix''' has zero entries above the first [[Diagonal#Matrices|superdiagonal]].<ref>{{harvtxt|Horn|Johnson|1985}}, page 28; {{harvtxt|Stoer|Bulirsch|2002}}, page 251</ref> They are named after [[Karl Hessenberg]].<ref>Biswa Nath Datta (2010) Numerical Linear Algebra and Applications, 2nd Ed., Society for Industrial and Applied Mathematics (SIAM) {{ISBN|978-0-89871-685-6}}, p. 307</ref>

A '''Hessenberg decomposition''' is a [[matrix decomposition]] of a matrix <math>A</math> into a [[unitary matrix]] <math>P</math> and a Hessenberg matrix <math>H</math> such that
<math display=block>PHP^*=A</math>
where <math>P^*</math> denotes the [[conjugate transpose]].

==Definitions==

===Upper Hessenberg matrix===
A square <math>n \times n</math> matrix <math>A</math> is said to be in '''upper Hessenberg form''' or to be an '''upper Hessenberg matrix''' if <math>a_{i,j}=0</math> for all <math>i,j</math> with <math>i > j+1</math>.

An upper Hessenberg matrix is called '''unreduced''' if all subdiagonal entries are nonzero, i.e. if <math>a_{i+1,i} \neq 0</math> for all <math>i \in \{ 1,\ldots,n-1 \}</math>.<ref>{{harvnb|Horn|Johnson|1985|p=35}}</ref>

===Lower Hessenberg matrix===
A square <math>n \times n</math> matrix <math>A</math> is said to be in '''lower Hessenberg form''' or to be a '''lower Hessenberg matrix''' if its transpose <math>A^\mathsf{T}</math> is an upper Hessenberg matrix or, equivalently, if <math>a_{i,j}=0</math> for all <math>i,j</math> with <math>j > i+1</math>.

A lower Hessenberg matrix is called '''unreduced''' if all superdiagonal entries are nonzero, i.e. if <math>a_{i,i+1} \neq 0</math> for all <math>i \in \{ 1,\ldots,n-1 \}</math>.

==Examples==
Consider the following matrices.
<math display="block">A=\begin{bmatrix} 1 & 4 & 2 & 3 \\ 3 & 4 & 1 & 7 \\ 0 & 2 & 3 & 4 \\ 0 & 0 & 1 & 3 \\ \end{bmatrix}</math>
<math display="block">B=\begin{bmatrix} 1 & 2 & 0 & 0 \\ 5 & 2 & 3 & 0 \\ 3 & 4 & 3 & 7 \\ 5 & 6 & 1 & 1 \\ \end{bmatrix}</math>
<math display="block">C=\begin{bmatrix} 1 & 2 & 0 & 0 \\ 5 & 2 & 0 & 0 \\ 3 & 4 & 3 & 7 \\ 5 & 6 & 1 & 1 \\ \end{bmatrix}</math>
The matrix <math>A</math> is an unreduced upper Hessenberg matrix, <math>B</math> is an unreduced lower Hessenberg matrix and <math>C</math> is a lower Hessenberg matrix but is not unreduced.
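These definitions can be checked entry by entry. The following [[NumPy]] sketch is purely illustrative (the helper names <code>is_upper_hessenberg</code> and <code>is_unreduced_upper_hessenberg</code> are not library routines) and tests the example matrix <math>A</math> above:

<syntaxhighlight lang="python">
import numpy as np

def is_upper_hessenberg(A, tol=0.0):
    """True if a_{i,j} = 0 for all i > j + 1 (zero entries below the first subdiagonal)."""
    n = A.shape[0]
    return all(abs(A[i, j]) <= tol for i in range(n) for j in range(n) if i > j + 1)

def is_unreduced_upper_hessenberg(A, tol=0.0):
    """True if A is upper Hessenberg and every subdiagonal entry a_{i+1,i} is nonzero."""
    n = A.shape[0]
    return is_upper_hessenberg(A, tol) and all(abs(A[i + 1, i]) > tol for i in range(n - 1))

A = np.array([[1, 4, 2, 3],
              [3, 4, 1, 7],
              [0, 2, 3, 4],
              [0, 0, 1, 3]])
print(is_upper_hessenberg(A), is_unreduced_upper_hessenberg(A))  # True True
</syntaxhighlight>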
==Computer programming==
Many linear algebra [[algorithm]]s require significantly less [[computational complexity theory|computational effort]] when applied to [[triangular matrix|triangular matrices]], and this improvement often carries over to Hessenberg matrices as well. If the constraints of a linear algebra problem do not allow a general matrix to be conveniently reduced to a triangular one, reduction to Hessenberg form is often the next best thing. In fact, reduction of any matrix to Hessenberg form can be achieved in a finite number of steps (for example, through a sequence of [[Householder transformation]]s, which are unitary similarity transforms). Subsequent reduction of a Hessenberg matrix to a triangular matrix can then be achieved through iterative procedures, such as shifted [[QR decomposition|QR]]-factorization.

In [[eigenvalue algorithm]]s, the Hessenberg matrix can be further reduced to a triangular matrix through shifted QR-factorization combined with deflation steps. Reducing a general matrix first to a Hessenberg matrix and then to a triangular matrix, instead of reducing the general matrix directly to a triangular matrix, often economizes the arithmetic involved in the [[QR algorithm]] for eigenvalue problems.

==Reduction to Hessenberg matrix==

===Householder transformations===
Any <math>n \times n</math> matrix can be transformed into a Hessenberg matrix by a similarity transformation using [[Householder transformation]]s. The following procedure for such a transformation is adapted from ''A Second Course in Linear Algebra'' by Garcia & Horn.<ref>{{cite book |last1=Ramon Garcia |first1=Stephan |last2=Horn |first2=Roger |title=A Second Course In Linear Algebra |date=2017 |publisher=Cambridge University Press |isbn=9781107103818}}</ref>

Let <math>A</math> be any real or complex <math>n \times n</math> matrix, let <math>A^\prime</math> be the <math>(n - 1) \times n</math> submatrix of <math>A</math> obtained by removing the first row of <math>A</math>, and let <math>\mathbf{a}^\prime_1</math> be the first column of <math>A'</math>. Construct the <math>(n-1) \times (n-1)</math> Householder matrix <math>V_1 = I_{(n-1)} - 2\frac{ww^*}{\|w\|^2}</math> where
<math display="block"> w = \begin{cases} \|\mathbf{a}^\prime_1\|_2\mathbf{e}_1 - \mathbf{a}^\prime_1, & a^\prime_{11} = 0, \\ \|\mathbf{a}^\prime_1\|_2\mathbf{e}_1 + \frac{\overline{a^\prime_{11}}}{|a^\prime_{11}|}\mathbf{a}^\prime_1, & a^\prime_{11} \neq 0. \end{cases} </math>
This Householder matrix maps <math>\mathbf{a}^\prime_1</math> to a scalar multiple of <math>\mathbf{e}_1</math>, and as such, the block matrix <math>U_1 = \begin{bmatrix} 1 & \mathbf{0} \\ \mathbf{0} & V_1 \end{bmatrix}</math> maps the matrix <math>A</math> to the matrix <math>U_1A</math>, which has only zeros below the second entry of the first column.

Now construct the <math>(n-2) \times (n-2)</math> Householder matrix <math>V_2</math> in the same manner as <math>V_1</math>, so that <math>V_2</math> maps the first column of <math>A^{\prime\prime}</math> to a scalar multiple of <math>\mathbf{e}_1</math>, where <math>A^{\prime\prime}</math> is the submatrix of <math>A^{\prime}</math> obtained by removing the first row and the first column of <math>A^{\prime}</math>. Then let <math>U_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & V_2 \end{bmatrix}</math>, which maps <math>U_1A</math> to the matrix <math>U_2U_1A</math>, which has only zeros below the subdiagonal in its first two columns.

Now construct <math>V_3</math> and then <math>U_3</math> in the same manner, but for the matrix <math>A^{\prime\prime\prime}</math> obtained by removing the first row and first column of <math>A^{\prime\prime}</math>, and proceed as in the previous steps. Continue like this for a total of <math>n-2</math> steps.

By construction of <math>U_k</math>, the first <math>k</math> columns of any <math>n \times n</math> matrix are invariant under multiplication by <math>U_k^*</math> from the right, so the zeros already introduced are not destroyed. Hence, any matrix can be transformed to an upper Hessenberg matrix by a similarity transformation of the form <math>U_{(n-2)}( \dots (U_2(U_1 A U_1^*)U_2^*) \dots )U_{(n-2)}^* = U_{(n-2)} \dots U_2U_1A(U_{(n-2)} \dots U_2U_1)^* = UAU^*</math>.
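The procedure above can be summarized in code. The following [[NumPy]] sketch is illustrative only and treats the real case (the name <code>hessenberg_reduce</code> is not a library routine; in practice one would call an existing implementation such as <code>scipy.linalg.hessenberg</code>):

<syntaxhighlight lang="python">
import numpy as np

def hessenberg_reduce(A):
    """Minimal sketch: reduce a real square matrix to upper Hessenberg form
    by Householder similarity transformations.  Returns (H, U) with H = U @ A @ U.T."""
    H = np.array(A, dtype=float)
    n = H.shape[0]
    U = np.eye(n)
    for k in range(n - 2):
        x = H[k + 1:, k]                    # part of column k below the diagonal
        nx = np.linalg.norm(x)
        if nx == 0.0:
            continue                        # nothing to zero out in this column
        # Householder vector w (scaled version of the formula in the text, real case)
        w = x.copy()
        w[0] += np.sign(x[0]) * nx if x[0] != 0 else -nx
        w /= np.linalg.norm(w)
        V = np.eye(n - k - 1) - 2.0 * np.outer(w, w)   # reflector V_k
        Uk = np.eye(n)
        Uk[k + 1:, k + 1:] = V              # U_k = diag(I_{k+1}, V_k)
        H = Uk @ H @ Uk.T                   # unitary (orthogonal) similarity transform
        U = Uk @ U                          # accumulate U_{n-2} ... U_2 U_1
    return H, U

rng = np.random.default_rng(0)
A = rng.normal(size=(6, 6))
H, U = hessenberg_reduce(A)
assert np.allclose(np.tril(H, -2), 0.0)     # H is upper Hessenberg
assert np.allclose(U @ A @ U.T, H)          # similarity: same eigenvalues as A
</syntaxhighlight>

Here <math>U</math> accumulates the product <math>U_{(n-2)} \dots U_2U_1</math>, so that <math>H = UAU^\mathsf{T}</math>, matching the similarity transformation above.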
===Jacobi (Givens) rotations===
A [[Jacobi rotation]] (also called a Givens rotation) is an orthogonal matrix transformation of the form
:<math> A\to A'=J(p,q,\theta)^TAJ(p,q,\theta) \;, </math>
where <math>J(p,q,\theta)</math>, <math>p < q</math>, is the Jacobi rotation matrix, whose entries are all zero except for
::<math>\left\{\begin{align} J(p,q,\theta)_{ii} &{}= 1 \quad \forall i \ne p,q \\ J(p,q,\theta)_{pp} &{}= \cos(\theta) \\ J(p,q,\theta)_{qq} &{}= \cos(\theta) \\ J(p,q,\theta)_{pq} &{}= \sin(\theta) \\ J(p,q,\theta)_{qp} &{}= -\sin(\theta) \;. \end{align}\right.</math>
One can zero the matrix element <math>A'_{p-1,q}</math> by choosing the rotation angle <math>\theta</math> to satisfy the equation
:<math> A_{p-1,p}\sin\theta+A_{p-1,q}\cos\theta=0 \;. </math>
A sequence of such Jacobi rotations with the following <math>(p,q)</math>
:<math> (p,q)=(2,3),(2,4),\dots,(2,n),(3,4),\dots,(3,n),\dots,(n-1,n) </math>
reduces the matrix <math>A</math> to lower Hessenberg form.<ref>{{cite journal | arxiv=1501.07812 | doi=10.1016/j.laa.2015.08.026 | title=Quasiseparable Hessenberg reduction of real diagonal plus low rank matrices and applications | date=2016 | last1=Bini | first1=Dario A. | last2=Robol | first2=Leonardo | journal=Linear Algebra and Its Applications | volume=502 | pages=186–213 }}</ref>
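A direct, unoptimized [[NumPy]] transcription of this rotation sequence (illustrative only; the function name <code>givens_lower_hessenberg</code> is not a library routine, and the rotations are applied as dense matrices) might look as follows:

<syntaxhighlight lang="python">
import numpy as np

def givens_lower_hessenberg(A):
    """Minimal sketch: reduce a real square matrix to lower Hessenberg form
    using the sequence of Jacobi (Givens) rotations described above."""
    H = np.array(A, dtype=float)
    n = H.shape[0]
    for p in range(2, n):                     # 1-based p = 2, ..., n-1, as in the text
        for q in range(p + 1, n + 1):         # 1-based q = p+1, ..., n
            i, j = p - 1, q - 1               # 0-based indices of rows/columns p and q
            a, b = H[p - 2, i], H[p - 2, j]   # entries A_{p-1,p} and A_{p-1,q}
            r = np.hypot(a, b)
            if r == 0.0:
                continue                      # target entry is already zero
            c, s = a / r, -b / r              # chosen so A_{p-1,p}*sin + A_{p-1,q}*cos = 0
            J = np.eye(n)                     # dense rotation matrix J(p, q, theta)
            J[i, i] = J[j, j] = c
            J[i, j] = s
            J[j, i] = -s
            H = J.T @ H @ J                   # orthogonal similarity, zeroes H[p-2, j]
    return H

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 6))
H = givens_lower_hessenberg(A)
assert np.allclose(np.triu(H, 2), 0.0)        # H is lower Hessenberg
</syntaxhighlight>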
==Properties==
For <math>n \in \{1, 2\}</math>, it is [[Vacuous truth|vacuously true]] that every <math>n \times n</math> matrix is both upper Hessenberg and lower Hessenberg.<ref>[https://www.cs.cornell.edu/~bindel/class/cs6210-f16/lec/2016-10-21.pdf Lecture Notes. Notes for 2016-10-21] Cornell University</ref>

The product of a Hessenberg matrix with a triangular matrix is again Hessenberg. More precisely, if <math>A</math> is upper Hessenberg and <math>T</math> is upper triangular, then <math>AT</math> and <math>TA</math> are upper Hessenberg.

A matrix that is both upper Hessenberg and lower Hessenberg is a [[tridiagonal matrix]], of which the [[Jacobi operator|Jacobi matrix]] is an important example. This class includes the symmetric and Hermitian Hessenberg matrices; a Hermitian matrix can be reduced to a real symmetric tridiagonal matrix.<ref>{{Cite web|title=Computational Routines (eigenvalues) in LAPACK | url=http://sites.science.oregonstate.edu/~landaur/nacphy/lapack/eigen.html | website=sites.science.oregonstate.edu | access-date=2020-05-24}}</ref>

==Hessenberg operator==
The Hessenberg operator is an infinite-dimensional Hessenberg matrix. It commonly occurs as the generalization of the [[Jacobi operator]] to a system of [[orthogonal polynomials]] for the space of [[square-integrable]] [[holomorphic functions]] over some domain—that is, a [[Bergman space]]. In this case, the Hessenberg operator is the right-[[shift operator]] <math>S</math>, given by
<math display="block">[Sf](z) = z f(z).</math>
The [[eigenvalue]]s of each principal submatrix of the Hessenberg operator are given by the [[characteristic polynomial]] for that submatrix. These polynomials are called the [[Bergman polynomial]]s, and provide an [[orthogonal polynomials|orthogonal polynomial]] basis for Bergman space.

==See also==
*[[Hessenberg variety]]

== Notes ==
<references/>

== References ==
* {{Citation | last1=Horn | first1=Roger A. | last2=Johnson | first2=Charles R. | title=Matrix Analysis | publisher=[[Cambridge University Press]] | isbn=978-0-521-38632-6 | year=1985}}.
* {{Citation | last1=Stoer | first1=Josef | last2=Bulirsch | first2=Roland | title=Introduction to Numerical Analysis | publisher=[[Springer-Verlag]] | location=Berlin, New York | edition=3rd | isbn=978-0-387-95452-3 | year=2002}}.
* {{Citation | last1=Press | first1=WH | last2=Teukolsky | first2=SA | last3=Vetterling | first3=WT | last4=Flannery | first4=BP | year=2007 | title=Numerical Recipes: The Art of Scientific Computing | edition=3rd | publisher=Cambridge University Press | publication-place=New York | isbn=978-0-521-88068-8 | chapter=Section 11.6.2. Reduction to Hessenberg Form | chapter-url=http://apps.nrbook.com/empanel/index.html#pg=594 | access-date=2011-08-13 | archive-date=2011-08-11 | archive-url=https://web.archive.org/web/20110811154417/http://apps.nrbook.com/empanel/index.html#pg=594 | url-status=dead }}

== External links ==
* [http://mathworld.wolfram.com/HessenbergMatrix.html Hessenberg matrix] at [[MathWorld]].
* {{PlanetMath |urlname=hessenbergmatrix |title=Hessenberg matrix}}
* [http://www.cs.utexas.edu/users/flame/pubs/flawn53.pdf High performance algorithms] for reduction to condensed (Hessenberg, tridiagonal, bidiagonal) form
* [https://www.cs.cornell.edu/~bindel/class/cs6210-f16/lec/2016-10-21.pdf Algorithm overview]

{{Matrix classes}}

{{DEFAULTSORT:Hessenberg Matrix}}
[[Category:Matrices (mathematics)]]