
{{Short description|Matrices similar to diagonal matrices}}
{{About|matrix diagonalization in linear algebra||Diagonalization (disambiguation){{!}}Diagonalization}}
{{Use American English|date = April 2019}}

In [[linear algebra]], a [[square matrix]] <math>A</math> is called '''diagonalizable''' or '''non-defective''' if it is [[matrix similarity|similar]] to a [[diagonal matrix]]; that is, if there exist an [[invertible matrix]] <math>P</math> and a diagonal matrix <math>D</math> such that {{nowrap|<math>P^{-1}AP=D</math>.}} This is equivalent to {{nowrap|<math>A = PDP^{-1}</math>.}} (Such {{nowrap|<math>P</math>,}} <math>D</math> are not unique.) The notion extends to linear maps: for a [[dimension (vector space)|finite-dimensional]] [[vector space]] {{nowrap|<math>V</math>,}} a [[linear map]] <math>T:V\to V</math> is called '''diagonalizable''' if there exists an [[Basis (linear algebra)#Ordered bases and coordinates|ordered basis]] of <math>V</math> consisting of [[eigenvector]]s of <math>T</math>. These definitions are equivalent: if <math>T</math> has a [[matrix (mathematics)|matrix]] representation <math>A = PDP^{-1}</math> as above, then the column vectors of <math>P</math> form a basis consisting of eigenvectors of {{nowrap|<math>T</math>,}} and the diagonal entries of <math>D</math> are the corresponding [[eigenvalue]]s of {{nowrap|<math>T</math>;}} with respect to this eigenvector basis, <math>T</math> is represented by {{nowrap|<math>D</math>.}}

'''Diagonalization''' is the process of finding such <math>P</math> and {{nowrap|<math>D</math>;}} it makes many subsequent computations easier. One can raise a diagonal matrix <math>D</math> to a power by simply raising the diagonal entries to that power. The [[determinant]] of a diagonal matrix is simply the product of all diagonal entries. Such computations generalize easily to {{nowrap|<math>A=PDP^{-1}</math>.}}
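
As a brief illustration of how such a computation carries over, the following sketch uses only the relation <math>A = PDP^{-1}</math> and the multiplicativity of the determinant. The symbols <math>\lambda_1, \dots, \lambda_n</math> are introduced here as names for the diagonal entries of <math>D</math>:
<math display="block">\det(A) = \det(P)\,\det(D)\,\det\left(P^{-1}\right) = \det(D) = \lambda_1 \lambda_2 \cdots \lambda_n.</math>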

The geometric transformation represented by a diagonalizable matrix is an ''[[inhomogeneous dilation]]'' (or ''anisotropic scaling''). That is, it can [[Scaling (geometry)|scale]] the space by a different amount in different directions. The direction of each eigenvector is scaled by a factor given by the corresponding eigenvalue.

A square matrix that is not diagonalizable is called ''[[Defective matrix|defective]]''. It can happen that a matrix <math>A</math> with [[real number|real]] entries is defective over the real numbers, meaning that <math>A = PDP^{-1}</math> is impossible for any invertible <math>P</math> and diagonal <math>D</math> with real entries, but it is possible with [[complex number|complex]] entries, so that <math>A</math> is diagonalizable over the complex numbers. For example, this is the case for a generic [[rotation matrix]].  

Many results for diagonalizable matrices hold only over an [[algebraically closed field]] (such as the complex numbers). In this case, diagonalizable matrices are [[Dense set|dense]] in the space of all matrices, which means any defective matrix can be deformed into a diagonalizable matrix by a small [[Perturbation theory|perturbation]]; and the [[Jordan–Chevalley decomposition]] states that any matrix is uniquely the sum of a diagonalizable matrix and a [[nilpotent matrix]] that commute with each other. Over an algebraically closed field, the diagonalizable matrices are precisely the [[Semi-simplicity#Semi-simple matrices|semi-simple matrices]].

== Definition ==
A square <math>n \times n</math> matrix <math>A</math> with entries in a [[field (mathematics)|field]] <math>F</math> is called '''diagonalizable''' or '''nondefective''' if there exists an <math>n \times n</math> invertible matrix (i.e. an element of the [[general linear group]] GL<sub>''n''</sub>(''F'')), <math>P</math>, such that <math>P^{-1}AP</math> is a diagonal matrix.

== Characterization ==
The fundamental fact about diagonalizable maps and matrices is expressed by the following:

* An <math>n \times n</math> matrix <math>A</math> over a field <math>F</math> is diagonalizable [[if and only if]] the sum of the [[dimension (linear algebra)|dimension]]s of its eigenspaces is equal to <math>n</math>, which is the case if and only if there exists a [[basis (linear algebra)|basis]] of <math>F^n</math> consisting of eigenvectors of <math>A</math>. If such a basis has been found, one can form the matrix <math>P</math> having these [[basis vectors]] as columns, and <math>P^{-1}AP</math> will be a diagonal matrix whose diagonal entries are the eigenvalues of <math>A</math>. The matrix <math>P</math> is known as a [[modal matrix]] for <math>A</math>.
* A linear map <math>T : V \to V</math> is diagonalizable if and only if the sum of the [[dimension (linear algebra)|dimension]]s of its eigenspaces is equal to {{nowrap|<math>\dim(V)</math>,}} which is the case if and only if there exists a basis of <math>V</math> consisting of eigenvectors of <math>T</math>. With respect to such a basis, <math>T</math> will be represented by a diagonal matrix. The diagonal entries of this matrix are the eigenvalues of {{nowrap|<math>T</math>.}}

The following sufficient (but not necessary) condition is often useful.
* An <math>n \times n</math> matrix <math>A</math> is diagonalizable over the field <math>F</math> if it has <math>n</math> distinct eigenvalues in {{nowrap|<math>F</math>,}} i.e. if its [[characteristic polynomial]] has <math>n</math> distinct roots in {{nowrap|<math>F</math>;}} however, the converse may be false. Consider <math display="block">\begin{bmatrix} 
-1 & 3 & -1 \\
-3 & 5 & -1 \\
-3 & 3 & 1 
\end{bmatrix},</math> which has eigenvalues 1, 2, 2 (not all distinct) and is diagonalizable with diagonal form ([[similar (linear algebra)|similar]] to {{nowrap|<math>A</math>)}} <math display="block">\begin{bmatrix}
1 & 0 & 0 \\
0 & 2 & 0 \\
0 & 0 & 2
\end{bmatrix}</math> and [[change of basis|change of basis matrix]] <math>P</math>: <math display="block">\begin{bmatrix}
1 & 1 & -1 \\
1 & 1 & 0 \\
1 & 0 & 3
\end{bmatrix}.</math> The converse fails when <math>A</math> has an eigenspace of dimension higher than 1. In this example, the eigenspace of <math>A</math> associated with the eigenvalue 2 has dimension 2.
* A linear map <math>T : V \to V</math> with <math>n = \dim(V)</math> is diagonalizable if it has <math>n</math> distinct eigenvalues, i.e. if its characteristic polynomial has <math>n</math> distinct roots in <math>F</math>.
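
The 3 × 3 example above can be checked by hand or by machine. The following is a minimal sketch in plain Python using exact integer arithmetic; the helper names <code>matmul</code> and <code>det3</code> are illustrative and not part of any library. It verifies the equivalent relation <math>AP = PD</math> together with the invertibility of <math>P</math>, which avoids computing <math>P^{-1}</math>.

```python
# Matrices copied from the example above.
A = [[-1, 3, -1],
     [-3, 5, -1],
     [-3, 3, 1]]
P = [[1, 1, -1],
     [1, 1, 0],
     [1, 0, 3]]
D = [[1, 0, 0],
     [0, 2, 0],
     [0, 0, 2]]


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


AP = matmul(A, P)
PD = matmul(P, D)
print(AP == PD)      # True when every column of P is an eigenvector of A
print(det3(P) != 0)  # True when P is invertible, i.e. the columns form a basis
```

Since both checks succeed, <math>P^{-1}AP = D</math> follows without any division.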

Let <math>A</math> be a matrix over {{nowrap|<math>F</math>.}} If <math>A</math> is diagonalizable, then so is any power of it. Conversely, if <math>A</math> is invertible, <math>F</math> is algebraically closed, and <math>A^n</math> is diagonalizable for some <math>n</math> that is not an integer multiple of the characteristic of {{nowrap|<math>F</math>,}} then <math>A</math> is diagonalizable. Proof: If <math>A^n</math> is diagonalizable, then <math>A</math> is annihilated by some polynomial {{nowrap|<math>\left(x^n - \lambda_1\right) \cdots \left(x^n - \lambda_k\right)</math>,}} where the <math>\lambda_j</math> are the distinct eigenvalues of {{nowrap|<math>A^n</math>.}} This polynomial has no multiple root (since <math>\lambda_j \ne 0</math> and <math>n</math> is not a multiple of the characteristic of {{nowrap|<math>F</math>)}} and is divisible by the minimal polynomial of {{nowrap|<math>A</math>,}} so the minimal polynomial of <math>A</math> has no multiple root either.
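
The invertibility hypothesis in the converse cannot simply be dropped. As a small illustration (the name <math>N</math> is introduced here only for this example), the nilpotent matrix
<math display="block">N = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad N^2 = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}</math>
has a diagonal (hence diagonalizable) square, yet <math>N</math> itself is not diagonalizable over any field.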

Over the complex numbers <math>\Complex</math>, almost every matrix is diagonalizable. More precisely: the set of complex <math>n \times n</math> matrices that are ''not'' diagonalizable over {{nowrap|<math>\Complex</math>,}} considered as a [[subset]] of {{nowrap|<math>\Complex^{n \times n}</math>,}} has [[Lebesgue measure]] zero. One can also say that the diagonalizable matrices form a dense subset with respect to the [[Zariski topology]]: the non-diagonalizable matrices lie inside the [[Algebraic variety|vanishing set]] of the [[discriminant]] of the characteristic polynomial, which is a [[hypersurface]]. From that follows also density in the usual (''strong'') topology given by a [[norm (mathematics)|norm]]. The same is not true over {{nowrap|<math>\R</math>.}}

The [[Jordan–Chevalley decomposition]] expresses an operator as the sum of its semisimple (i.e., diagonalizable) part and its [[nilpotent]] part. Hence, a matrix is diagonalizable if and only if its nilpotent part is zero. Put in another way, a matrix is diagonalizable if each block in its [[Jordan form]] has no nilpotent part; i.e., each "block" is a one-by-one matrix.

== Diagonalization ==
{{See also||Eigendecomposition of a matrix}}Consider two arbitrary bases <math>E = \{ \boldsymbol{e}_i \mid i \in [n] \}</math> and <math>F = \{ \boldsymbol{\alpha}_i \mid i \in [n] \}</math>. Suppose that a linear transformation is represented by a matrix <math>A_E</math> written with respect to the basis E, and that the following eigenvalue equation holds for each <math>i</math>:

<math>A_E \boldsymbol{\alpha}_{E,i} = \lambda_i \boldsymbol{\alpha}_{E,i} </math>

The eigenvectors <math>\boldsymbol{\alpha}_i</math> are likewise written with respect to the E basis. Since F is a basis consisting of eigenvectors of the transformation, there exists a diagonal matrix <math>D_F</math> that is similar to <math>A_E</math>. In other words, the transformation is represented by a diagonal matrix when it is written in the basis F. We perform the change of basis using the transition matrix <math>S</math>, which changes basis from E to F, as follows:

<math>D_F = S_{E}^F \ A_E \ \left(S_{E}^F\right)^{-1}</math>,

where <math>S_{E}^F</math> is the transition matrix from the E-basis to the F-basis. The inverse can then be equated to a new transition matrix <math>P</math>, which changes basis from F to E instead, so we have the following relationship:

<math>\left(S_{E}^F\right)^{-1} = P_{F}^{E}</math>

Both transition matrices <math>S</math> and <math>P</math> are invertible. Thus we can manipulate the matrices in the following fashion:<math display="block">\begin{align}
    D &= S \ A_{E} \ S^{-1} \\
    D &= P^{-1} \ A_{E} \ P 
\end{align}</math>The matrix <math>A_{E}</math> will be denoted as <math>A</math>, which is still in the E-basis. Similarly, the diagonal matrix is in the F-basis.

[[File:Diagonalization as rotation.gif|400px|thumb|right|The diagonalization of a symmetric matrix can be interpreted as a rotation of the axes to align them with the eigenvectors.]]
If a matrix <math>A</math> can be diagonalized, that is,

: <math>P^{-1}AP = \begin{bmatrix}
  \lambda_1 &         0 &  \cdots &         0 \\
          0 & \lambda_2 &  \cdots &         0 \\
     \vdots &    \vdots & \ddots &    \vdots \\
          0 &         0 &  \cdots & \lambda_n
\end{bmatrix} = D,</math>

then:

: <math>AP = P\begin{bmatrix}
  \lambda_1 &         0 &  \cdots &         0 \\
          0 & \lambda_2 &  \cdots &         0 \\
     \vdots &    \vdots & \ddots &    \vdots \\
          0 &         0 &  \cdots & \lambda_n
\end{bmatrix}.</math>

The transition matrix S has the E-basis vectors as columns written in the basis F. Inversely, the inverse transition matrix P has F-basis vectors <math>\boldsymbol{\alpha}_i  </math> written in the basis of E so that we can represent P in block matrix form in the following manner:

:<math>P = \begin{bmatrix} \boldsymbol{\alpha}_{E,1} & \boldsymbol{\alpha}_{E,2} & \cdots & \boldsymbol{\alpha}_{E,n} \end{bmatrix},</math>

as a result we can write:<math display="block">\begin{align}
    A \begin{bmatrix} \boldsymbol{\alpha}_{E,1} & \boldsymbol{\alpha}_{E,2} & \cdots & \boldsymbol{\alpha}_{E,n} \end{bmatrix} =  \begin{bmatrix} \boldsymbol{\alpha}_{E,1} & \boldsymbol{\alpha}_{E,2} & \cdots & \boldsymbol{\alpha}_{E,n} \end{bmatrix}D.
\end{align}</math>

In block matrix form, we can regard <math>A</math> as a <math>1 \times 1</math> block matrix and <math>P</math> as a <math>1 \times n</math> block matrix whose blocks are its column vectors. The matrix <math>D</math> can be written out in full, with all its diagonal entries, as an <math>n \times n</math> matrix:

<math> A \begin{bmatrix} \boldsymbol{\alpha}_{E,1} & \boldsymbol{\alpha}_{E,2} & \cdots & \boldsymbol{\alpha}_{E,n} \end{bmatrix}=  \begin{bmatrix} \boldsymbol{\alpha}_{E,1} & \boldsymbol{\alpha}_{E,2} & \cdots & \boldsymbol{\alpha}_{E,n} \end{bmatrix}
\begin{bmatrix}
  \lambda_1 &         0 &  \cdots &         0 \\
          0 & \lambda_2 &  \cdots &         0 \\
     \vdots &    \vdots & \ddots &    \vdots \\
          0 &         0 &  \cdots & \lambda_n
\end{bmatrix}.   </math>

Performing the above matrix multiplication we end up with the following result:<math display="block">\begin{align}
        A \begin{bmatrix} \boldsymbol{\alpha}_1 & \boldsymbol{\alpha}_2 & \cdots & \boldsymbol{\alpha}_n \end{bmatrix} =  \begin{bmatrix} \lambda_1 \boldsymbol{\alpha}_1 & \lambda_2\boldsymbol{\alpha}_2 & \cdots & \lambda_n \boldsymbol{\alpha}_n  \end{bmatrix}
\end{align}</math>Taking each component of the block matrix individually on both sides, we end up with the following:
:<math>A\boldsymbol{\alpha}_i = \lambda_i \boldsymbol{\alpha}_i \qquad (i=1,2,\dots,n).</math>

So the column vectors of <math>P</math> are [[right eigenvector]]s of {{nowrap|<math>A</math>,}} and the diagonal entries of <math>D</math> are the corresponding [[eigenvalue]]s. The invertibility of <math>P</math> also implies that the eigenvectors are [[linearly independent]] and form a basis of {{nowrap|<math>F^{n}</math>}} (where <math>F</math> here denotes the field of entries rather than the basis above). This is the necessary and sufficient condition for diagonalizability and the canonical approach to diagonalization. The [[row vector]]s of <math>P^{-1}</math> are the [[left eigenvector]]s of {{nowrap|<math>A</math>.}}

When a complex matrix <math>A\in\mathbb{C}^{n\times n}</math> is a [[Hermitian matrix]] (or more generally a [[normal matrix]]), eigenvectors of <math>A</math> can be chosen to form an [[orthonormal basis]] of {{nowrap|<math>\mathbb{C}^n</math>,}} and <math>P</math> can be chosen to be a [[unitary matrix]]. If in addition, <math>A\in\mathbb{R}^{n\times n}</math> is a real [[symmetric matrix]], then its eigenvectors can be chosen to be an orthonormal basis of <math>\mathbb{R}^n</math> and <math>P</math> can be chosen to be an [[orthogonal matrix]].
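
For a real symmetric 2 × 2 matrix, the orthogonal matrix can be taken to be a rotation, as the figure suggests. The following is a minimal sketch in plain Python, assuming the standard single Jacobi rotation with angle <math>\theta = \tfrac{1}{2}\operatorname{atan2}(2b,\, a - d)</math> for the matrix with rows <math>(a, b)</math> and <math>(b, d)</math>. The function name <code>diagonalize_symmetric_2x2</code> is illustrative, and the result is only accurate up to floating-point rounding.

```python
import math


def diagonalize_symmetric_2x2(a, b, d):
    """Return (Q, D) with Q a rotation and D = Q^T A Q for A = [[a, b], [b, d]]."""
    # Angle of the single Jacobi rotation that zeroes the off-diagonal entry.
    theta = 0.5 * math.atan2(2 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    Q = [[c, -s], [s, c]]  # orthogonal: columns are orthonormal
    A = [[a, b], [b, d]]
    # Compute A Q, then Q^T (A Q), entry by entry.
    AQ = [[sum(A[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    D = [[sum(Q[k][i] * AQ[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    return Q, D


# Example: the symmetric matrix [[2, 1], [1, 2]], whose eigenvalues are 3 and 1.
Q, D = diagonalize_symmetric_2x2(2.0, 1.0, 2.0)
print(D)
```

For this input the rotation angle is <math>\pi/4</math>, and the off-diagonal entries of the printed matrix vanish up to rounding error.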

For most practical work, matrices are diagonalized numerically using computer software. [[eigenvalue algorithm|Many algorithms]] exist to accomplish this.

== Simultaneous diagonalization ==
{{See also|Triangular matrix#Simultaneous triangularisability|l1=Simultaneous triangularisability|Weight (representation theory)|Positive definite matrix#Simultaneous_diagonalization|l3=Positive definite matrix}}

A set of matrices is said to be ''simultaneously diagonalizable'' if there exists a single invertible matrix <math>P</math> such that <math>P^{-1}AP</math> is a diagonal matrix for every <math>A</math> in the set. The following theorem characterizes simultaneously diagonalizable matrices: A set of diagonalizable [[Commuting matrices|matrices commutes]] if and only if the set is simultaneously diagonalizable.<ref name="HornJohnson">{{cite book|title=Matrix Analysis, second edition|last1=Horn|first1=Roger A.|last2=Johnson|first2=Charles R.|publisher=Cambridge University Press|year=2013|isbn=9780521839402}}</ref>{{rp|p. 64}}

The set of all <math>n \times n</math> diagonalizable matrices (over {{nowrap|<math>\Complex</math>)}} with <math>n > 1</math> is not simultaneously diagonalizable. For instance, the matrices

:<math> \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} </math>

are diagonalizable but not simultaneously diagonalizable because they do not commute.
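
The failure to commute can be confirmed directly. The following short sketch in plain Python (the helper <code>matmul</code> is illustrative) computes both products of the two matrices above.

```python
A = [[1, 0], [0, 0]]
B = [[1, 1], [0, 0]]


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


AB = matmul(A, B)
BA = matmul(B, A)
print(AB)        # [[1, 1], [0, 0]]
print(BA)        # [[1, 0], [0, 0]]
print(AB == BA)  # False: the matrices do not commute
```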

A set consists of commuting [[normal matrix|normal matrices]] if and only if it is simultaneously diagonalizable by a [[unitary matrix]]; that is, there exists a unitary matrix <math>U</math> such that <math>U^{*} AU</math> is diagonal for every <math>A</math> in the set.

In the language of [[Lie theory]], a set of simultaneously diagonalizable matrices generates a [[toral Lie algebra]].

== Examples ==

=== Diagonalizable matrices ===
* [[Involution (mathematics)|Involution]]s are diagonalizable over the reals (and indeed any field of characteristic not 2), with ±1 on the diagonal.
* Finite order [[endomorphism]]s are diagonalizable over <math>\mathbb{C}</math> (or any algebraically closed field where the characteristic of the field does not divide the order of the endomorphism) with [[roots of unity]] on the diagonal. This follows since the minimal polynomial is [[separable polynomial|separable]], because the roots of unity are distinct.
* [[Projection (linear algebra)|Projections]] are diagonalizable, with 0s and 1s on the diagonal.
* Real [[symmetric matrices]] are diagonalizable by [[orthogonal matrix|orthogonal matrices]]; i.e., given a real symmetric matrix {{nowrap|<math>A</math>,}} <math>Q^{\mathrm T}AQ</math> is diagonal for some orthogonal matrix {{nowrap|<math>Q</math>.}} More generally, matrices are diagonalizable by [[unitary matrix|unitary matrices]] if and only if they are [[normal matrix|normal]]. In the case of the real symmetric matrix, we see that {{nowrap|<math>A=A^{\mathrm T}</math>,}} so clearly <math>AA^{\mathrm T} = A^{\mathrm T}A</math> holds. Examples of normal matrices are real symmetric (or [[Skew-symmetric matrix|skew-symmetric]]) matrices (e.g. covariance matrices) and [[Hermitian matrix|Hermitian matrices]] (or skew-Hermitian matrices). See [[spectral theorem]]s for generalizations to infinite-dimensional vector spaces.

=== Matrices that are not diagonalizable ===
In general, a [[rotation matrix]] is not diagonalizable over the reals, but all [[rotation matrix#Independent planes|rotation matrices]] are diagonalizable over the complex field. Even if a matrix is not diagonalizable, it is always possible to "do the best one can", and find a matrix with the same properties consisting of eigenvalues on the leading diagonal, and either ones or zeroes on the superdiagonal – known as [[Jordan Normal Form|Jordan normal form]].

Some matrices are not diagonalizable over any field, most notably nonzero [[nilpotent matrix|nilpotent matrices]]. This happens more generally if the [[Eigenvalues and eigenvectors#Algebraic multiplicity|algebraic and geometric multiplicities]] of an eigenvalue do not coincide. For instance, consider

:<math> C = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}. </math>

This matrix is not diagonalizable: there is no matrix <math>U</math> such that <math>U^{-1}CU</math> is a diagonal matrix. Indeed, <math>C</math> has one eigenvalue (namely zero) and this eigenvalue has algebraic multiplicity 2 and geometric multiplicity 1.
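
One way to see this, sketched here as a short argument by contradiction: a diagonal matrix similar to <math>C</math> would carry the eigenvalues of <math>C</math>, all equal to zero, on its diagonal, and so would be the zero matrix. But then
<math display="block">U^{-1}CU = 0 \quad\Longrightarrow\quad C = U\,0\,U^{-1} = 0,</math>
which contradicts {{nowrap|<math>C \neq 0</math>.}}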

Some real matrices are not diagonalizable over the reals. Consider for instance the matrix

:<math> B = \left[\begin{array}{rr} 0 & 1 \\ \!-1 & 0 \end{array}\right]. </math>

The matrix <math>B</math> does not have any real eigenvalues, so there is no '''real''' matrix <math>Q</math> such that <math>Q^{-1}BQ</math> is a diagonal matrix. However, we can diagonalize <math>B</math> if we allow complex numbers. Indeed, if we take

:<math> Q = \begin{bmatrix} 1 & i \\ i & 1 \end{bmatrix}, </math>

then <math>Q^{-1}BQ</math> is diagonal. One can check that <math>B</math> is the rotation matrix that rotates counterclockwise by the angle {{nowrap|<math display="inline">\theta = -\frac{\pi}{2}</math>,}} that is, clockwise by a quarter turn.
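
The complex diagonalization of <math>B</math> can be verified with Python's built-in complex numbers. This is a minimal sketch: the inverse is computed with the 2 × 2 adjugate formula, and the names <code>Q_inv</code> and <code>matmul</code> are illustrative.

```python
# B and Q as in the text; 1j is Python's imaginary unit.
B = [[0, 1], [-1, 0]]
Q = [[1, 1j], [1j, 1]]

# Determinant of Q: 1 - i^2 = 2, so Q is invertible.
det_Q = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]

# Inverse of a 2x2 matrix via the adjugate formula.
Q_inv = [[Q[1][1] / det_Q, -Q[0][1] / det_Q],
         [-Q[1][0] / det_Q, Q[0][0] / det_Q]]


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


D = matmul(Q_inv, matmul(B, Q))
print(D)  # diagonal, with the eigenvalues i and -i of B on the diagonal
```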

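Note that the sum of diagonalizable matrices need not be diagonalizable: the non-diagonalizable matrix <math>C</math> above is the sum of the diagonalizable matrices <math>\begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}</math> and {{nowrap|<math>\begin{bmatrix} -1 & 0 \\ 0 & 0 \end{bmatrix}</math>.}}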

=== How to diagonalize a matrix ===
Diagonalizing a matrix is the same process as finding its [[eigenvalues and eigenvectors]], in the case that the eigenvectors form a basis. For example, consider the matrix

:<math>A=\left[\begin{array}{rrr}
0 & 1 & \!\!\!-2\\
0 & 1 & 0\\
1 & \!\!\!-1 & 3
\end{array}\right].</math>

The roots of the [[characteristic polynomial]] <math>p(\lambda)=\det(\lambda I-A)</math> are the eigenvalues {{nowrap|<math>\lambda_1 = 1,\lambda_2 = 1,\lambda_3 = 2</math>.}} Solving the linear system <math>\left(1I-A\right) \mathbf{v} = \mathbf{0}</math> gives the eigenvectors <math>\mathbf{v}_1 = (1,1,0)</math> and {{nowrap|<math>\mathbf{v}_2 = (0,2,1)</math>,}} while <math>\left(2I-A\right)\mathbf{v} = \mathbf{0}</math> gives {{nowrap|<math>\mathbf{v}_3 = (1,0,-1)</math>;}} that is, <math>A \mathbf{v}_i = \lambda_i \mathbf{v}_i</math> for {{nowrap|<math>i = 1,2,3</math>.}} These vectors form a basis of {{nowrap|<math>V = \mathbb{R}^3</math>,}} so we can assemble them as the column vectors of a [[Change of basis|change-of-basis]] matrix <math>P</math> to get:
<math display="block">P^{-1}AP =
\left[\begin{array}{rrr}
1 & 0 & 1\\
1 & 2 & 0\\
0 & 1 & \!\!\!\!-1
\end{array}\right]^{-1}

\left[\begin{array}{rrr}
0 & 1 & \!\!\!-2\\
0 & 1 & 0\\
1 & \!\!\!-1 & 3
\end{array}\right]

\left[\begin{array}{rrr}
1 & \,0 & 1\\
1 & 2 & 0\\
0 & 1 & \!\!\!\!-1
\end{array}\right]
=
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix} = D .</math>
We may see this equation in terms of transformations: <math>P</math> takes the standard basis to the eigenbasis, {{nowrap|<math>P \mathbf{e}_i = \mathbf{v}_i</math>,}} so we have:
<math display="block">P^{-1} AP \mathbf{e}_i =
P^{-1} A \mathbf{v}_i =
P^{-1} (\lambda_i\mathbf{v}_i) =
\lambda_i\mathbf{e}_i,</math>
so that <math>P^{-1} AP</math> has the standard basis as its eigenvectors, which is the defining property of {{nowrap|<math>D</math>.}} 

Note that there is no preferred order of the eigenvectors in {{nowrap|<math>P</math>;}} changing the order of the [[eigenvectors]] in <math>P</math> just changes the order of the [[eigenvalues]] in the diagonalized form of {{nowrap|<math>A</math>.}}<ref>{{cite book| last1=Anton |first1=H. |last2= Rorres|first2= C. |title=Elementary Linear Algebra (Applications Version) | url=https://archive.org/details/studentsolutions00grob | url-access=registration |publisher=John Wiley & Sons|edition=8th|date=22 Feb 2000| ISBN= 978-0-471-17052-5}}</ref>
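
The worked example can be reproduced exactly with rational arithmetic. The following is a minimal sketch in plain Python using the standard <code>fractions</code> module; the helpers <code>matmul</code> and <code>inverse</code> are illustrative, and <code>inverse</code> assumes its argument is invertible. The second computation reorders the eigenvectors to show the corresponding reordering of the eigenvalues.

```python
from fractions import Fraction

A = [[0, 1, -2], [0, 1, 0], [1, -1, 3]]
P = [[1, 0, 1], [1, 2, 0], [0, 1, -1]]           # columns v1, v2, v3
P_reordered = [[1, 1, 0], [0, 1, 2], [-1, 0, 1]]  # columns v3, v1, v2


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def inverse(M):
    """Exact inverse by Gauss-Jordan elimination (assumes M is invertible)."""
    n = len(M)
    # Augment M with the identity matrix, using exact rationals.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        # Swap a row with a nonzero pivot into position.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot equals 1.
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Clear the pivot column in every other row.
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


D = matmul(inverse(P), matmul(A, P))
D_reordered = matmul(inverse(P_reordered), matmul(A, P_reordered))
print(D == [[1, 0, 0], [0, 1, 0], [0, 0, 2]])            # True
print(D_reordered == [[2, 0, 0], [0, 1, 0], [0, 0, 1]])  # True
```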

== Application to matrix functions ==
Diagonalization can be used to efficiently compute the powers of a matrix {{nowrap|<math>A = PDP^{-1}</math>:}}

: <math>\begin{align} 
  A^k &= \left(PDP^{-1}\right)^k = \left(PDP^{-1}\right) \left(PDP^{-1}\right) \cdots \left(PDP^{-1}\right) \\
      &= PD\left(P^{-1}P\right) D \left(P^{-1}P\right) \cdots \left(P^{-1}P\right) D P^{-1} = PD^kP^{-1},
\end{align}</math>

and the latter is easy to calculate since it only involves the powers of a diagonal matrix. For example, for the matrix <math>A</math> with eigenvalues <math>\lambda = 1,1,2</math> in the example above we compute:

: <math>\begin{align}
  A^k = PD^kP^{-1}
  &= \left[\begin{array}{rrr}
       1 & \,0 &          1 \\
       1 &   2 &          0 \\
       0 &   1 & \!\!\!\!-1
     \end{array}\right]
     \begin{bmatrix} 1^k & 0 & 0 \\ 0 & 1^k & 0 \\ 0 & 0 & 2^k \end{bmatrix}
     \left[\begin{array}{rrr}
       1 & \,0 &          1 \\
       1 &   2 &          0 \\
       0 &   1 & \!\!\!\!-1
     \end{array}\right]^{-1} \\[1em]
  &= \begin{bmatrix}
         2 - 2^k & -1 + 2^k &  2 - 2^{k + 1} \\
               0 &        1 &              0 \\
        -1 + 2^k &  1 - 2^k & -1 + 2^{k + 1}
      \end{bmatrix}.
\end{align}</math>
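
The closed form for <math>A^k</math> can be compared with repeated multiplication. This minimal sketch in plain Python uses exact integers; the function <code>closed_form</code> simply transcribes the matrix displayed above, and the range of exponents tested is an arbitrary choice.

```python
A = [[0, 1, -2], [0, 1, 0], [1, -1, 3]]


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def closed_form(k):
    """The closed-form expression for A^k obtained from P D^k P^{-1}."""
    t = 2 ** k
    return [[2 - t, -1 + t, 2 - 2 * t],
            [0, 1, 0],
            [-1 + t, 1 - t, -1 + 2 * t]]


power = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # A^0 is the identity
checks = []
for k in range(11):
    checks.append(power == closed_form(k))  # compare A^k with the formula
    power = matmul(power, A)                # advance to A^(k+1)

print(all(checks))  # True
```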

This approach can be generalized to the [[matrix exponential]] and other [[matrix function]]s that can be defined as power series. For example, defining {{nowrap|<math display="inline">\exp(A) = I + A + \frac{1}{2!}A^2 + \frac{1}{3!}A^3 + \cdots</math>,}} we have:
: <math>\begin{align}
  \exp(A) = P \exp(D) P^{-1}
  &= \left[\begin{array}{rrr}
       1 & \,0 &          1 \\
       1 &   2 &          0 \\
       0 &   1 & \!\!\!\!-1
     \end{array}\right]
     \begin{bmatrix} e^1 & 0 & 0 \\ 0 & e^1 & 0 \\ 0 & 0 & e^2 \end{bmatrix}
     \left[\begin{array}{rrr}
       1 & \,0 & 1\\
       1 & 2 & 0\\
       0 & 1 & \!\!\!\!-1
     \end{array}\right]^{-1} \\[1em]
  &= \begin{bmatrix}
       2 e - e^2 & -e + e^2 & 2 e - 2 e^2 \\
               0 &        e &           0 \\
        -e + e^2 &  e - e^2 &  -e + 2 e^2
     \end{bmatrix}.
\end{align}</math>

This is particularly useful in finding closed form expressions for terms of [[linear recursive sequences]], such as the [[Fibonacci number#Matrix form|Fibonacci numbers]].
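
As a hedged illustration for the Fibonacci numbers: the recurrence can be encoded by the matrix with rows <math>(1, 1)</math> and <math>(1, 0)</math>, whose eigenvalues are <math>\varphi = \tfrac{1 + \sqrt 5}{2}</math> and <math>\psi = \tfrac{1 - \sqrt 5}{2}</math>, and diagonalizing it yields the well-known closed form <math>F_n = (\varphi^n - \psi^n)/\sqrt 5</math> (this matrix and formula are not derived in this article). The sketch below, in plain Python with illustrative function names, compares the closed form with the recurrence. Because it uses floating-point arithmetic, it is only reliable for moderate <math>n</math>.

```python
import math


def fib_iterative(n):
    """Fibonacci number F_n from the recurrence, with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_closed_form(n):
    """F_n from the eigenvalues of [[1, 1], [1, 0]] (floating point, rounded)."""
    sqrt5 = math.sqrt(5)
    phi = (1 + sqrt5) / 2
    psi = (1 - sqrt5) / 2
    return round((phi ** n - psi ** n) / sqrt5)


values = [fib_closed_form(n) for n in range(10)]
print(values)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```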

=== Particular application ===
For example, consider the following matrix:

:<math>M = \begin{bmatrix}a & b - a\\ 0 & b\end{bmatrix}.</math>

Calculating the various powers of <math>M</math> reveals a surprising pattern:

:<math>
  M^2 = \begin{bmatrix}a^2 & b^2-a^2 \\ 0 &b^2 \end{bmatrix},\quad
  M^3 = \begin{bmatrix}a^3 & b^3-a^3 \\ 0 &b^3 \end{bmatrix},\quad
  M^4 = \begin{bmatrix}a^4 & b^4-a^4 \\ 0 &b^4 \end{bmatrix},\quad
  \ldots
</math>

The above phenomenon can be explained by diagonalizing {{nowrap|<math>M</math>.}}  To accomplish this, we need a basis of <math>\R^2</math> consisting of eigenvectors of {{nowrap|<math>M</math>.}}  One such eigenvector basis is given by

:<math>
  \mathbf{u} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \mathbf{e}_1,\quad
  \mathbf{v} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \mathbf{e}_1 + \mathbf{e}_2,
</math>

where '''e'''<sub>''i''</sub> denotes the standard basis of '''R'''<sup>2</sup>. The reverse change of basis is given by

:<math>\mathbf{e}_1 = \mathbf{u},\qquad \mathbf{e}_2 = \mathbf{v} - \mathbf{u}.</math>

Straightforward calculations show that

:<math>M\mathbf{u} = a\mathbf{u},\qquad M\mathbf{v} = b\mathbf{v}.</math>

Thus, ''a'' and ''b'' are the eigenvalues corresponding to '''u''' and '''v''', respectively. Applying <math>M</math> repeatedly, we have that

:<math> M^n \mathbf{u} = a^n \mathbf{u},\qquad M^n \mathbf{v} = b^n \mathbf{v}.</math>

Switching back to the standard basis, we have

:<math>\begin{align}
  M^n \mathbf{e}_1 &= M^n \mathbf{u} = a^n \mathbf{e}_1, \\
  M^n \mathbf{e}_2 &= M^n \left(\mathbf{v} - \mathbf{u}\right) = b^n \mathbf{v} - a^n\mathbf{u} = \left(b^n - a^n\right) \mathbf{e}_1 + b^n\mathbf{e}_2.
\end{align}</math>

The preceding relations, expressed in matrix form, are

:<math>M^n = \begin{bmatrix} a^n & b^n - a^n \\ 0 & b^n \end{bmatrix}, </math>

thereby explaining the above phenomenon.
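
The pattern can also be checked numerically for particular values. The following minimal sketch in plain Python uses the arbitrary sample values <code>a = 3</code> and <code>b = 5</code> and an illustrative helper <code>matmul</code>; it compares <math>M^n</math>, computed by repeated multiplication, with the formula above.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


a, b = 3, 5                   # arbitrary sample values
M = [[a, b - a], [0, b]]

power = M                     # M^1
results = []
for n in range(1, 8):
    expected = [[a ** n, b ** n - a ** n], [0, b ** n]]
    results.append(power == expected)  # compare M^n with the formula
    power = matmul(power, M)           # advance to M^(n+1)

print(all(results))  # True
```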

== Quantum mechanical application ==
In [[quantum mechanics|quantum mechanical]] and [[quantum chemistry|quantum chemical]] computations, matrix diagonalization is one of the most frequently applied numerical processes. The basic reason is that the time-independent [[Schrödinger equation]] is an eigenvalue equation, albeit in most physical situations on an infinite-dimensional [[Hilbert space]].

A very common approximation is to truncate (or project) the Hilbert space to finite dimension, after which the Schrödinger equation can be formulated as an eigenvalue problem of a real symmetric or complex Hermitian matrix. Formally, this approximation is founded on the [[variational principle]], valid for Hamiltonians that are bounded from below.

[[Perturbation theory (quantum mechanics)#First order corrections|First-order perturbation theory]] also leads to a matrix eigenvalue problem for degenerate states.

== See also ==
* [[Defective matrix]]
* [[Scaling (geometry)]]
* [[Triangular matrix]]
* [[Semisimple operator]]
* [[Diagonalizable group]]
* [[Jordan normal form]]
* [[Weight module]] – associative algebra generalization
* [[Orthogonal diagonalization]]

== Notes ==
{{notelist}}

== References ==
{{reflist}}

{{Matrix classes}}

[[Category:Matrices (mathematics)]]