Editing Matrix exponential

{{Short description|Matrix operation generalizing exponentiation of scalar numbers}}
{{Use American English|date = January 2019}}
In [[mathematics]], the '''matrix exponential''' is a [[matrix function]] on [[square matrix|square matrices]] analogous to the ordinary [[exponential function]]. It is used to solve systems of linear differential equations. In the theory of Lie groups, the matrix exponential gives the [[exponential map (Lie theory)|exponential map]] between a matrix [[Lie algebra]] and the corresponding [[Lie group]].

Let {{mvar|X}}  be an {{math|''n'' × ''n''}} [[real number|real]] or [[complex number|complex]] [[matrix (mathematics)|matrix]]. The exponential of {{mvar|X}}, denoted by {{math|''e''<sup>''X''</sup>}} or {{math|exp(''X'')}}, is the {{math|''n'' × ''n''}} matrix given by the [[power series]]

<math display="block">e^X = \sum_{k=0}^\infty \frac{1}{k!} X^k</math>

where <math>X^0</math> is defined to be the identity matrix <math>I</math> with the same dimensions as <math>X</math>, and {{tmath|1=X^k = X X^{{mset|k-1}}}}.<ref>{{harvnb|Hall|2015}} Equation 2.1</ref> The series always converges, so the exponential of {{mvar|X}} is well-defined.

Equivalently,
<math display="block">e^X = \lim_{k \rightarrow \infty} \left(I + \frac{X}{k} \right)^k</math>

for integer-valued {{mvar|k}}, where {{mvar|I}} is the {{math|''n'' × ''n''}} [[identity matrix]]. 

Equivalently, given by the solution to the differential equation

<math display="block">\frac d {dt} e^{X t} = X e^{X t}, \quad e^{X 0} = I</math>

When {{mvar|X}} is an {{math|''n'' × ''n''}} [[diagonal matrix]] then {{math|exp(''X'')}} will be an {{math|''n'' × ''n''}} diagonal matrix with each diagonal element equal to the ordinary [[Exponential function|exponential]] applied to the corresponding diagonal element of {{mvar|X}}.

== Properties ==
=== Elementary properties ===
Let {{math|''X''}} and {{math|''Y''}}  be {{math|''n'' × ''n''}} complex matrices and let {{math|''a''}} and {{math|''b''}} be arbitrary complex numbers. We denote the {{math|''n'' × ''n''}} [[identity matrix]] by {{math|''I''}} and the [[zero matrix]] by 0. The matrix exponential satisfies the following properties.<ref>{{harvnb|Hall|2015}} Proposition 2.3</ref>

We begin with the properties that are immediate consequences of the definition as a power series:
* {{math|1=''e''<sup>0</sup> = ''I''}}
* {{math|1=exp(''X''<sup>T</sup>) = (exp ''X'')<sup>T</sup>}}, where {{math|''X''<sup>T</sup>}} denotes the [[transpose]] of {{math|''X''}}.
* {{math|1=exp(''X''<sup>∗</sup>) = (exp ''X'')<sup>∗</sup>}}, where {{math|''X''<sup>∗</sup>}} denotes the [[conjugate transpose]] of {{math|''X''}}.
* If {{math|''Y''}} is [[invertible matrix|invertible]] then {{math|1=''e''<sup>''YXY''<sup>−1</sup></sup> = ''Ye''<sup>''X''</sup>''Y''<sup>−1</sup>.}}

The next key result is this one:
* If <math>XY=YX</math> then <math>e^Xe^Y=e^{X+Y}</math>.

The proof of this identity is the same as the standard power-series argument for the corresponding identity for the exponential of real numbers. That is to say, ''as long as <math>X</math> and <math>Y</math> commute'', it makes no difference to the argument whether <math>X</math> and <math>Y</math> are numbers or matrices. It is important to note that this identity typically does not hold if <math>X</math> and <math>Y</math> do not commute (see [[#Inequalities for exponentials of Hermitian matrices|Golden-Thompson inequality]] below).

Consequences of the preceding identity are the following:
* {{math|1=''e''<sup>''aX''</sup>''e''<sup>''bX''</sup> = ''e''<sup>(''a'' + ''b'')''X''</sup>}}
* {{math|1=''e''<sup>''X''</sup>''e''<sup>−''X''</sup> = ''I''}}

Using the above results, we can easily verify the following claims. If {{math|''X''}} is [[symmetric matrix|symmetric]] then {{math|''e''<sup>''X''</sup>}} is also symmetric, and if {{math|''X''}} is [[skew-symmetric matrix|skew-symmetric]] then {{math|''e''<sup>''X''</sup>}} is [[orthogonal matrix|orthogonal]]. If {{math|''X''}} is [[Hermitian matrix|Hermitian]] then {{math|''e''<sup>''X''</sup>}} is also Hermitian, and if {{math|''X''}} is [[skew-Hermitian matrix|skew-Hermitian]] then {{math|''e''<sup>''X''</sup>}} is [[unitary matrix|unitary]].

Finally, a [[Laplace transform]] of matrix exponentials amounts to the [[resolvent formalism|resolvent]],
<math display="block">\int_0^\infty e^{-ts}e^{tX}\,dt = (sI - X)^{-1}</math>
for all sufficiently large positive values of {{mvar|s}}.

=== Linear differential equation systems ===

{{Main|Matrix differential equation}}

One of the reasons for the importance of the matrix exponential is that it can be used to solve systems of linear [[ordinary differential equations]]. The solution of
<math display="block"> \frac{d}{dt} y(t) = Ay(t), \quad y(0) = y_0, </math>
where {{mvar|A}} is a constant matrix and ''y'' is a column vector, is given by
<math display="block"> y(t) = e^{At} y_0. </math>

The matrix exponential can also be used to solve the inhomogeneous equation
<math display="block"> \frac{d}{dt} y(t) = Ay(t) + z(t), \quad y(0) = y_0. </math>
See the section on [[#Applications|applications]] below for examples.

There is no closed-form solution for differential equations of the form
<math display="block"> \frac{d}{dt} y(t) = A(t) \, y(t), \quad y(0) = y_0, </math>
where {{mvar|A}} is not constant, but the [[Magnus series]] gives the solution as an infinite sum.

=== The determinant of the matrix exponential ===
By [[Jacobi's formula]], for any complex square matrix the following [[trace identity]] holds:<ref>{{harvnb|Hall|2015}} Theorem 2.12</ref>
{{Equation box 1
|indent =
|equation = <math>\det\left(e^A\right) = e^{\operatorname{tr}(A)}~.</math>
|cellpadding = 6
|border
|border colour = #0073CF
|bgcolor = #000000
}}

In addition to providing a computational tool, this formula demonstrates that a matrix exponential is always an [[invertible matrix]]. This follows from the fact that the right hand side of the above equation is always non-zero, and so {{math|det(''e<sup>A</sup>'') ≠ 0}}, which implies that {{math|''e<sup>A</sup>''}} must be invertible.

In the real-valued case, the formula also exhibits the map
<math display="block">\exp \colon M_n(\R) \to \mathrm{GL}(n, \R)</math>
to not be [[surjective function|surjective]], in contrast to the complex case mentioned earlier. This follows from the fact that, for real-valued matrices, the right-hand side of the formula is always positive, while there exist invertible matrices with a negative determinant.

=== Real symmetric matrices ===

The matrix exponential of a real symmetric matrix is positive definite. Let <math>S</math> be an {{math|''n'' × ''n''}} real symmetric matrix and <math>x \in \R^n</math> a column vector. Using the elementary properties of the matrix exponential and of symmetric matrices, we have:

<math display="block">x^Te^Sx=x^Te^{S/2}e^{S/2}x=x^T(e^{S/2})^Te^{S/2}x =(e^{S/2}x)^Te^{S/2}x=\lVert e^{S/2}x\rVert^2\geq 0.</math>

Since <math>e^{S/2}</math> is invertible, the equality only holds for <math>x=0</math>, and we have <math>x^Te^Sx > 0</math> for all non-zero <math>x</math>. Hence <math>e^S</math> is positive definite.

== The exponential of sums ==

For any real numbers (scalars) {{mvar|x}} and {{mvar|y}} we know that the exponential function satisfies {{math|1=''e''<sup>''x''+''y''</sup> = ''e''<sup>''x''</sup> ''e''<sup>''y''</sup>}}. The same is true for commuting matrices.  If matrices {{mvar|X}} and {{mvar|Y}} commute (meaning that {{math|1=''XY'' = ''YX''}}), then,
<math display="block">e^{X+Y} = e^Xe^Y.</math>

However, for matrices that do not commute the above equality does not necessarily hold.

=== The Lie product formula ===
Even if {{mvar|X}} and {{mvar|Y}} do not commute, the exponential {{math|''e''<sup>''X'' + ''Y''</sup>}} can be computed by the [[Lie product formula]]<ref>{{harvnb|Hall|2015}} Theorem 2.11</ref>
<math display="block">e^{X+Y} = \lim_{k\to\infty} \left(e^{\frac{1}{k}X}e^{\frac{1}{k}Y}\right)^k.</math>

Using a large finite {{mvar|k}} to approximate the above is basis of the Suzuki-Trotter expansion, often used in [[Time-evolving block decimation#The Suzuki–Trotter expansion|numerical time evolution]].

=== The Baker–Campbell–Hausdorff formula ===
In the other direction, if {{mvar|X}} and {{mvar|Y}} are sufficiently small (but not necessarily commuting) matrices, we have
<math display="block">e^Xe^Y = e^Z,</math>
where {{mvar|Z}} may be computed as a series in [[commutator]]s of {{mvar|X}} and {{mvar|Y}} by means of the [[Baker–Campbell–Hausdorff formula]]:<ref>{{harvnb|Hall|2015}} Chapter 5</ref>
<math display="block">Z = X + Y + \frac{1}{2}[X,Y] + \frac{1}{12}[X,[X,Y]] - \frac{1}{12}[Y,[X,Y]]+ \cdots,</math>
where the remaining terms are all iterated commutators involving {{mvar|X}} and {{mvar|Y}}. If {{mvar|X}} and {{mvar|Y}} commute, then all the commutators are zero and we have simply {{math|1=''Z'' = ''X'' + ''Y''}}.

== Inequalities for exponentials of Hermitian matrices ==
{{Main|Golden–Thompson inequality}}
For [[Hermitian matrix|Hermitian matrices]] there is a notable theorem related to the [[Matrix trace|trace]] of matrix exponentials.

If {{mvar|A}} and {{mvar|B}} are Hermitian matrices, then<ref>{{cite book | author=Bhatia, R. | title=Matrix Analysis |series=Graduate Texts in Mathematics|isbn=978-0-387-94846-1 | year = 1997 | publisher=Springer | volume=169}}</ref>
<math display="block">\operatorname{tr}\exp(A + B) \leq \operatorname{tr}\left[\exp(A)\exp(B)\right].</math>

There is no requirement of commutativity.  There are counterexamples to show that the Golden–Thompson inequality cannot be extended to three matrices – and, in any event, {{math|tr(exp(''A'')exp(''B'')exp(''C''))}} is not guaranteed to be real for Hermitian {{math|''A''}}, {{math|''B''}}, {{math|''C''}}.  However, [[Elliott H. Lieb|Lieb]] proved<ref>{{cite journal| doi=10.1016/0001-8708(73)90011-X | doi-access=free | last1=Lieb | first1=Elliott H. | authorlink1=Elliott H. Lieb | title=Convex trace functions and the Wigner–Yanase–Dyson conjecture | journal=[[Advances in Mathematics]]| volume=11 | pages=267–288 | year=1973|issue=3| url = http://www.numdam.org/item/RCP25_1973__19__A4_0/ }}</ref><ref>{{cite journal|doi=10.1007/BF01646492 | author=H. Epstein | title=Remarks on two theorems of E. Lieb | journal= Communications in Mathematical Physics|volume=31|pages=317–325 | year=1973| issue=4 | bibcode=1973CMaPh..31..317E | s2cid=120096681 | url=http://projecteuclid.org/euclid.cmp/1103859039 }}</ref> that it can be generalized to three matrices if we modify the expression as follows
<math display="block">\operatorname{tr}\exp(A + B + C) \leq \int_0^\infty \mathrm{d}t\, \operatorname{tr}\left[e^A\left(e^{-B} + t\right)^{-1}e^C \left(e^{-B} + t\right)^{-1}\right].</math>

== The exponential map ==
The exponential of a matrix is always an [[invertible matrix]]. The inverse matrix of {{math|''e''<sup>''X''</sup>}} is given by {{math|''e''<sup>−''X''</sup>}}. This is analogous to the fact that the exponential of a complex number is always nonzero. The matrix exponential then gives us a map
<math display="block">\exp \colon M_n(\Complex) \to \mathrm{GL}(n, \Complex)</math>
from the space of all ''n'' × ''n'' matrices to the [[general linear group]] of degree {{mvar|n}}, i.e. the [[group (mathematics)|group]] of all ''n'' × ''n'' invertible matrices. In fact, this map is [[surjective]] which means that every invertible matrix can be written as the exponential of some other matrix<ref>{{harvnb|Hall|2015}} Exercises 2.9 and 2.10</ref> (for this, it is essential to consider the field '''C''' of complex numbers and not '''R''').

For any two matrices {{mvar|X}} and {{mvar|Y}},
<math display="block">\left\| e^{X+Y} - e^X\right\| \le \|Y\| e^{\|X\|} e^{\|Y\|}, </math>

where {{math|‖ · ‖}} denotes an arbitrary [[matrix norm]]. It follows that the exponential map is [[continuity (mathematics)|continuous]] and [[Lipschitz continuous]] on [[compact set|compact]] subsets of {{math|''M''<sub>''n''</sub>('''C''')}}.

The map
<math display="block">t \mapsto e^{tX}, \qquad t \in \R</math>
defines a [[Smooth function#Smoothness|smooth]] curve in the general linear group which passes through the identity element at {{math|1=''t'' = 0}}.

In fact, this gives a [[one-parameter subgroup]] of the general linear group since
<math display="block">e^{tX}e^{sX} = e^{(t + s)X}.</math>

The derivative of this curve (or [[tangent vector]]) at a point ''t'' is given by
{{NumBlk||<math display="block">\frac{d}{dt}e^{tX} = Xe^{tX} = e^{tX}X.</math>|{{EquationRef|1}}}}
The derivative at {{math|1=''t'' = 0}} is just the matrix ''X'', which is to say that ''X'' generates this one-parameter subgroup.

More generally,<ref>{{cite journal|doi=10.1063/1.1705306 | author = R. M. Wilcox | title=Exponential Operators and Parameter Differentiation in Quantum Physics | journal=Journal of Mathematical Physics | volume=8 | pages=962–982 | year=1967|issue=4| bibcode = 1967JMP.....8..962W }}</ref> for a generic {{mvar|t}}-dependent exponent, {{math|''X''(''t'')}},
{{Equation box 1
|indent =
|equation = <math>\frac{d}{dt}e^{X(t)} = \int_0^1 e^{\alpha X(t)} \frac{dX(t)}{dt} e^{(1 - \alpha) X(t)}\,d\alpha ~.  </math>
|cellpadding=
|border
|border colour = #0073CF
|bgcolor=#000000
}}

Taking the above expression {{math|''e''<sup>''X''(''t'')</sup>}} outside the integral sign and expanding the integrand with the help of the [[Baker–Campbell–Hausdorff formula|Hadamard lemma]] one can obtain the following useful expression for the derivative of the matrix exponent,<ref>{{harvnb|Hall|2015}} Theorem 5.4</ref>
<math display="block">e^{-X(t)}\left(\frac{d}{dt}e^{X(t)}\right) = \frac{d}{dt}X(t) - \frac{1}{2!} \left[X(t), \frac{d}{dt}X(t)\right] + \frac{1}{3!} \left[X(t), \left[X(t), \frac{d}{dt}X(t)\right]\right] - \cdots </math>

The coefficients in the expression above are different from what appears in the exponential. For a closed form, see [[derivative of the exponential map]].

=== Directional derivatives when restricted to Hermitian matrices ===

Let <math>X</math> be a <math>n \times n</math> Hermitian matrix with distinct eigenvalues. Let <math>X = E \textrm{diag}(\Lambda) E^*</math> be its eigen-decomposition where <math>E</math> is a unitary matrix whose columns are the eigenvectors of <math>X</math>, <math>E^*</math> is its conjugate transpose, and <math>\Lambda = \left(\lambda_1, \ldots, \lambda_n\right)</math> the vector of corresponding eigenvalues. Then, for any <math>n \times n</math> Hermitian matrix <math>V</math>, the [[directional derivative]] of <math>\exp: X \to e^X</math> at <math>X</math> in the direction <math>V</math> is
<ref name="lewis">{{cite journal |
first1=Adrian S. |
last1=Lewis |
first2=Hristo S. |
last2=Sendov |
title=Twice differentiable spectral functions |
journal=SIAM Journal on Matrix Analysis and Applications |
volume=23 |
issue=2 |
pages=368–386 |
date=2001 |
doi=10.1137/S089547980036838X |
url=https://people.orie.cornell.edu/aslewis/publications/01-twice.pdf
}} See Theorem 3.3.</ref>
<ref name="deledalle">{{cite journal |
first1=Charles-Alban |
last1=Deledalle |
first2=Loïc |
last2=Denis |
first3=Florence |
last3=Tupin |
title=Speckle reduction in matrix-log domain for synthetic aperture radar imaging |
journal=Journal of Mathematical Imaging and Vision |
date=2022 |
volume=64 |
issue=3 |
pages=298–320 |
doi=10.1007/s10851-022-01067-1 | doi-access=free
|
bibcode=2022JMIV...64..298D }} See Propositions 1 and 2.
</ref>
<math display="block">
D \exp (X) [V]
\triangleq
\lim_{\epsilon \to 0} 
\frac{1}{\epsilon}
\left(\displaystyle
e^{X + \epsilon V}
-
e^{X}
\right)
=
E(G \odot \bar{V}) E^*
</math>
where <math>\bar{V} = E^* V E</math>, the operator <math>\odot</math> denotes the Hadamard product, and, for all <math>1 \leq i, j \leq n</math>, the matrix <math>G</math> is defined as
<math display="block">
G_{i, j} = \left\{\begin{align}
& \frac{e^{\lambda_i} - e^{\lambda_j}}{\lambda_i - \lambda_j} & \text{ if } i \neq j,\\
& e^{\lambda_i} & \text{ otherwise}.\\
\end{align}\right.
</math>
In addition, for any <math>n \times n</math> Hermitian matrix <math>U</math>, the second directional derivative in directions  <math>U</math> and <math>V</math> is<ref name="deledalle"/>
<math display="block">
D^2 \exp (X) [U, V]
\triangleq
\lim_{\epsilon_u \to 0} 
\lim_{\epsilon_v \to 0} 
\frac{1}{4 \epsilon_u \epsilon_v}
\left(\displaystyle
e^{X + \epsilon_u U + \epsilon_v V}
-
e^{X - \epsilon_u U + \epsilon_v V}
-
e^{X + \epsilon_u U - \epsilon_v V}
+
e^{X - \epsilon_u U - \epsilon_v V}
\right)
=
E F(U, V) E^*
</math>
where the matrix-valued function <math>F</math> is defined, for all <math>1 \leq i, j \leq n</math>, as
<math display="block">
F(U, V)_{i,j} = \sum_{k=1}^n \phi_{i,j,k}(\bar{U}_{ik}\bar{V}_{jk}^* + \bar{V}_{ik}\bar{U}_{jk}^*)
</math>
with
<math display="block">
\phi_{i,j,k} = \left\{\begin{align}
& \frac{G_{ik} - G_{jk}}{\lambda_i - \lambda_j} & \text{ if } i \ne j,\\
& \frac{G_{ii} - G_{ik}}{\lambda_i - \lambda_k} & \text{ if } i = j \text{ and } k \ne i,\\
& \frac{G_{ii}}{2} & \text{ if } i = j = k.\\
\end{align}\right.
</math>

== Computing the matrix exponential ==

Finding reliable and accurate methods to compute the matrix exponential is difficult, and this is still a topic of considerable current research in mathematics and numerical analysis. [[Matlab]], [[GNU Octave]], [[R (programming_language)|R]], and [[SciPy]] all use the [[Padé approximant]].<ref>{{cite web|url=http://www.mathworks.de/help/techdoc/ref/expm.html |title=Matrix exponential – MATLAB expm – MathWorks Deutschland |publisher=Mathworks.de |date=2011-04-30 |access-date=2013-06-05}}</ref><ref>{{cite web |url=http://www.network-theory.co.uk/docs/octave3/octave_200.html |title=GNU Octave – Functions of a Matrix |publisher=Network-theory.co.uk |date=2007-01-11 |access-date=2013-06-05 |archive-url=https://web.archive.org/web/20150529200317/http://www.network-theory.co.uk/docs/octave3/octave_200.html |archive-date=2015-05-29 |url-status=dead }}</ref><ref>{{cite web| url=https://search.r-project.org/CRAN/refmans/Matrix/html/expm.html|title=R - pkg {Matrix}: Matrix Exponential |date=2005-02-28 |access-date=2023-07-17}}</ref><ref>{{cite web| url=http://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.linalg.expm.html |title=scipy.linalg.expm function documentation |publisher=The SciPy Community |date=2015-01-18 |access-date=2015-05-29}}</ref> In this section, we discuss methods that are applicable in principle to any matrix, and which can be carried out explicitly for small matrices.<ref>See {{harvnb|Hall|2015}} Section 2.2</ref> Subsequent sections describe methods suitable for numerical evaluation on large matrices.

=== Diagonalizable case ===

If a matrix is [[diagonal matrix|diagonal]]:
<math display="block">A = \begin{bmatrix}
a_1 & 0 & \cdots & 0 \\
0 & a_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & a_n
\end{bmatrix} ,</math>
then its exponential can be obtained by exponentiating each entry on the main diagonal:
<math display="block">e^A = \begin{bmatrix}
e^{a_1} & 0 & \cdots & 0 \\
0 & e^{a_2} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & e^{a_n}
\end{bmatrix} .</math>

This result also allows one to exponentiate [[diagonalizable matrix|diagonalizable matrices]]. If
{{block indent|em=1.2|text={{math|1=''A'' = ''UDU''<sup>−1</sup>}} }}
then
{{block indent|em=1.2|text={{math|1=''e''<sup>''A''</sup> = ''Ue''<sup>''D''</sup>''U''<sup>−1</sup>}},}}
which is especially easy to compute when {{math|''D''}} is diagonal.

Application of [[Sylvester's formula]] yields the same result.  (To see this, note that addition and multiplication, hence also exponentiation, of diagonal matrices is equivalent to element-wise addition and multiplication, and hence exponentiation; in particular, the "one-dimensional" exponentiation is felt element-wise for the diagonal case.)

===Example : Diagonalizable===
For example, the matrix
<math display="block"> A = \begin{bmatrix}
1 & 4\\
1 & 1\\
\end{bmatrix}</math>
can be diagonalized as
<math display="block">\begin{bmatrix}
-2 & 2\\
1 & 1\\
\end{bmatrix}\begin{bmatrix}
-1 & 0\\
0 & 3\\
\end{bmatrix}\begin{bmatrix}
-2 & 2\\
1 & 1\\
\end{bmatrix}^{-1}.</math>

Thus,
<math display="block">e^A = \begin{bmatrix}
-2 & 2\\
1 & 1\\
\end{bmatrix}e^\begin{bmatrix}
-1 & 0\\
0 & 3\\
\end{bmatrix}\begin{bmatrix}
-2 & 2\\
1 & 1\\
\end{bmatrix}^{-1}=\begin{bmatrix}
-2 & 2\\
1 & 1\\
\end{bmatrix}\begin{bmatrix}
\frac{1}{e} & 0\\
0 & e^3\\
\end{bmatrix}\begin{bmatrix}
-2 & 2\\
1 & 1\\
\end{bmatrix}^{-1} = \begin{bmatrix}
\frac{e^4+1}{2e} & \frac{e^4-1}{e}\\
\frac{e^4-1}{4e} & \frac{e^4+1}{2e}\\
\end{bmatrix}.</math>

=== Nilpotent case ===
A matrix {{mvar|N}} is [[nilpotent matrix|nilpotent]] if {{math|1=''N''<sup>''q''</sup> = 0}} for some integer ''q''. In this case, the matrix exponential {{math|''e''<sup>''N''</sup>}} can be computed directly from the series expansion, as the series terminates after a finite number of terms:

<math display="block">e^N = I + N + \frac{1}{2}N^2 + \frac{1}{6}N^3 + \cdots + \frac{1}{(q - 1)!}N^{q-1} ~.</math>

Since the series has a finite number of steps, it is a matrix polynomial, which can be [[Polynomial evaluation#Matrix polynomials|computed efficiently]].

=== General case ===
==== Using the Jordan–Chevalley decomposition ====
By the [[Jordan–Chevalley decomposition]], any <math>n \times n</math> matrix ''X'' with complex entries can be expressed as
<math display="block">X = A + N </math>
where
* ''A'' is diagonalizable
* ''N'' is nilpotent
* ''A'' [[commutativity|commutes]] with ''N''

This means that we can compute the exponential of ''X'' by reducing to the previous two cases:
<math display="block">e^X = e^{A+N} = e^A e^N. </math>

Note that we need the commutativity of ''A'' and ''N'' for the last step to work.

==== Using the Jordan canonical form ====
A closely related method is, if the field is [[algebraically closed]], to work with the [[Jordan form]] of {{mvar|X}}. Suppose that {{math|1=''X'' = ''PJP''{{i sup|−1}}}} where {{mvar|J}} is the Jordan form of {{mvar|X}}. Then
<math display="block">e^{X} = Pe^{J}P^{-1}.</math>

Also, since
<math display="block">\begin{align}
    J &= J_{a_1}(\lambda_1) \oplus J_{a_2}(\lambda_2) \oplus \cdots \oplus J_{a_n}(\lambda_n), \\
  e^J &= \exp \big( J_{a_1}(\lambda_1) \oplus J_{a_2}(\lambda_2) \oplus \cdots \oplus J_{a_n}(\lambda_n) \big) \\
      &= \exp \big( J_{a_1}(\lambda_1) \big) \oplus \exp \big( J_{a_2}(\lambda_2) \big) \oplus \cdots \oplus \exp \big( J_{a_n}(\lambda_n) \big).
\end{align}</math>

Therefore, we need only know how to compute the matrix exponential of a [[Jordan block]]. But each Jordan block is of the form
<math display="block">\begin{align}
  &            &     J_a(\lambda) &= \lambda I + N \\
  &\Rightarrow & e^{J_a(\lambda)} &= e^{\lambda I + N} = e^\lambda e^N.
\end{align}</math>

where {{mvar|N}} is a special nilpotent matrix. The matrix exponential of {{mvar|J}} is then given by
<math display="block">e^J = e^{\lambda_1} e^{N_{a_1}} \oplus e^{\lambda_2} e^{N_{a_2}} \oplus \cdots \oplus e^{\lambda_n} e^{N_{a_n}}</math>

=== Projection case ===
If {{mvar|P}} is a [[projection matrix]] (i.e. is [[idempotent]]: {{math|1=''P''<sup>2</sup> = ''P''}}), its matrix exponential is:
{{block indent|em=1.2|text= {{math|1=''e''<sup>''P''</sup> = ''I'' + (''e'' − 1)''P''}}.}}

Deriving this by expansion of the exponential function, each power of {{mvar|P}} reduces to {{mvar|P}} which becomes a common factor of the sum:
<math display="block">e^P = \sum_{k=0}^{\infty} \frac{P^k}{k!} = I + \left(\sum_{k=1}^{\infty} \frac{1}{k!}\right)P = I + (e - 1)P ~.</math>

=== Rotation case ===
For a simple rotation in which the perpendicular unit vectors {{Math|'''a'''}} and {{Math|'''b'''}} specify a plane,<ref>in a Euclidean space</ref> the [[Rotation matrix#Exponential map|rotation matrix]] {{mvar|R}} can be expressed in terms of a similar exponential function involving a [[Euler's rotation theorem#Generators of rotations|generator]] {{mvar|G}} and angle {{mvar|θ}}.<ref>{{cite book|last=Weyl|first=Hermann|url=https://books.google.com/books?id=KCgZAQAAIAAJ&pg=PA142 |title=Space Time Matter| year=1952|publisher=Dover|isbn=978-0-486-60267-7|page=142}}</ref><ref>{{cite book|last1=Bjorken|first1=James D.| last2=Drell| first2=Sidney D.|title=Relativistic Quantum Mechanics|url=https://archive.org/details/relativisticquan0000bjor|url-access=registration | publisher=McGraw-Hill|year=1964|page=[https://archive.org/details/relativisticquan0000bjor/page/22 22]}}</ref>
<math display="block">\begin{align}
  G &= \mathbf{ba}^\mathsf{T} - \mathbf{ab}^\mathsf{T} & P &= -G^2 = \mathbf{aa}^\mathsf{T} + \mathbf{bb}^\mathsf{T} \\
  P^2 &= P & PG &= G = GP ~,
\end{align}</math>
<math display="block">\begin{align}
  R\left( \theta \right) = e^{G\theta}
    &= I + G\sin (\theta) + G^2(1 - \cos(\theta)) \\
    &= I - P + P\cos (\theta) + G\sin (\theta ) ~.\\
\end{align}</math>

The formula for the exponential results from reducing the powers of {{mvar|G}} in the series expansion and identifying the respective series coefficients of {{math|''G<sup>2</sup>''}} and {{mvar|G}} with {{math|−cos(''θ'')}} and {{math|sin(''θ'')}} respectively. The second expression here for {{math|''e<sup>Gθ</sup>''}} is the same as the expression for {{math|''R''(''θ'')}} in the article containing the derivation of the [[Euler's rotation theorem#Generators of rotations|generator]], {{math|1=''R''(''θ'') = ''e<sup>Gθ</sup>''}}.

In two dimensions, if <math>a = \left[\begin{smallmatrix} 1 \\ 0 \end{smallmatrix}\right]</math> and <math>b = \left[ \begin{smallmatrix} 0 \\ 1 \end{smallmatrix} \right]</math>, then <math>G = \left[ \begin{smallmatrix} 0 & -1 \\ 1 & 0\end{smallmatrix} \right]</math>, <math>G^2 = \left[ \begin{smallmatrix}-1 & 0 \\ 0 & -1\end{smallmatrix} \right]</math>, and
<math display="block">R(\theta) = \begin{bmatrix}\cos(\theta) & -\sin(\theta)\\ \sin(\theta) & \cos(\theta)\end{bmatrix} = I \cos(\theta) + G \sin(\theta)</math>
reduces to the standard matrix for a plane rotation.

The matrix {{math|1=''P'' = −''G''<sup>2</sup>}} [[Projection (linear algebra)|projects]] a vector onto the {{math|ab}}-plane and the rotation only affects this part of the vector. An example illustrating this is a rotation of {{math|1=30° = π/6}} in the plane spanned by {{math|'''a'''}} and {{math|'''b'''}},

<math display="block">\begin{align}
  \mathbf{a} &= \begin{bmatrix} 1 \\ 0 \\ 0 \\ \end{bmatrix} &
  \mathbf{b} &= \frac{1}{\sqrt{5}}\begin{bmatrix} 0 \\ 1 \\ 2 \\ \end{bmatrix}
\end{align}</math>
<math display="block">\begin{align}
  G = \frac{1}{\sqrt{5}}&\begin{bmatrix}
     0 & -1 & -2 \\
     1 &  0 &  0 \\
     2 &  0 &  0 \\
  \end{bmatrix} &
  P = -G^2 &= \frac{1}{5}\begin{bmatrix}
     5 & 0 & 0 \\
     0 & 1 & 2 \\
     0 & 2 & 4 \\
  \end{bmatrix} \\
  P\begin{bmatrix} 1 \\ 2 \\ 3 \\ \end{bmatrix} =
    \frac{1}{5}&\begin{bmatrix} 5 \\ 8 \\ 16 \\ \end{bmatrix} =
    \mathbf{a} + \frac{8}{\sqrt{5}}\mathbf{b} &
  R\left(\frac{\pi}{6}\right) &= \frac{1}{10}\begin{bmatrix}
     5\sqrt{3} &      -\sqrt{5} &     -2\sqrt{5} \\
      \sqrt{5} &   8 + \sqrt{3} & -4 + 2\sqrt{3} \\
     2\sqrt{5} & -4 + 2\sqrt{3} &  2 + 4\sqrt{3} \\
  \end{bmatrix} \\
\end{align}</math>

Let {{math|1=''N'' = ''I'' - ''P''}}, so {{math|1=''N''<sup>2</sup> = ''N''}} and its products with {{math|''P''}} and {{math|''G''}} are zero. This will allow us to evaluate powers of {{math|''R''}}.

<math display="block">\begin{align}
       R\left( \frac{\pi}{6} \right) &= N + P\frac{\sqrt{3}}{2} + G\frac{1}{2} \\
     R\left( \frac{\pi}{6} \right)^2 &= N + P\frac{1}{2} + G\frac{\sqrt{3}}{2} \\
     R\left( \frac{\pi}{6} \right)^3 &= N + G \\
     R\left( \frac{\pi}{6} \right)^6 &= N - P \\
  R\left( \frac{\pi}{6} \right)^{12} &= N + P = I \\
\end{align}</math>

{{further|Rodrigues' rotation formula|Axis–angle representation#Exponential map from so(3) to SO(3)}}

== Evaluation by Laurent series ==
By virtue of the [[Cayley–Hamilton theorem]] the matrix exponential is expressible as a polynomial of order {{mvar|n}}−1.

If {{mvar|P}} and {{math|''Q<sub>t</sub>''}} are nonzero polynomials in one variable, such that {{math|1=''P''(''A'') = 0}}, and if the [[meromorphic function]]
<math display="block">f(z)=\frac{e^{t z}-Q_t(z)}{P(z)}</math>
is [[entire function|entire]], then
<math display="block">e^{t A} = Q_t(A).</math>
To prove this, multiply the first of the two above equalities by {{math|''P''(''z'')}} and replace {{mvar|z}} by {{mvar|A}}.

Such a polynomial {{math|''Q<sub>t</sub>''(''z'')}} can be found as follows−see [[Sylvester's formula]]. Letting {{mvar|a}} be a root of {{mvar|P}}, {{math|''Q<sub>a,t</sub>''(''z'')}} is solved from the product of {{mvar|P}} by the [[Laurent series#Principal part|principal part]] of the [[Laurent series]] of {{mvar|f}} at {{mvar|a}}: It is proportional to the relevant [[Frobenius covariant]]. Then the sum ''S<sub>t</sub>'' of the ''Q<sub>a,t</sub>'', where {{mvar|a}} runs over all the roots of {{mvar|P}}, can be taken as a particular {{math|''Q<sub>t</sub>''}}. All the other ''Q<sub>t</sub>'' will be obtained by adding a multiple of {{mvar|P}} to {{math|''S<sub>t</sub>''(''z'')}}. In particular, {{math|''S<sub>t</sub>''(''z'')}}, the [[Sylvester's formula|Lagrange-Sylvester polynomial]], is the only {{math|''Q<sub>t</sub>''}} whose degree is less than that of {{mvar|P}}.

'''Example''': Consider the case of an arbitrary {{math|2 × 2}} matrix,
<math display="block">A := \begin{bmatrix}
  a & b \\
  c & d
\end{bmatrix}.</math>

The exponential matrix {{math|e<sup>''tA''</sup>}}, by virtue of the [[Cayley–Hamilton theorem]], must be of the form
<math display="block">e^{tA} = s_0(t)\, I + s_1(t)\,A.</math>

(For any complex number {{mvar|z}} and any '''''C'''''-algebra {{mvar|B}}, we denote again by {{mvar|z}} the product of {{mvar|z}} by the unit of {{mvar|B}}.)

Let {{mvar|α}} and {{mvar|β}} be the roots of the [[characteristic polynomial]] of {{mvar|A}},
<math display="block">P(z) = z^2 - (a + d)\ z + ad - bc = (z - \alpha)(z - \beta) ~ .</math>

Then we have
<math display="block">S_t(z) = e^{\alpha t} \frac{z - \beta}{\alpha - \beta} + e^{\beta t} \frac{z - \alpha}{\beta - \alpha}~,</math>
hence
<math display="block">\begin{align}
  s_0(t) &= \frac{\alpha\,e^{\beta t} - \beta\,e^{\alpha t}}{\alpha - \beta}, &
  s_1(t) &= \frac{e^{\alpha t} - e^{\beta t}}{\alpha - \beta}
\end{align}</math>

if {{math|''α'' ≠ ''β''}}; while, if {{math|1=''α'' = ''β''}},
<math display="block">S_t(z) = e^{\alpha t} (1 + t (z - \alpha)) ~,</math>

so that
<math display="block">\begin{align}
  s_0(t) &= (1 - \alpha\,t)\,e^{\alpha t},&
  s_1(t) &= t\,e^{\alpha t}~.
\end{align}</math>

Defining
<math display="block">\begin{align}
  s &\equiv \frac{\alpha + \beta}{2} = \frac{\operatorname{tr} A}{2}~, &
  q &\equiv \frac{\alpha - \beta}{2} = \pm\sqrt{-\det\left(A - sI\right)},
\end{align}</math>

we have
<math display="block">\begin{align}
  s_0(t) &= e^{st}\left(\cosh(qt) - s\frac{\sinh(qt)}{q}\right), &
  s_1(t) &= e^{st}\frac{\sinh(qt)}{q},
\end{align}</math>

where {{math|sin(''qt'')/''q''}} is 0 if {{math|1=''t'' = 0}}, and {{mvar|t}} if {{math|1=''q'' = 0}}.

Thus,
{{Equation box 1
|indent =
|equation = <math>e^{tA}=e^{st}\left(\left(\cosh(qt) - s\frac{\sinh(qt)}{q}\right)~I~ + \frac{\sinh(qt)}{q} A\right) ~.</math>
|cellpadding= 6
|border
|border colour = #0073CF
|bgcolor=#000000
}}

Thus, as indicated above, the matrix {{mvar|A}} having decomposed into the sum of two mutually commuting pieces, the traceful piece and the traceless piece,
<math display="block">A = sI + (A-sI)~,</math>

the matrix exponential reduces to a plain product of the exponentials of the two respective pieces. This is a formula often used in physics, as it amounts to the analog of [[Euler's formula]] for [[Pauli spin matrices#Exponential of a Pauli vector|Pauli spin matrices]], that is rotations of the doublet representation of the group [[SU(2)]].

The polynomial {{math|''S<sub>t</sub>''}} can also be given the following "[[interpolation]]" characterization. Define {{math|''e<sub>t</sub>''(''z'') ≡ ''e<sup>tz</sup>''}}, and {{math|''n'' ≡ deg ''P''}}. Then {{math|''S<sub>t</sub>''(''z'')}} is the unique degree {{math|< ''n''}} polynomial which satisfies {{math|1=''S''<sub>''t''</sub><sup>(''k'')</sup>(''a'') = ''e''<sub>''t''</sub><sup>(''k'')</sup>(''a'')}} whenever {{mvar|k}} is less than the multiplicity of {{mvar|a}} as a root of {{mvar|P}}. We assume, as we obviously can, that {{mvar|P}} is the [[Minimal polynomial (linear algebra)|minimal polynomial]] of {{mvar|A}}. We further assume that {{mvar|A}} is a [[diagonalizable matrix]]. In particular, the roots of {{mvar|P}} are simple, and the "[[interpolation]]" characterization indicates that {{math|''S<sub>t</sub>''}} is given by the [[Lagrange interpolation]] formula, so it is the [[Sylvester's formula|Lagrange−Sylvester polynomial]].

At the other extreme, if {{math|1=''P'' = (''z'' - ''a'')<sup>''n''</sup>}}, then
<math display="block">S_t = e^{at}\ \sum_{k=0}^{n-1}\ \frac{t^k}{k!}\ (z - a)^k ~.</math>

The simplest case not covered by the above observations is when <math>P = (z - a)^2\,(z - b)</math> with {{math|''a'' ≠ ''b''}}, which yields
<math display="block">S_t = e^{at}\ \frac{z - b}{a - b}\ \left(1 + \left(t + \frac{1}{b - a}\right)(z - a)\right) + e^{bt}\ \frac{(z - a)^2}{(b - a)^2}.</math>

== Evaluation by implementation of [[Sylvester's formula]] ==
A practical, expedited computation of the above reduces to the following rapid steps. Recall from above that an {{math|''n'' × ''n''}} matrix {{math|exp(''tA'')}} amounts to a linear combination of the first {{mvar|n}}−1 powers of {{mvar|A}} by the [[Cayley–Hamilton theorem]]. For [[diagonalizable matrix|diagonalizable]] matrices, as illustrated above, e.g. in the {{math|2 × 2}} case, [[Sylvester's formula]] yields {{math|1=exp(''tA'') = ''B<sub>α</sub>'' exp(''tα'') + ''B<sub>β</sub>'' exp(''tβ'')}}, where the {{mvar|B}}s are the [[Frobenius covariant]]s of {{mvar|A}}.

It is easiest, however, to simply solve for these {{mvar|B}}s directly, by evaluating this expression and its first derivative at {{math|1=''t'' = 0}}, in terms of {{mvar|A}} and {{mvar|I}}, to find the same answer as above.

But this simple procedure also works for [[defective matrix|defective]] matrices, in a generalization due to Buchheim.<ref>Rinehart, R. F. (1955). "[https://www.jstor.org/stable/2306996 The equivalence of definitions of a matric function]". ''The American Mathematical Monthly'', '''62''' (6), 395-414.</ref> This is illustrated here for a {{math|4 × 4}} example of a matrix which is ''not diagonalizable'', and the {{mvar|B}}s are not projection matrices.

Consider
<math display="block">A = \begin{bmatrix}
  1 & 1 &           0 &            0 \\
  0 & 1 &           1 &            0 \\
  0 & 0 &           1 & -\frac{1}{8} \\
  0 & 0 & \frac{1}{2} &  \frac{1}{2}
\end{bmatrix} ~,</math>
with eigenvalues {{math|1=''λ''<sub>1</sub> = 3/4}} and {{math|1=''λ''<sub>2</sub> = 1}}, each with a multiplicity of two.

Consider the exponential of each eigenvalue multiplied by {{mvar|t}}, {{math|exp(''λ<sub>i</sub>t'')}}. Multiply each exponentiated eigenvalue by the corresponding undetermined coefficient matrix {{math|''B''<sub>''i''</sub>}}.  If the eigenvalues have an algebraic multiplicity greater than 1, then repeat the process, but now multiplying by an extra factor of {{mvar|t}} for each repetition, to ensure linear independence.

(If one eigenvalue had a multiplicity of three, then there would be the three terms: <math>B_{i_1} e^{\lambda_i t}, ~ B_{i_2} t e^{\lambda_i t}, ~ B_{i_3} t^2 e^{\lambda_i t} </math>. By contrast, when all eigenvalues are distinct, the {{mvar|B}}s are just the [[Frobenius covariant]]s, and solving for them as below just amounts to the inversion of the [[Vandermonde matrix]] of these 4 eigenvalues.)

Sum all such terms, here four such,
<math display="block">\begin{align}
  e^{At} &= B_{1_1} e^{\lambda_1 t} + B_{1_2} t e^{\lambda_1 t} + B_{2_1} e^{\lambda_2 t} + B_{2_2} t e^{\lambda_2 t} , \\
  e^{At} &= B_{1_1} e^{\frac{3}{4} t} + B_{1_2} t e^{\frac{3}{4} t} + B_{2_1} e^{1 t} + B_{2_2} t e^{1 t} ~.
\end{align}</math>

To solve for all of the unknown matrices {{mvar|B}} in terms of the first three powers of {{mvar|A}} and the identity, one needs four equations, the above one providing one such at {{mvar|t}} = 0. Further, differentiate it with respect to {{mvar|t}},
<math display="block">A e^{A t} = \frac{3}{4} B_{1_1} e^{\frac{3}{4} t} + \left( \frac{3}{4} t + 1 \right) B_{1_2} e^{\frac{3}{4} t} + 1 B_{2_1} e^{1 t} + \left(1 t + 1 \right) B_{2_2} e^{1 t} ~,</math>

and again,
<math display="block">\begin{align}
  A^2 e^{At} &= \left(\frac{3}{4}\right)^2 B_{1_1} e^{\frac{3}{4} t} + \left( \left(\frac{3}{4}\right)^2 t + \left( \frac{3}{4} + 1 \cdot \frac{3}{4}\right) \right) B_{1_2} e^{\frac{3}{4} t} + B_{2_1} e^{1 t} + \left(1^2 t + (1 + 1 \cdot 1 )\right) B_{2_2} e^{1 t} \\
             &= \left(\frac{3}{4}\right)^2 B_{1_1} e^{\frac{3}{4} t} + \left( \left(\frac{3}{4}\right)^2 t + \frac{3}{2} \right) B_{1_2} e^{\frac{3}{4} t} + B_{2_1} e^{t} + \left(t + 2\right) B_{2_2} e^{t} ~,
\end{align}</math>

and once more,
<math display="block">\begin{align}
  A^3 e^{At} &= \left(\frac{3}{4}\right)^3 B_{1_1} e^{\frac{3}{4} t} + \left( \left(\frac{3}{4}\right)^3 t + \left( \left(\frac{3}{4}\right)^2 + \left(\frac{3}{2}\right) \cdot \frac{3}{4}\right) \right) B_{1_2} e^{\frac{3}{4} t} + B_{2_1} e^{1 t} + \left(1^3 t + (1 + 2) \cdot 1 \right) B_{2_2} e^{1 t} \\
             &= \left(\frac{3}{4}\right)^3 B_{1_1} e^{\frac{3}{4} t}\! + \left( \left(\frac{3}{4}\right)^3 t\! + \frac{27}{16} \right) B_{1_2} e^{\frac{3}{4} t}\! + B_{2_1} e^{t}\! + \left(t + 3\cdot 1\right) B_{2_2} e^{t} ~.
\end{align}</math>

(In the general case, {{mvar|n}}−1 derivatives need be taken.)

Setting {{mvar|t}} = 0 in these four equations, the four coefficient matrices {{mvar|B}}s may now be solved for,
<math display="block">\begin{align}
I &= B_{1_1} + B_{2_1} \\
A &= \frac{3}{4} B_{1_1} + B_{1_2} + B_{2_1} + B_{2_2} \\
A^2 &= \left(\frac{3}{4}\right)^2 B_{1_1} + \frac{3}{2} B_{1_2} + B_{2_1} + 2 B_{2_2} \\
A^3 &= \left(\frac{3}{4}\right)^3 B_{1_1} + \frac{27}{16} B_{1_2} + B_{2_1} + 3 B_{2_2} ~,
\end{align} </math>

to yield
<math display="block">\begin{align}
B_{1_1} &=  128 A^3 - 366 A^2 + 288 A - 80 I \\
B_{1_2} &=   16 A^3 -  44 A^2 +  40 A - 12 I \\
B_{2_1} &= -128 A^3 + 366 A^2 - 288 A + 80 I \\
B_{2_2} &=   16 A^3 -  40 A^2 +  33 A -  9 I ~.
\end{align}</math>

Substituting with the value for {{mvar|A}} yields the coefficient matrices
<math display="block">\begin{align}
B_{1_1} &= \begin{bmatrix}0 & 0 &  48 & -16\\ 0 & 0 & -8 & 2\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\end{bmatrix}\\
B_{1_2} &= \begin{bmatrix}0 & 0 &   4 &  -2\\ 0 & 0 & -1 & \frac{1}{2}\\ 0 & 0 & \frac{1}{4} & -\frac{1}{8}\\ 0 & 0 & \frac{1}{2} & -\frac{1}{4} \end{bmatrix}\\
B_{2_1} &= \begin{bmatrix}1 & 0 & -48 &  16\\ 0 & 1 & 8 & -2\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\end{bmatrix}\\
B_{2_2} &= \begin{bmatrix}0 & 1 &   8 &  -2\\ 0 & 0 & 0 &  0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\end{bmatrix}
\end{align}</math>

so the final answer is
<math display="block">e^{tA} = \begin{bmatrix}
    e^t & te^t & \left(8t - 48\right) e^t\! + \left(4t + 48\right)e^{\frac{3}{4}t} & \left(16 - 2\,t\right)e^t\! + \left(-2t - 16\right)e^{\frac{3}{4}t}\\
    0 & e^t & 8e^t\! + \left(-t - 8\right) e^{\frac{3}{4}t} & -2e^t + \frac{t + 4}{2}e^{\frac{3}{4}t}\\
    0 & 0 & \frac{t + 4}{4}e^{\frac{3}{4}t} & -\frac{t}{8}e^{\frac{3}{4}t}\\
    0 & 0 & \frac{t}{2}e^{\frac{3}{4}t} & -\frac{t - 4}{4}e^{\frac{3}{4}t} ~.
  \end{bmatrix}
</math>

The procedure is much shorter than [[Matrix differential equation#Putzer Algorithm for computing eAt|Putzer's algorithm]] sometimes utilized in such cases.

{{See also|Derivative of the exponential map}}

== Illustrations ==

Suppose that we want to compute the exponential of
<math display="block">B = \begin{bmatrix}
  21 & 17 &  6 \\
  -5 & -1 & -6 \\
   4 &  4 & 16
\end{bmatrix}.</math>

Its [[Jordan normal form|Jordan form]] is
<math display="block">J = P^{-1}BP = \begin{bmatrix}
  4 &  0 &  0 \\
  0 & 16 &  1 \\
  0 &  0 & 16
\end{bmatrix},</math>
where the matrix ''P'' is given by
<math display="block">P = \begin{bmatrix}
  -\frac14 &  2 &  \frac54 \\
   \frac14 & -2 & -\frac14 \\
         0 &  4 &        0
\end{bmatrix}.</math>

Let us first calculate exp(''J''). We have
<math display="block">J = J_1(4) \oplus J_2(16) </math>

The exponential of a {{math|1 × 1}} matrix is just the exponential of the one entry of the matrix, so {{math|1=exp(''J''<sub>1</sub>(4)) = [''e''<sup>4</sup>]}}. The exponential of ''J''<sub>2</sub>(16) can be calculated by the formula {{math|1=''e''<sup>(λ''I'' + ''N'')</sup> = ''e''<sup>''λ''</sup> ''e''<sup>N</sup>}} mentioned above; this yields<ref>This can be generalized; in general, the exponential of {{math|''J''<sub>''n''</sub>(''a'')}} is an upper triangular matrix with {{math|''e''<sup>''a''</sup>/0!}} on the main diagonal, {{math|''e''<sup>''a''</sup>/1!}} on the one above, {{math|''e''<sup>''a''</sup>/2!}} on the next one, and so on.</ref>

<math display="block">\begin{align}
        &\exp \left( \begin{bmatrix} 16 & 1 \\ 0 & 16 \end{bmatrix} \right)
    = e^{16} \exp \left( \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \right) = \\[6pt]
  {}={} &e^{16} \left(\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} + {1 \over 2!}\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} + \cdots {} \right)
    = \begin{bmatrix} e^{16} & e^{16} \\ 0 & e^{16} \end{bmatrix}.
\end{align}
</math>

Therefore, the exponential of the original matrix {{mvar|B}} is
<math display="block">\begin{align}
  \exp(B)
  &= P \exp(J) P^{-1}
   = P \begin{bmatrix} e^4 & 0 & 0 \\ 0 & e^{16} & e^{16} \\ 0 & 0 & e^{16} \end{bmatrix} P^{-1} \\[6pt]
  &= {1 \over 4} \begin{bmatrix}
    13e^{16} - e^4 & 13e^{16} - 5e^4 &  2e^{16} - 2e^4 \\
    -9e^{16} + e^4 & -9e^{16} + 5e^4 & -2e^{16} + 2e^4 \\
    16e^{16}       & 16e^{16}        &  4e^{16}
  \end{bmatrix}.
\end{align}</math>

== Applications ==

=== Linear differential equations ===

The matrix exponential has applications to systems of [[linear differential equation]]s.  (See also [[matrix differential equation]].)  Recall from earlier in this article that a ''homogeneous'' differential equation of the form
<math display="block"> \mathbf{y}' = A\mathbf{y} </math>
has solution {{math|''e''<sup>''At''</sup> '''y'''(0)}}.

If we consider the vector
<math display="block"> \mathbf{y}(t) = \begin{bmatrix} y_1(t) \\ \vdots \\y_n(t) \end{bmatrix}   ~,</math>
we can express a system of ''inhomogeneous'' coupled linear differential equations as
<math display="block"> \mathbf{y}'(t) = A\mathbf{y}(t)+\mathbf{b}(t).</math>
Making an [[ansatz]] to use an integrating factor of {{math|''e''<sup>−''At''</sup>}} and multiplying throughout, yields
<math display="block">\begin{align}
  &            &       e^{-At}\mathbf{y}'-e^{-At}A\mathbf{y} &= e^{-At}\mathbf{b} \\
  &\Rightarrow &       e^{-At}\mathbf{y}'-Ae^{-At}\mathbf{y} &= e^{-At}\mathbf{b} \\
  &\Rightarrow & \frac{d}{dt} \left(e^{-At}\mathbf{y}\right) &= e^{-At}\mathbf{b}~.
\end{align}</math>

The second step is possible due to the fact that, if {{math|1=''AB'' = ''BA''}}, then {{math|1=''e''<sup>''At''</sup>''B'' = ''Be''<sup>''At''</sup>}}. So, calculating {{math|''e''<sup>''At''</sup>}} leads to the solution to the system, by simply integrating the third step with respect to {{mvar|t}}.

A solution to this can be obtained by integrating and multiplying by <math>e^{\textbf{A}t}</math> to eliminate the exponent in the LHS. Notice that while <math>e^{\textbf{A}t}</math> is a matrix, given that it is a matrix exponential, we can say that <math>e^{\textbf{A}t} e^{-\textbf{A}t} = I</math>. In other words, <math>\exp{\textbf{A}t} = \exp{{(-\textbf{A}t)}^{-1}}</math>.

==== Example (homogeneous) ====
Consider the system
<math display="block">\begin{matrix}
  x' &=& 2x & -y &  +z \\
  y' &=&    & 3y & -1z \\
  z' &=& 2x & +y & +3z
\end{matrix}~.</math>

The associated [[defective matrix]] is
<math display="block">A = \begin{bmatrix}
  2 & -1 &  1 \\
  0 &  3 & -1 \\
  2 &  1 &  3
\end{bmatrix}~.</math>

The matrix exponential is
<math display="block">e^{tA} = \frac{1}{2}\begin{bmatrix}
    e^{2t}\left( 1 + e^{2t} - 2t\right) & -2te^{2t}       &  e^{2t}\left(-1 + e^{2t}\right) \\
   -e^{2t}\left(-1 + e^{2t} - 2t\right) &  2(t + 1)e^{2t} & -e^{2t}\left(-1 + e^{2t}\right) \\
    e^{2t}\left(-1 + e^{2t} + 2t\right) &  2te^{2t}       &  e^{2t}\left( 1 + e^{2t}\right)
\end{bmatrix}~,</math>

so that the general solution of the homogeneous system is
<math display="block">\begin{bmatrix}x \\y \\ z\end{bmatrix} =
    \frac{x(0)}{2}\begin{bmatrix}e^{2t}\left(1 + e^{2t} - 2t\right) \\ -e^{2t}\left(-1 + e^{2t} - 2t\right) \\ e^{2t}\left(-1 + e^{2t} + 2t\right)\end{bmatrix}
  + \frac{y(0)}{2}\begin{bmatrix}-2te^{2t} \\ 2(t + 1)e^{2t} \\ 2te^{2t}\end{bmatrix}
  + \frac{z(0)}{2}\begin{bmatrix}e^{2t}\left(-1 + e^{2t}\right) \\ -e^{2t}\left(-1 + e^{2t}\right) \\ e^{2t}\left(1 + e^{2t}\right)\end{bmatrix} ~,
</math>

amounting to
<math display="block">\begin{align}
  2x &= x(0)e^{2t}\left(1 + e^{2t} - 2t\right) + y(0)\left(-2te^{2t}\right) + z(0)e^{2t}\left(-1 + e^{2t}\right) \\[2pt]
  2y &= x(0)\left(-e^{2t}\right)\left(-1 + e^{2t} - 2t\right) + y(0)2(t + 1)e^{2t} + z(0)\left(-e^{2t}\right)\left(-1 + e^{2t}\right) \\[2pt]
  2z &= x(0)e^{2t}\left(-1 + e^{2t} + 2t\right) + y(0)2te^{2t} + z(0)e^{2t}\left(1 + e^{2t}\right) ~.
\end{align}</math>

==== Example (inhomogeneous) ====
Consider now the inhomogeneous system
<math display="block">\begin{matrix}
  x' &=& 2x & - & y & + & z & + & e^{2t} \\
  y' &=&    &   & 3y& - & z & \\
  z' &=& 2x & + & y & + & 3z & + & e^{2t}
\end{matrix} ~.</math>

We again have
<math display="block">A = \left[\begin{array}{rrr}
  2 & -1 &  1 \\
  0 &  3 & -1 \\
  2 &  1 &  3
\end{array}\right] ~,</math>

and
<math display="block">\mathbf{b} = e^{2t}\begin{bmatrix}1 \\0\\1\end{bmatrix}.</math>

From before, we already have the general solution to the homogeneous equation. Since the sum of the homogeneous and particular solutions give the general solution to the inhomogeneous problem, we now only need find the particular solution.

We have, by above,
<math display="block">\begin{align}
  \mathbf{y}_p
  &= e^{tA}\int_0^t e^{(-u)A}\begin{bmatrix}e^{2u} \\0\\e^{2u}\end{bmatrix}\,du+e^{tA}\mathbf{c} \\[6pt]
  &= e^{tA}\int_0^t \begin{bmatrix}
         2e^u - 2ue^{2u} & -2ue^{2u}    & 0 \\
    -2e^u + 2(u+1)e^{2u} & 2(u+1)e^{2u} & 0 \\
                2ue^{2u} & 2ue^{2u}     & 2e^u
  \end{bmatrix}\begin{bmatrix}e^{2u} \\0 \\e^{2u}\end{bmatrix}\,du + e^{tA}\mathbf{c} \\[6pt]
  &= e^{tA}\int_0^t \begin{bmatrix}
     e^{2u}\left( 2e^u - 2ue^{2u}\right) \\
     e^{2u}\left(-2e^u + 2(1 + u)e^{2u}\right) \\
    2e^{3u} + 2ue^{4u}
  \end{bmatrix}\,du + e^{tA}\mathbf{c} \\[6pt]
  &= e^{tA}\begin{bmatrix}
    -{1 \over 24}e^{3t}\left(3e^t(4t - 1) - 16\right) \\
     {1 \over 24}e^{3t}\left(3e^t(4t + 4) - 16\right) \\
     {1 \over 24}e^{3t}\left(3e^t(4t - 1) - 16\right)
  \end{bmatrix} + \begin{bmatrix}
           2e^t - 2te^{2t} &      -2te^{2t} & 0 \\
    -2e^t + 2(t + 1)e^{2t} & 2(t + 1)e^{2t} & 0 \\
                  2te^{2t} &       2te^{2t} & 2e^t
  \end{bmatrix}\begin{bmatrix}c_1 \\c_2 \\c_3\end{bmatrix} ~,
\end{align}</math>
which could be further simplified to get the requisite particular solution determined through variation of parameters.
Note '''c''' = '''y'''<sub>''p''</sub>(0). For more rigor, see the following generalization.

=== Inhomogeneous case generalization:  variation of parameters ===
For the inhomogeneous case, we can use [[integrating factor]]s (a method akin to [[variation of parameters]]). We seek a particular solution of the form {{math|1='''y'''<sub>p</sub>(''t'') = exp(''tA'') '''z'''(''t'')}},
<math display="block">\begin{align}
  \mathbf{y}_p'(t)
    & = \left(e^{tA}\right)'\mathbf{z}(t) + e^{tA}\mathbf{z}'(t) \\[6pt]
    & = Ae^{tA}\mathbf{z}(t) + e^{tA}\mathbf{z}'(t) \\[6pt]
    & = A\mathbf{y}_p(t) + e^{tA}\mathbf{z}'(t)~.
\end{align}</math>

For {{math|'''y'''<sub>''p''</sub>}} to be a solution,
<math display="block">\begin{align}
  e^{tA}\mathbf{z}'(t) &= \mathbf{b}(t) \\[6pt]
        \mathbf{z}'(t) &= \left(e^{tA}\right)^{-1}\mathbf{b}(t) \\[6pt]
         \mathbf{z}(t) &= \int_0^t e^{-uA}\mathbf{b}(u)\,du + \mathbf{c} ~.
\end{align}</math>

Thus,
<math display="block">\begin{align}
  \mathbf{y}_p(t)
    & = e^{tA}\int_0^t e^{-uA}\mathbf{b}(u)\,du + e^{tA}\mathbf{c} \\
    & = \int_0^t e^{(t - u)A}\mathbf{b}(u)\,du + e^{tA}\mathbf{c}~,
\end{align}</math>
where {{math|'''''c'''''}} is determined by the initial conditions of the problem.

More precisely, consider the equation
<math display="block">Y' - A\ Y = F(t)</math>

with the initial condition {{math|1=''Y''(''t''<sub>0</sub>) = ''Y''<sub>0</sub>}}, where
* {{mvar|A}} is an {{mvar|n}} by {{mvar|n}} complex matrix,
* {{mvar|F}} is a continuous function from some open interval {{mvar|I}} to {{math|'''C'''<sup>''n''</sup>}},
* <math>t_0</math> is a point of {{mvar|I}}, and
* <math>Y_0</math> is a vector of {{math|'''C'''<sup>''n''</sup>}}.

Left-multiplying the above displayed equality by {{math|''e<sup>−tA</sup>''}} yields
<math display="block">Y(t) = e^{(t - t_0)A}\ Y_0 + \int_{t_0}^t e^{(t - x)A}\ F(x)\ dx ~.</math>

We claim that the solution to the equation
<math display="block">P(d/dt)\ y = f(t)</math>

with the initial conditions <math>y^{(k)}(t_0) = y_k</math> for {{math|0 ≤ ''k'' < ''n''}} is
<math display="block">y(t) = \sum_{k=0}^{n-1}\ y_k\ s_k(t - t_0) + \int_{t_0}^t s_{n-1}(t - x)\ f(x)\ dx ~,</math>

where the notation is as follows:
* <math>P\in\mathbb{C}[X]</math> is a monic polynomial of degree {{math|''n'' > 0}},
* {{mvar|f}} is a continuous complex valued function defined on some open interval {{mvar|I}},
* <math>t_0</math> is a point of {{mvar|I}},
* <math>y_k</math> is a complex number, and

{{math|''s<sub>k</sub>''(''t'')}}  is the coefficient of <math>X^k</math> in the polynomial denoted by <math>S_t\in\mathbb{C}[X]</math> in Subsection [[matrix exponential#Evaluation by Laurent series|Evaluation by Laurent series]] above.

To justify this claim, we transform our order {{mvar|n}} scalar equation into an order one vector equation by the usual [[Ordinary differential equation#Reduction to a first-order system|reduction to a first order system]]. Our vector equation takes the form
<math display="block">\frac{dY}{dt} - A\ Y = F(t),\quad Y(t_0) = Y_0,</math>
where {{mvar|A}} is the [[transpose]] [[companion matrix]] of {{mvar|P}}. We solve this equation as explained above, computing the matrix exponentials by the observation made in Subsection [[matrix exponential#Evaluation by implementation of Sylvester's formula|Evaluation by implementation of Sylvester's formula]] above.

In the case {{mvar|n}} = 2 we get the following statement. The solution to
<math display="block">
  y'' - (\alpha + \beta)\ y' + \alpha\,\beta\ y = f(t),\quad
  y(t_0) = y_0,\quad
  y'(t_0) = y_1
</math>

is
<math display="block">y(t) = y_0\ s_0(t - t_0) + y_1\ s_1(t - t_0) + \int_{t_0}^t s_1(t - x)\,f(x)\ dx,</math>

where the functions {{math|''s''<sub>0</sub>}}  and {{math|''s''<sub>1</sub>}} are as in Subsection [[matrix exponential#Evaluation by Laurent series|Evaluation by Laurent series]] above.

== Matrix-matrix exponentials ==
The matrix exponential of another matrix (matrix-matrix exponential),<ref>{{cite web |url=http://www.rockefeller.edu/labheads/cohenje/PDFs/215BarrabasCohenalApp19941.pdf |author=Ignacio Barradas and Joel E. Cohen |date=1994 |title=Iterated Exponentiation, Matrix-Matrix Exponentiation, and Entropy |publisher=Academic Press, Inc. |url-status=dead |archive-url=https://web.archive.org/web/20090626134412/http://www.rockefeller.edu/labheads/cohenje/PDFs/215BarrabasCohenalApp19941.pdf |archive-date=2009-06-26 }}</ref> is defined as
<math display="block">X^Y = e^{\log(X) \cdot Y}</math>
<math display="block">^Y\!X = e^{Y \cdot \log(X)}</math>
for any [[normal matrix|normal]] and [[Algebraic curve#Singularities|non-singular]] {{math|''n'' × ''n''}} matrix {{mvar|X}}, and any complex {{math|''n'' × ''n''}} matrix {{mvar|Y}}.

For matrix-matrix exponentials, there is a distinction between the left exponential {{mvar|<sup>Y</sup>X}} and the right exponential {{mvar|X<sup>Y</sup>}}, because the multiplication operator for matrix-to-matrix is not [[commutative]]. Moreover,
* If {{mvar|X}} is normal and non-singular, then {{mvar|X<sup>Y</sup>}} and {{mvar|<sup>Y</sup>X}} have the same set of eigenvalues.
* If {{mvar|X}} is normal and non-singular, {{mvar|Y}} is normal, and {{math|1=''XY'' = ''YX''}}, then {{math|1=''X<sup>Y</sup>'' = ''<sup>Y</sup>X''}}.
* If {{mvar|X}} is normal and non-singular, and {{mvar|X}}, {{mvar|Y}}, {{mvar|Z}} commute with each other, then {{math|1=''X''<sup>''Y''+''Z''</sup> = ''X<sup>Y</sup>''·''X<sup>Z</sup>''}} and {{math|1=<sup>''Y''+''Z''</sup>''X'' = ''<sup>Y</sup>X''·''<sup>Z</sup>X''}}.

== See also ==
{{Div col|colwidth=20em}}
* [[Matrix function]]
* [[Matrix logarithm]]
* [[C0-semigroup|C<sub>0</sub>-semigroup]]
* [[Exponential function]]
* [[Exponential map (Lie theory)]]
* [[Magnus expansion]]
* [[Derivative of the exponential map]]
* [[Vector flow]]
* [[Golden–Thompson inequality]]
* [[Phase-type distribution]]
* [[Lie product formula]]
* [[Baker–Campbell–Hausdorff formula]]
* [[Frobenius covariant]]
* [[Sylvester's formula]]
* [[Trigonometric functions of matrices]]
{{Div col end}}

== References ==
{{Reflist}}
* {{Citation| last=Hall|first=Brian C.|title=Lie groups, Lie algebras, and representations: An elementary introduction| edition=2nd|series=Graduate Texts in Mathematics|volume=222|publisher=Springer|year=2015|isbn=978-3-319-13466-6}}
* {{Cite book | last1=Horn | first1=Roger A. | last2=Johnson | first2=Charles R. | title=Topics in Matrix Analysis | publisher=[[Cambridge University Press]] | isbn=978-0-521-46713-1 | year=1991 }}.
* {{Cite journal | last1=Moler | first1=Cleve | author1-link=Cleve Moler | last2=Van Loan | first2=Charles F. | author2-link=Charles F. Van Loan | title=Nineteen Dubious Ways to Compute the Exponential of a Matrix, Twenty-Five Years Later | year=2003 | journal=SIAM Review| issn=1095-7200 | volume=45 | issue=1 | pages=3–49 | url=https://www.cs.cornell.edu/cv/files/2021/10/19ways.pdf | doi=10.1137/S00361445024180 | bibcode=2003SIAMR..45....3M | citeseerx=10.1.1.129.9283 }}.
* {{cite journal|author=Suzuki, Masuo| year=1985|title=Decomposition formulas of exponential operators and Lie exponentials with some applications to quantum mechanics and statistical physics|journal=Journal of Mathematical Physics| volume=26| issue=4 |pages=601–612|doi=10.1063/1.526596| bibcode=1985JMP....26..601S}}
* {{Cite journal | doi = 10.3842/SIGMA.2014.084|last1 = Curtright|last2 = Fairlie|last3 = Zachos |first1 = T L |first2 = D B |first3 = C K|author-link=Thomas Curtright|author-link2=David Fairlie|author-link3=Cosmas Zachos|year = 2014|title = A compact formula for rotations as spin matrix polynomials| journal =Symmetry, Integrability and Geometry: Methods and Applications| volume=10 | page=084|bibcode = 2014SIGMA..10..084C|arxiv=1402.3541|s2cid = 18776942}}
* {{cite book|first=Alston S.|last=Householder|title=The Theory of Matrices in Numerical Analysis |publisher=Dover Books on Mathematics| year=2006|author-link=Alston Scott Householder | isbn=978-0-486-44972-2}}
* {{cite journal|last=Van Kortryk|first=T. S.|title=Matrix exponentials, SU(N) group elements, and real polynomial roots | year=2016 | journal=Journal of Mathematical Physics| volume=57| issue=2| pages=021701| doi=10.1063/1.4938418| bibcode=2016JMP....57b1701V| arxiv=1508.05859|s2cid=119647937}}

== External links ==
* {{mathworld|urlname=MatrixExponential|title=Matrix Exponential}}

{{Matrix classes}}

[[Category:Matrix theory]]
[[Category:Lie groups]]
[[Category:Exponentials]]