Cayley–Hamilton theorem
===Matrix functions===
Given an [[analytic function]] <math display="block">f(x) = \sum_{k=0}^\infty a_k x^k</math> and the characteristic polynomial {{math|''p''(''x'')}} of degree {{math|''n''}} of an {{math|''n'' × ''n''}} matrix {{mvar|A}}, the function can be expressed using long division as <math display="block">f(x) = q(x) p(x) + r(x),</math> where {{math|''q''(''x'')}} is some quotient polynomial and {{math|''r''(''x'')}} is a remainder polynomial such that {{math|0 ≤ deg ''r''(''x'') < ''n''}}. By the Cayley–Hamilton theorem, replacing {{mvar|x}} by the matrix {{mvar|A}} gives {{math|1=''p''(''A'') = 0}}, so one has <math display="block">f(A) = r(A).</math> Thus, the analytic function of the matrix {{mvar|A}} can be expressed as a matrix polynomial of degree less than {{mvar|n}}.

Let the remainder polynomial be <math display="block">r(x) = c_0 + c_1 x + \cdots + c_{n-1} x^{n-1}.</math> Since {{math|1=''p''(''λ'') = 0}}, evaluating the function {{math|''f''(''x'')}} at the {{math|''n''}} eigenvalues of {{math|''A''}} yields <math display="block"> f(\lambda_i) = r(\lambda_i) = c_0 + c_1 \lambda_i + \cdots + c_{n-1} \lambda_i^{n-1}, \qquad \text{for } i = 1, 2, \ldots, n.</math> This amounts to a system of {{math|''n''}} [[linear equation]]s, which can be solved to determine the coefficients {{math|''c<sub>i</sub>''}}. Thus, one has <math display="block">f(A) = \sum_{k=0}^{n-1} c_k A^k.</math>

When the eigenvalues are repeated, that is, {{math|1=''λ<sub>i</sub>'' = ''λ<sub>j</sub>''}} for some {{math|''i'' ≠ ''j''}}, two or more equations are identical, and hence the system cannot be solved uniquely. For such cases, for an eigenvalue {{math|''λ''}} with multiplicity {{math|''m''}}, the first {{math|''m'' − 1}} derivatives of {{math|''p''(''x'')}} vanish at that eigenvalue.
This leads to the extra {{math|''m'' − 1}} linearly independent equations <math display="block">\left.\frac{\mathrm{d}^k f(x)}{\mathrm{d}x^k}\right|_{x=\lambda} = \left.\frac{\mathrm{d}^k r(x)}{\mathrm{d}x^k}\right|_{x=\lambda} \qquad \text{for } k = 1, 2, \ldots, m-1,</math> which, combined with the others, yield the required {{math|''n''}} equations to solve for the coefficients {{math|''c<sub>i</sub>''}}.

Finding a polynomial that passes through the points {{math|(''λ<sub>i</sub>'', ''f''(''λ<sub>i</sub>''))}} is essentially an [[polynomial interpolation|interpolation problem]], and can be solved using [[Lagrange interpolation|Lagrange]] or [[Newton polynomial|Newton interpolation]] techniques, leading to [[Sylvester's formula]].

For example, suppose the task is to find the polynomial representation of <math display="block">f(A) = e^{At} \qquad \text{where} \qquad A = \begin{pmatrix}1&2\\0&3\end{pmatrix}.</math> The characteristic polynomial is {{math|1=''p''(''x'') = (''x'' − 1)(''x'' − 3) = ''x''<sup>2</sup> − 4''x'' + 3}}, and the eigenvalues are {{math|1=''λ'' = 1, 3}}. Let {{math|1=''r''(''x'') = ''c''<sub>0</sub> + ''c''<sub>1</sub>''x''}}. Evaluating {{math|1=''f''(''λ'') = ''r''(''λ'')}} at the eigenvalues, one obtains two linear equations, {{math|1=''e''<sup>''t''</sup> = ''c''<sub>0</sub> + ''c''<sub>1</sub>}} and {{math|1=''e''<sup>3''t''</sup> = ''c''<sub>0</sub> + 3''c''<sub>1</sub>}}. Solving these yields {{math|1=''c''<sub>0</sub> = (3''e''<sup>''t''</sup> − ''e''<sup>3''t''</sup>)/2}} and {{math|1=''c''<sub>1</sub> = (''e''<sup>3''t''</sup> − ''e''<sup>''t''</sup>)/2}}. Thus, it follows that <math display="block">e^{At} = c_0 I_2 + c_1 A = \begin{pmatrix}c_0 + c_1 & 2 c_1\\ 0 & c_0 + 3 c_1\end{pmatrix} = \begin{pmatrix}e^{t} & e^{3t} - e^{t} \\ 0 & e^{3t}\end{pmatrix}.</math>

If, instead, the function were {{math|1=''f''(''A'') = sin ''At''}}, then the coefficients would have been {{math|1=''c''<sub>0</sub> = (3 sin ''t'' − sin 3''t'')/2}} and {{math|1=''c''<sub>1</sub> = (sin 3''t'' − sin ''t'')/2}}; hence <math display="block">\sin(At) = c_0 I_2 + c_1 A = \begin{pmatrix}\sin t & \sin 3t - \sin t \\ 0 & \sin 3t\end{pmatrix}.</math>

As a further example, when considering <math display="block">f(A) = e^{At} \qquad \text{where} \qquad A = \begin{pmatrix}0 & 1\\-1 & 0\end{pmatrix},</math> the characteristic polynomial is {{math|1=''p''(''x'') = ''x''<sup>2</sup> + 1}}, and the eigenvalues are {{math|1=''λ'' = ±''i''}}. As before, evaluating the function at the eigenvalues gives the linear equations {{math|1=''e''<sup>''it''</sup> = ''c''<sub>0</sub> + ''ic''<sub>1</sub>}} and {{math|1=''e''<sup>−''it''</sup> = ''c''<sub>0</sub> − ''ic''<sub>1</sub>}}, whose solution is {{math|1=''c''<sub>0</sub> = (''e''<sup>''it''</sup> + ''e''<sup>−''it''</sup>)/2 = cos ''t''}} and {{math|1=''c''<sub>1</sub> = (''e''<sup>''it''</sup> − ''e''<sup>−''it''</sup>)/2''i'' = sin ''t''}}. Thus, for this case, <math display="block">e^{At} = (\cos t) I_2 + (\sin t) A = \begin{pmatrix}\cos t & \sin t\\ -\sin t & \cos t \end{pmatrix},</math> which is a [[rotation matrix]].

A standard example of such usage is the [[exponential map (Lie theory)|exponential map]] from the [[Lie algebra]] of a [[matrix Lie group]] into the group.
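The procedure can be checked numerically. The sketch below (assuming NumPy is available; the function names are illustrative, not from any particular library) computes the coefficients {{math|''c<sub>k</sub>''}} by solving the Vandermonde system {{math|1=''f''(''λ<sub>i</sub>'') = ''r''(''λ<sub>i</sub>'')}} for a matrix with distinct eigenvalues, and compares the resulting matrix polynomial against a partial sum of the exponential series for both worked examples above:

```python
import numpy as np

def f_of_matrix(f, A):
    """Evaluate an analytic function f on a matrix A with distinct
    eigenvalues via the Cayley-Hamilton reduction f(A) = r(A), deg r < n.
    The coefficients of r solve the linear system f(lam_i) = r(lam_i)."""
    lam = np.linalg.eigvals(A)              # eigenvalues lambda_i
    V = np.vander(lam, increasing=True)     # rows: 1, lam_i, lam_i^2, ...
    c = np.linalg.solve(V, f(lam))          # coefficients c_0 ... c_{n-1}
    n = A.shape[0]
    result = np.zeros_like(A, dtype=complex)
    power = np.eye(n, dtype=complex)        # accumulates A^k
    for ck in c:                            # r(A) = c_0 I + c_1 A + ...
        result += ck * power
        power = power @ A
    return result

def expm_series(A, terms=60):
    """Partial sum of the matrix exponential series, for comparison."""
    total = np.eye(A.shape[0], dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ A / k
        total += term
    return total

t = 0.7
A1 = np.array([[1.0, 2.0], [0.0, 3.0]])     # first worked example
A2 = np.array([[0.0, 1.0], [-1.0, 0.0]])    # rotation generator

E1 = f_of_matrix(np.exp, t * A1)
E2 = f_of_matrix(np.exp, t * A2)
print(np.allclose(E1, expm_series(t * A1)))                 # True
print(np.allclose(E2.real, [[np.cos(t), np.sin(t)],
                            [-np.sin(t), np.cos(t)]]))      # True
```

For {{math|''A''<sub>2</sub>}} the eigenvalues {{math|±''i''}} make the intermediate coefficients complex, but the result reduces to the real rotation matrix, as in the derivation above.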
It is given by a [[matrix exponential]], <math display="block">\exp: \mathfrak g \rightarrow G; \qquad tX \mapsto e^{tX} = \sum_{n=0}^\infty \frac{t^nX^n}{n!} = I + tX + \frac{t^2X^2}{2} + \cdots, t \in \mathbb R, X \in \mathfrak g .</math> Such expressions have long been known for {{math|SU(2)}}, <math display="block">e^{i(\theta/2)(\hat\mathbf n \cdot \sigma)} = I_2 \cos \frac \theta 2 + i (\hat\mathbf n \cdot \sigma) \sin \frac \theta 2,</math> where the {{mvar|σ}} are the [[Pauli matrices]] and for {{math|SO(3)}}, <math display="block">e^{i\theta(\hat\mathbf n \cdot \mathbf J)} = I_3 + i(\hat\mathbf n \cdot \mathbf J) \sin \theta + (\hat\mathbf n \cdot \mathbf J)^2 (\cos \theta - 1),</math> which is [[Rodrigues' rotation formula]]. For the notation, see [[3D rotation group#A note on Lie algebras]]. More recently, expressions have appeared for other groups, like the [[Lorentz group]] {{math|SO(3, 1)}},<ref>{{harvnb|Zeni|Rodrigues|1992}}</ref> {{math|O(4, 2)}}<ref>{{harvnb|Barut|Zeni|Laufer|1994a}}</ref> and {{math|SU(2, 2)}},<ref>{{harvnb|Barut|Zeni|Laufer|1994b}}</ref> as well as {{math|GL(''n'', '''R''')}}.<ref>{{harvnb|Laufer|1997}}</ref> The group {{math|O(4, 2)}} is the [[conformal group]] of [[spacetime]], {{math|SU(2, 2)}} its [[simply connected]] cover (to be precise, the simply connected cover of the [[connected component (topology)|connected component]] {{math|SO<sup>+</sup>(4, 2)}} of {{math|O(4, 2)}}). The expressions obtained apply to the standard representation of these groups. They require knowledge of (some of) the [[eigenvalue]]s of the matrix to exponentiate. For {{math|SU(2)}} (and hence for {{math|SO(3)}}), closed expressions have been obtained for ''all'' irreducible representations, i.e. of any spin.<ref>{{harvnb|Curtright|Fairlie|Zachos|2014}}</ref> [[File:GeorgFrobenius.jpg|220px|thumb|right|[[Ferdinand Georg Frobenius]] (1849–1917), German mathematician. 
His main interests were [[elliptic function]]s, [[differential equation]]s, and later [[group theory]].<br/>In 1878 he gave the first full proof of the Cayley–Hamilton theorem.<ref name="Frobenius 1878"/>]]
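Rodrigues' rotation formula can likewise be verified numerically. The sketch below (an illustration, assuming NumPy) works with the real form {{math|1=''e''<sup>''θK''</sup> = ''I'' + ''K'' sin ''θ'' + ''K''<sup>2</sup>(1 − cos ''θ'')}}, where {{math|1=''K'' = ''i''('''n̂''' · '''J''')}} is the skew-symmetric cross-product matrix of the unit axis; this is equivalent to the {{math|SO(3)}} expression above, and follows from Cayley–Hamilton because {{math|''K''}} has characteristic polynomial {{math|''x''<sup>3</sup> + ''x''}}:

```python
import numpy as np

def cross_matrix(n):
    """Skew-symmetric K with K @ v = n x v; equals i(n.J) in the
    article's notation, so e^{theta K} is the rotation by theta about n."""
    n1, n2, n3 = n
    return np.array([[0.0, -n3, n2],
                     [n3, 0.0, -n1],
                     [-n2, n1, 0.0]])

def rodrigues(n, theta):
    """Closed form: K satisfies K^3 = -K (unit axis), so every power of K
    reduces to a combination of I, K, K^2."""
    K = cross_matrix(n)
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def expm_series(A, terms=60):
    """Partial sum of the matrix exponential series, for comparison."""
    total, term = np.eye(3), np.eye(3)
    for k in range(1, terms):
        term = term @ A / k
        total += term
    return total

n = np.array([1.0, 2.0, 2.0]) / 3.0          # unit rotation axis
theta = 1.2
R = rodrigues(n, theta)
print(np.allclose(R, expm_series(theta * cross_matrix(n))))  # True
print(np.allclose(R @ R.T, np.eye(3)))       # True: R is orthogonal
```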