Cayley–Hamilton theorem
===Matrix functions===
Given an [[analytic function]] <math display="block">f(x) = \sum_{k=0}^\infty a_k x^k</math> and the characteristic polynomial {{math|''p''(''x'')}} of degree {{math|''n''}} of an {{math|''n'' × ''n''}} matrix {{mvar|A}}, the function can be expressed using long division as <math display="block">f(x) = q(x) p(x) + r(x),</math> where {{math|''q''(''x'')}} is some quotient polynomial and {{math|''r''(''x'')}} is a remainder polynomial such that {{math|0 ≤ deg ''r''(''x'') < ''n''}}. By the Cayley–Hamilton theorem, replacing {{mvar|x}} by the matrix {{mvar|A}} gives {{math|1=''p''(''A'') = 0}}, so one has <math display="block">f(A) = r(A).</math> Thus, the analytic function of the matrix {{mvar|A}} can be expressed as a matrix polynomial of degree less than {{mvar|n}}.

Let the remainder polynomial be <math display="block">r(x) = c_0 + c_1 x + \cdots + c_{n-1} x^{n-1}.</math> Since {{math|1=''p''(''λ'') = 0}}, evaluating the function {{math|''f''(''x'')}} at the {{math|''n''}} eigenvalues of {{math|''A''}} yields <math display="block"> f(\lambda_i) = r(\lambda_i) = c_0 + c_1 \lambda_i + \cdots + c_{n-1} \lambda_i^{n-1}, \qquad \text{for } i = 1, 2, \ldots, n.</math> This amounts to a system of {{math|''n''}} [[linear equation]]s, which can be solved to determine the coefficients {{math|''c<sub>i</sub>''}}. Thus, one has <math display="block">f(A) = \sum_{k=0}^{n-1} c_k A^k.</math>

When the eigenvalues are repeated, that is, {{math|1=''λ<sub>i</sub>'' = ''λ<sub>j</sub>''}} for some {{math|''i'' ≠ ''j''}}, two or more equations are identical, and hence the system cannot be solved uniquely. For such cases, for an eigenvalue {{math|''λ''}} with multiplicity {{math|''m''}}, the first {{math|''m'' − 1}} derivatives of {{math|''p''(''x'')}} vanish at that eigenvalue.
This leads to the extra {{math|''m'' − 1}} linearly independent equations <math display="block">\left.\frac{\mathrm{d}^k f(x)}{\mathrm{d}x^k}\right|_{x=\lambda} = \left.\frac{\mathrm{d}^k r(x)}{\mathrm{d}x^k}\right|_{x=\lambda} \qquad \text{for } k = 1, 2, \ldots, m-1,</math> which, combined with the others, yield the required {{math|''n''}} equations to solve for the coefficients {{math|''c<sub>i</sub>''}}.

Finding a polynomial that passes through the points {{math|(''λ<sub>i</sub>'', ''f''(''λ<sub>i</sub>''))}} is essentially an [[polynomial interpolation|interpolation problem]], and can be solved using [[Lagrange interpolation|Lagrange]] or [[Newton polynomial|Newton interpolation]] techniques, leading to [[Sylvester's formula]].

For example, suppose the task is to find the polynomial representation of <math display="block">f(A) = e^{At} \qquad \text{where} \qquad A = \begin{pmatrix}1&2\\0&3\end{pmatrix}.</math> The characteristic polynomial is {{math|1=''p''(''x'') = (''x'' − 1)(''x'' − 3) = ''x''<sup>2</sup> − 4''x'' + 3}}, and the eigenvalues are {{math|1=''λ'' = 1, 3}}. Let {{math|1=''r''(''x'') = ''c''<sub>0</sub> + ''c''<sub>1</sub>''x''}}. Evaluating {{math|1=''f''(''λ'') = ''r''(''λ'')}} at the eigenvalues, one obtains two linear equations, {{math|1=''e''<sup>''t''</sup> = ''c''<sub>0</sub> + ''c''<sub>1</sub>}} and {{math|1=''e''<sup>3''t''</sup> = ''c''<sub>0</sub> + 3''c''<sub>1</sub>}}. Solving these yields {{math|1=''c''<sub>0</sub> = (3''e''<sup>''t''</sup> − ''e''<sup>3''t''</sup>)/2}} and {{math|1=''c''<sub>1</sub> = (''e''<sup>3''t''</sup> − ''e''<sup>''t''</sup>)/2}}. Thus, it follows that <math display="block">e^{At} = c_0 I_2 + c_1 A = \begin{pmatrix}c_0 + c_1 & 2 c_1\\ 0 & c_0 + 3 c_1\end{pmatrix} = \begin{pmatrix}e^{t} & e^{3t} - e^{t} \\ 0 & e^{3t}\end{pmatrix}.</math>

If, instead, the function were {{math|1=''f''(''A'') = sin ''At''}}, then the coefficients would have been {{math|1=''c''<sub>0</sub> = (3 sin ''t'' − sin 3''t'')/2}} and {{math|1=''c''<sub>1</sub> = (sin 3''t'' − sin ''t'')/2}}; hence <math display="block">\sin(At) = c_0 I_2 + c_1 A = \begin{pmatrix}\sin t & \sin 3t - \sin t \\ 0 & \sin 3t\end{pmatrix}.</math>

As a further example, when considering <math display="block">f(A) = e^{At} \qquad \text{where} \qquad A = \begin{pmatrix}0 & 1\\-1 & 0\end{pmatrix},</math> the characteristic polynomial is {{math|1=''p''(''x'') = ''x''<sup>2</sup> + 1}}, and the eigenvalues are {{math|1=''λ'' = ±''i''}}. As before, evaluating the function at the eigenvalues gives the linear equations {{math|1=''e''<sup>''it''</sup> = ''c''<sub>0</sub> + ''ic''<sub>1</sub>}} and {{math|1=''e''<sup>−''it''</sup> = ''c''<sub>0</sub> − ''ic''<sub>1</sub>}}, whose solution is {{math|1=''c''<sub>0</sub> = (''e''<sup>''it''</sup> + ''e''<sup>−''it''</sup>)/2 = cos ''t''}} and {{math|1=''c''<sub>1</sub> = (''e''<sup>''it''</sup> − ''e''<sup>−''it''</sup>)/2''i'' = sin ''t''}}. Thus, for this case, <math display="block">e^{At} = (\cos t) I_2 + (\sin t) A = \begin{pmatrix}\cos t & \sin t\\ -\sin t & \cos t \end{pmatrix},</math> which is a [[rotation matrix]].

A standard example of such usage is the [[exponential map (Lie theory)|exponential map]] from the [[Lie algebra]] of a [[matrix Lie group]] into the group.
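The procedure can be checked numerically. The sketch below (assuming NumPy is available; the function names are illustrative, not from any particular library) computes the coefficients {{math|''c<sub>k</sub>''}} by solving the Vandermonde system {{math|1=''f''(''λ<sub>i</sub>'') = ''r''(''λ<sub>i</sub>'')}} for a matrix with distinct eigenvalues, and compares the resulting matrix polynomial against a partial sum of the exponential series for both worked examples above:

```python
import numpy as np

def f_of_matrix(f, A):
    """Evaluate an analytic function f on a matrix A with distinct
    eigenvalues via the Cayley-Hamilton reduction f(A) = r(A), deg r < n.
    The coefficients of r solve the linear system f(lam_i) = r(lam_i)."""
    lam = np.linalg.eigvals(A)              # eigenvalues lambda_i
    V = np.vander(lam, increasing=True)     # rows: 1, lam_i, lam_i^2, ...
    c = np.linalg.solve(V, f(lam))          # coefficients c_0 ... c_{n-1}
    n = A.shape[0]
    result = np.zeros_like(A, dtype=complex)
    power = np.eye(n, dtype=complex)        # accumulates A^k
    for ck in c:                            # r(A) = c_0 I + c_1 A + ...
        result += ck * power
        power = power @ A
    return result

def expm_series(A, terms=60):
    """Partial sum of the matrix exponential series, for comparison."""
    total = np.eye(A.shape[0], dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ A / k
        total += term
    return total

t = 0.7
A1 = np.array([[1.0, 2.0], [0.0, 3.0]])     # first worked example
A2 = np.array([[0.0, 1.0], [-1.0, 0.0]])    # rotation generator

E1 = f_of_matrix(np.exp, t * A1)
E2 = f_of_matrix(np.exp, t * A2)
print(np.allclose(E1, expm_series(t * A1)))                 # True
print(np.allclose(E2.real, [[np.cos(t), np.sin(t)],
                            [-np.sin(t), np.cos(t)]]))      # True
```

For {{math|''A''<sub>2</sub>}} the eigenvalues {{math|±''i''}} make the intermediate coefficients complex, but the result reduces to the real rotation matrix, as in the derivation above.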
It is given by a [[matrix exponential]], <math display="block">\exp: \mathfrak g \rightarrow G; \qquad tX \mapsto e^{tX} = \sum_{n=0}^\infty \frac{t^nX^n}{n!} = I + tX + \frac{t^2X^2}{2} + \cdots, t \in \mathbb R, X \in \mathfrak g .</math> Such expressions have long been known for {{math|SU(2)}}, <math display="block">e^{i(\theta/2)(\hat\mathbf n \cdot \sigma)} = I_2 \cos \frac \theta 2 + i (\hat\mathbf n \cdot \sigma) \sin \frac \theta 2,</math> where the {{mvar|σ}} are the [[Pauli matrices]] and for {{math|SO(3)}}, <math display="block">e^{i\theta(\hat\mathbf n \cdot \mathbf J)} = I_3 + i(\hat\mathbf n \cdot \mathbf J) \sin \theta + (\hat\mathbf n \cdot \mathbf J)^2 (\cos \theta - 1),</math> which is [[Rodrigues' rotation formula]]. For the notation, see [[3D rotation group#A note on Lie algebras]]. More recently, expressions have appeared for other groups, like the [[Lorentz group]] {{math|SO(3, 1)}},<ref>{{harvnb|Zeni|Rodrigues|1992}}</ref> {{math|O(4, 2)}}<ref>{{harvnb|Barut|Zeni|Laufer|1994a}}</ref> and {{math|SU(2, 2)}},<ref>{{harvnb|Barut|Zeni|Laufer|1994b}}</ref> as well as {{math|GL(''n'', '''R''')}}.<ref>{{harvnb|Laufer|1997}}</ref> The group {{math|O(4, 2)}} is the [[conformal group]] of [[spacetime]], {{math|SU(2, 2)}} its [[simply connected]] cover (to be precise, the simply connected cover of the [[connected component (topology)|connected component]] {{math|SO<sup>+</sup>(4, 2)}} of {{math|O(4, 2)}}). The expressions obtained apply to the standard representation of these groups. They require knowledge of (some of) the [[eigenvalue]]s of the matrix to exponentiate. For {{math|SU(2)}} (and hence for {{math|SO(3)}}), closed expressions have been obtained for ''all'' irreducible representations, i.e. of any spin.<ref>{{harvnb|Curtright|Fairlie|Zachos|2014}}</ref> [[File:GeorgFrobenius.jpg|220px|thumb|right|[[Ferdinand Georg Frobenius]] (1849–1917), German mathematician. 
His main interests were [[elliptic function]]s, [[differential equation]]s, and later [[group theory]].<br/>In 1878 he gave the first full proof of the Cayley–Hamilton theorem.<ref name="Frobenius 1878"/>]]
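Rodrigues' rotation formula can likewise be verified numerically. The sketch below (an illustration, assuming NumPy) works with the real form {{math|1=''e''<sup>''θK''</sup> = ''I'' + ''K'' sin ''θ'' + ''K''<sup>2</sup>(1 − cos ''θ'')}}, where {{math|1=''K'' = ''i''('''n̂''' · '''J''')}} is the skew-symmetric cross-product matrix of the unit axis; this is equivalent to the {{math|SO(3)}} expression above, and follows from Cayley–Hamilton because {{math|''K''}} has characteristic polynomial {{math|''x''<sup>3</sup> + ''x''}}:

```python
import numpy as np

def cross_matrix(n):
    """Skew-symmetric K with K @ v = n x v; equals i(n.J) in the
    article's notation, so e^{theta K} is the rotation by theta about n."""
    n1, n2, n3 = n
    return np.array([[0.0, -n3, n2],
                     [n3, 0.0, -n1],
                     [-n2, n1, 0.0]])

def rodrigues(n, theta):
    """Closed form: K satisfies K^3 = -K (unit axis), so every power of K
    reduces to a combination of I, K, K^2."""
    K = cross_matrix(n)
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def expm_series(A, terms=60):
    """Partial sum of the matrix exponential series, for comparison."""
    total, term = np.eye(3), np.eye(3)
    for k in range(1, terms):
        term = term @ A / k
        total += term
    return total

n = np.array([1.0, 2.0, 2.0]) / 3.0          # unit rotation axis
theta = 1.2
R = rodrigues(n, theta)
print(np.allclose(R, expm_series(theta * cross_matrix(n))))  # True
print(np.allclose(R @ R.T, np.eye(3)))       # True: R is orthogonal
```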