=== A proof using matrices of endomorphisms ===
As was mentioned above, the matrix ''p''(''A'') in the statement of the theorem is obtained by first evaluating the determinant and then substituting the matrix ''A'' for ''t''; doing that substitution into the matrix <math>t I_n - A</math> before evaluating the determinant is not meaningful. Nevertheless, it is possible to give an interpretation where {{math|''p''(''A'')}} is obtained directly as the value of a certain determinant, but this requires a more complicated setting, one of matrices over a ring in which one can interpret both the entries <math>A_{i,j}</math> of {{math|''A''}} and all of {{math|''A''}} itself. One could take for this the ring {{math|''M''(''n'', ''R'')}} of {{math|''n'' × ''n''}} matrices over {{math|''R''}}, where the entry <math>A_{i,j}</math> is realised as <math>A_{i,j} I_n</math>, and {{math|''A''}} as itself. But considering matrices with matrices as entries might cause confusion with [[block matrix|block matrices]], which is not intended here, as that gives the wrong notion of determinant (recall that the determinant of a matrix is defined as a sum of products of its entries, and in the case of a block matrix this is generally not the same as the corresponding sum of products of its blocks!). It is clearer to distinguish {{math|''A''}} from the [[endomorphism]] {{math|''φ''}} of an {{mvar|n}}-[[dimension (vector space)|dimensional]] [[vector space]] ''V'' (or [[free module|free {{math|''R''}}-module]] if {{math|''R''}} is not a field) that {{math|''A''}} defines in a basis <math>e_1, \ldots, e_n</math>, and to take matrices over the ring End(''V'') of all such endomorphisms. Then {{math|''φ'' ∈ End(''V'')}} is a possible matrix entry, while {{mvar|A}} designates the element of {{math|''M''(''n'', End(''V''))}} whose {{math|''i'', ''j''}} entry is the endomorphism of scalar multiplication by <math>A_{i,j}</math>; similarly <math>I_n</math> will be interpreted as an element of {{math|''M''(''n'', End(''V''))}}. However, since {{math|End(''V'')}} is not a commutative ring, no determinant is defined on {{math|''M''(''n'', End(''V''))}}; this can only be done for matrices over a commutative subring of {{math|End(''V'')}}. Now the entries of the matrix <math>\varphi I_n-A</math> all lie in the subring {{math|''R''[''φ'']}} generated by the identity and {{math|''φ''}}, which is commutative. Then a determinant map {{math|''M''(''n'', ''R''[''φ'']) → ''R''[''φ'']}} is defined, and <math>\det(\varphi I_n - A)</math> evaluates to the value {{math|''p''(''φ'')}} of the characteristic polynomial of {{math|''A''}} at {{math|''φ''}} (this holds independently of the relation between {{math|''A''}} and {{math|''φ''}}); the Cayley–Hamilton theorem states that {{math|''p''(''φ'')}} is the null endomorphism. In this form, the following proof can be obtained from that of {{harvtxt|Atiyah|MacDonald|1969|loc=Prop. 2.4}} (which is in fact a more general statement related to the [[Nakayama lemma]]; one takes for the [[ideal (ring theory)|ideal]] in that proposition the whole ring {{math|''R''}}).
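For instance, in the minimal case {{math|1=''n'' = 2}} (a small worked illustration of this setting, not needed for the proof below), the entries of <math>\varphi I_2 - A</math> are the endomorphisms <math>\varphi - A_{1,1}</math>, <math>-A_{1,2}</math>, <math>-A_{2,1}</math> and <math>\varphi - A_{2,2}</math>, where each scalar <math>A_{i,j}</math> stands for the corresponding scalar multiplication; all of these lie in the commutative subring {{math|''R''[''φ'']}}, so the determinant may be expanded there: <math display="block">\det(\varphi I_2 - A) = (\varphi - A_{1,1})(\varphi - A_{2,2}) - A_{1,2}A_{2,1} = \varphi^2 - (A_{1,1} + A_{2,2})\varphi + (A_{1,1}A_{2,2} - A_{1,2}A_{2,1})\operatorname{id}_V = p(\varphi),</math> which is indeed the characteristic polynomial of {{math|''A''}} evaluated at {{math|''φ''}}.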
The fact that {{math|''A''}} is the matrix of {{math|''φ''}} in the basis {{math|''e''<sub>1</sub>, ..., ''e''<sub>''n''</sub>}} means that <math display="block">\varphi(e_i) = \sum_{j = 1}^n A_{j,i} e_j \quad\text{for }i=1,\ldots,n.</math> One can interpret these as the {{math|''n''}} components of one equation in {{math|''V''{{i sup|''n''}}}}, whose two members can be written using the matrix-vector product {{math|''M''(''n'', End(''V'')) × ''V''{{i sup|''n''}} → ''V''{{i sup|''n''}}}} that is defined as usual, but with individual entries {{math|''ψ'' ∈ End(''V'')}} and {{math|''v''}} in {{math|''V''}} being "multiplied" by forming <math>\psi(v)</math>; this gives: <math display="block">\varphi I_n \cdot E = A^\operatorname{tr}\cdot E,</math> where <math>E\in V^n</math> is the element whose component {{math|''i''}} is {{math|''e''<sub>''i''</sub>}} (in other words it is the basis {{math|''e''<sub>1</sub>, ..., ''e''<sub>''n''</sub>}} of {{math|''V''}} written as a column of vectors). Writing this equation as <math display="block">(\varphi I_n-A^\operatorname{tr})\cdot E = 0\in V^n</math> one recognizes the [[transpose]] of the matrix <math>\varphi I_n-A</math> considered above, and its determinant (computed in {{math|''M''(''n'', ''R''[''φ''])}}) is also ''p''(''φ''). To derive from this equation that {{math|1=''p''(''φ'') = 0 ∈ End(''V'')}}, one left-multiplies by the [[adjugate matrix]] of <math>\varphi I_n-A^\operatorname{tr}</math>, which is defined in the matrix ring {{math|''M''(''n'', ''R''[''φ''])}}, giving <math display="block">\begin{align} 0 &= \operatorname{adj}(\varphi I_n-A^\operatorname{tr}) \cdot \left((\varphi I_n-A^\operatorname{tr})\cdot E\right) \\[1ex] &= \left(\operatorname{adj}(\varphi I_n-A^\operatorname{tr}) \cdot (\varphi I_n-A^\operatorname{tr})\right) \cdot E \\[1ex] &= \left(\det(\varphi I_n - A^\operatorname{tr})I_n\right) \cdot E \\[1ex] &= (p(\varphi)I_n)\cdot E; \end{align}</math> the [[associativity]] of matrix-matrix and matrix-vector multiplication used in the first step is a purely formal property of those operations, independent of the nature of the entries. Now component {{math|''i''}} of this equation says that {{math|1=''p''(''φ'')(''e<sub>i</sub>'') = 0 ∈ ''V''}}; thus {{math|''p''(''φ'')}} vanishes on all the {{math|''e''<sub>''i''</sub>}}, and since these elements generate {{math|''V''}} it follows that {{math|1=''p''(''φ'') = 0 ∈ End(''V'')}}, completing the proof.

One additional fact that follows from this proof is that the matrix {{math|''A''}} whose characteristic polynomial is taken need not be identical to the value {{math|''φ''}} substituted into that polynomial; it suffices that {{math|''φ''}} be an endomorphism of {{math|''V''}} satisfying the initial equations <math display="block">\varphi(e_i) = \sum_j A_{j,i} e_j</math> for ''some'' sequence of elements {{math|''e''<sub>1</sub>, ..., ''e''<sub>''n''</sub>}} that generate {{math|''V''}} (a space which might have dimension smaller than {{mvar|n}}, or, if the ring {{math|''R''}} is not a field, might not be a [[free module]] at all).
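As a concrete check of this argument in the case {{math|1=''n'' = 2}} (continuing the illustration above, and added only as a sketch), the adjugate appearing in the key step is <math display="block">\operatorname{adj}(\varphi I_2 - A^\operatorname{tr}) = \begin{pmatrix}\varphi - A_{2,2} & A_{2,1}\\ A_{1,2} & \varphi - A_{1,1}\end{pmatrix},</math> and multiplying it against <math>\varphi I_2 - A^\operatorname{tr}</math> does give <math>\det(\varphi I_2 - A^\operatorname{tr})\,I_2 = p(\varphi) I_2</math>, so that applying the displayed chain of equalities to <math>E = (e_1, e_2)</math> yields {{math|1=''p''(''φ'')(''e''<sub>1</sub>) = ''p''(''φ'')(''e''<sub>2</sub>) = 0}}, as the proof asserts.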