{{Use American English|date = April 2019}}
{{short description|Polynomial whose roots are the eigenvalues of a matrix}}
{{About|the characteristic polynomial of a matrix or of an endomorphism of vector spaces|the characteristic polynomial of a matroid|Matroid|that of a graded poset|Graded poset}}

In [[linear algebra]], the '''characteristic polynomial''' of a [[square matrix]] is a [[polynomial]] which is [[Invariant (mathematics)|invariant]] under [[matrix similarity]] and has the [[eigenvalues]] as [[Root of a polynomial|roots]]. It has the [[determinant]] and the [[Trace (linear algebra)|trace]] of the matrix among its coefficients. The '''characteristic polynomial''' of an [[endomorphism]] of a finite-dimensional [[vector space]] is the characteristic polynomial of the matrix of that endomorphism over any basis (that is, the characteristic polynomial does not depend on the choice of a [[Basis (linear algebra)|basis]]).

The '''characteristic equation''', also known as the '''determinantal equation''',<ref>{{cite book |last=Guillemin |first=Ernst |title=Introductory Circuit Theory |author-link=Ernst Guillemin |date=1953 |url=https://archive.org/details/introductorycirc0000guil |publisher=Wiley |pages=366, 541 |isbn=0471330663}}</ref><ref>{{cite journal |last1=Forsythe |first1=George E. |last2=Motzkin |first2=Theodore |date=January 1952 |title=An Extension of Gauss' Transformation for Improving the Condition of Systems of Linear Equations |url=https://www.ams.org/journals/mcom/1952-06-037/S0025-5718-1952-0048162-0/S0025-5718-1952-0048162-0.pdf |journal=Mathematics of Computation |volume=6 |issue=37 |pages=18–34 |doi=10.1090/S0025-5718-1952-0048162-0 |access-date=3 October 2020 |doi-access=free}}</ref><ref>{{cite journal |last=Frank |first=Evelyn |date=1946 |title=On the zeros of polynomials with complex coefficients |journal=Bulletin of the American Mathematical Society |volume=52 |issue=2 |pages=144–157 |doi=10.1090/S0002-9904-1946-08526-2 |doi-access=free |url=https://projecteuclid.org/journals/bulletin-of-the-american-mathematical-society/volume-52/issue-2/On-the-zeros-of-polynomials-with-complex-coefficients/bams/1183507703.pdf}}</ref> is the equation obtained by equating the characteristic polynomial to zero.

In [[spectral graph theory]], the '''characteristic polynomial of a [[Graph (discrete mathematics)|graph]]''' is the characteristic polynomial of its [[adjacency matrix]].<ref>{{cite web |url=http://mathworld.wolfram.com/CharacteristicPolynomial.html |title=Characteristic Polynomial of a Graph – Wolfram MathWorld |access-date=August 26, 2011}}</ref>

==Motivation==
In [[linear algebra]], [[eigenvalues and eigenvectors]] play a fundamental role, since, given a [[linear transformation]], an eigenvector is a vector whose direction is not changed by the transformation, and the corresponding eigenvalue is the measure of the resulting change of magnitude of the vector.
More precisely, suppose the transformation is represented by a square matrix <math>A.</math> Then an eigenvector <math>\mathbf{v}</math> and the corresponding eigenvalue <math>\lambda</math> must satisfy the equation
<math display=block>A \mathbf{v} = \lambda \mathbf{v},</math>
or, equivalently (since <math>\lambda \mathbf{v} = \lambda I \mathbf{v}</math>),
<math display=block>(\lambda I - A) \mathbf{v} = \mathbf{0},</math>
where <math>I</math> is the [[identity matrix]], and <math>\mathbf{v}\ne \mathbf{0}</math> (although the zero vector satisfies this equation for every <math>\lambda,</math> it is not considered an eigenvector). It follows that the matrix <math>\lambda I - A</math> must be [[singular matrix|singular]], that is, its determinant must be zero:
<math display=block>\det(\lambda I - A) = 0.</math>
In other words, the eigenvalues of {{mvar|A}} are the [[zero of a function|roots]] of
<math display=block>\det(xI - A),</math>
which is a [[monic polynomial]] in {{mvar|x}} of degree {{mvar|n}} if {{mvar|A}} is an {{math|''n''×''n''}} matrix. This polynomial is the ''characteristic polynomial'' of {{mvar|A}}.

==Formal definition==
Consider an <math>n \times n</math> matrix <math>A.</math> The characteristic polynomial of <math>A,</math> denoted by <math>p_A(t),</math> is the polynomial defined by<ref>{{Cite book |title=Advanced linear algebra |author=Steven Roman |url=https://archive.org/details/springer_10.1007-978-1-4757-2178-2 |isbn=3540978372 |edition=2 |year=1992 |publisher=Springer |page=[https://archive.org/details/springer_10.1007-978-1-4757-2178-2/page/n142 137]}}</ref>
<math display=block>p_A(t) = \det (t I - A)</math>
where <math>I</math> denotes the <math>n \times n</math> [[identity matrix]].

Some authors define the characteristic polynomial to be <math>\det(A - t I).</math> That polynomial differs from the one defined here by a sign <math>(-1)^n,</math> so it makes no difference for properties like having the eigenvalues of <math>A</math> as roots; however, the definition above always gives a [[monic polynomial]], whereas the alternative definition is monic only when <math>n</math> is even.

==Examples==
To compute the characteristic polynomial of the matrix
<math display=block>A = \begin{pmatrix} 2 & 1 \\ -1 & 0 \end{pmatrix},</math>
the [[determinant]] of the following is computed:
<math display=block>t I - A = \begin{pmatrix} t-2 & -1 \\ 1 & t \end{pmatrix}</math>
and found to be <math>(t-2)t - (-1)\cdot 1 = t^2 - 2t + 1,</math> the characteristic polynomial of <math>A.</math>

Another example uses [[hyperbolic function]]s of a [[hyperbolic angle]] <math>\varphi.</math> For the matrix take
<math display=block>A = \begin{pmatrix} \cosh(\varphi) & \sinh(\varphi) \\ \sinh(\varphi) & \cosh(\varphi) \end{pmatrix}.</math>
Its characteristic polynomial is
<math display=block>\det (tI - A) = (t - \cosh(\varphi))^2 - \sinh^2(\varphi) = t^2 - 2 t \cosh(\varphi) + 1 = (t - e^\varphi)(t - e^{-\varphi}).</math>

==Properties==
The characteristic polynomial <math>p_A(t)</math> of an <math>n \times n</math> matrix is monic (its leading coefficient is <math>1</math>) and its degree is <math>n.</math> The most important fact about the characteristic polynomial was already mentioned in the motivational paragraph: the eigenvalues of <math>A</math> are precisely the [[Root of a function|roots]] of <math>p_A(t)</math> (this also holds for the [[Minimal polynomial (linear algebra)|minimal polynomial]] of <math>A,</math> but its degree may be less than <math>n</math>).
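These properties are easy to check numerically. The following Python sketch is a minimal illustration using [[NumPy]] and the first example above; it compares the roots of the characteristic polynomial with the eigenvalues of the matrix (<code>numpy.poly</code>, given a square matrix, returns the coefficients of its characteristic polynomial, highest degree first).

<syntaxhighlight lang="python">
import numpy as np

A = np.array([[2.0, 1.0],
              [-1.0, 0.0]])

coeffs = np.poly(A)           # coefficients of det(tI - A): [1, -2, 1], i.e. t^2 - 2t + 1
print(coeffs)
print(np.roots(coeffs))       # roots of p_A: approximately [1, 1]
print(np.linalg.eigvals(A))   # eigenvalues of A: approximately [1, 1] (a double eigenvalue)
</syntaxhighlight>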
All coefficients of the characteristic polynomial are [[polynomial expression]]s in the entries of the matrix. In particular, its constant coefficient (the coefficient of <math>t^0</math>) is <math>\det(-A) = (-1)^n \det(A),</math> the coefficient of <math>t^n</math> is one, and the coefficient of <math>t^{n-1}</math> is {{math|1=tr(−''A'') = −tr(''A'')}}, where {{math|tr(''A'')}} is the [[Trace (matrix)|trace]] of <math>A.</math> (The signs given here correspond to the formal definition given in the previous section; for the alternative definition these would instead be <math>\det(A)</math> and {{math|(−1)<sup>''n'' – 1 </sup>tr(''A'')}} respectively.<ref>Theorem 4 in these [https://www.math.ucla.edu/~tao/resource/general/115a.3.02f/week8.pdf lecture notes]</ref>)

For a <math>2 \times 2</math> matrix <math>A,</math> the characteristic polynomial is thus given by
<math display=block>t^2 - \operatorname{tr}(A) t + \det(A).</math>

Using the language of [[exterior algebra]], the characteristic polynomial of an <math>n \times n</math> matrix <math>A</math> may be expressed as
<math display=block>p_A (t) = \sum_{k=0}^n t^{n-k} (-1)^k \operatorname{tr}\left(\textstyle\bigwedge^k A\right)</math>
where <math display="inline">\operatorname{tr}\left(\bigwedge^k A\right)</math> is the [[Trace (linear algebra)|trace]] of the <math>k</math>th [[Exterior algebra#Functoriality|exterior power]] of <math>A,</math> which has dimension <math display="inline">\binom{n}{k}.</math> This trace may be computed as the sum of all [[principal minor]]s of <math>A</math> of size <math>k.</math> The recursive [[Faddeev–LeVerrier algorithm]] computes these coefficients more efficiently than summing the principal minors directly, using only matrix products and traces. When the [[Characteristic (algebra)|characteristic]] of the [[Field (mathematics)|field]] of the coefficients is <math>0,</math> each such trace may alternatively be computed as a single determinant, that of the <math>k \times k</math> matrix,
<math display=block>\operatorname{tr}\left(\textstyle\bigwedge^k A\right) = \frac{1}{k!} \begin{vmatrix} \operatorname{tr}A & k-1 &0&\cdots &0 \\ \operatorname{tr}A^2 &\operatorname{tr}A& k-2 &\cdots &0 \\ \vdots & \vdots & & \ddots & \vdots \\ \operatorname{tr}A^{k-1} &\operatorname{tr}A^{k-2}& & \cdots & 1 \\ \operatorname{tr}A^k &\operatorname{tr}A^{k-1}& & \cdots & \operatorname{tr}A \end{vmatrix}~.</math>

The [[Cayley–Hamilton theorem]] states that replacing <math>t</math> by <math>A</math> in the characteristic polynomial (interpreting the resulting powers as matrix powers, and the constant term <math>c</math> as <math>c</math> times the identity matrix) yields the zero matrix. Informally speaking, every matrix satisfies its own characteristic equation. This statement is equivalent to saying that the [[Minimal polynomial (linear algebra)|minimal polynomial]] of <math>A</math> divides the characteristic polynomial of <math>A.</math>

Two [[similar matrices]] have the same characteristic polynomial. The converse, however, is not true in general: two matrices with the same characteristic polynomial need not be similar. The matrix <math>A</math> and its [[transpose]] have the same characteristic polynomial.

A matrix <math>A</math> over a field <math>K</math> is similar to a [[triangular matrix]] [[if and only if]] its characteristic polynomial can be completely factored into linear factors over <math>K</math> (the same is true with the minimal polynomial instead of the characteristic polynomial). In this case <math>A</math> is similar to a matrix in [[Jordan normal form]].
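As a concrete illustration of these formulas, here is a minimal Python sketch of the Faddeev–LeVerrier recursion <math>M_k = A M_{k-1} + c_{n-k+1} I,</math> <math>c_{n-k} = -\tfrac{1}{k}\operatorname{tr}(A M_k)</math> (floating-point arithmetic is assumed; the function name is illustrative), together with a numerical check of the Cayley–Hamilton theorem on the <math>2 \times 2</math> example above.

<syntaxhighlight lang="python">
import numpy as np

def char_poly_coeffs(A):
    """Faddeev-LeVerrier: return c with p_A(t) = sum_i c[i] * t**i and c[n] = 1."""
    n = A.shape[0]
    c = np.zeros(n + 1)
    c[n] = 1.0
    M = np.zeros_like(A, dtype=float)            # M_0 = 0
    for k in range(1, n + 1):
        M = A @ M + c[n - k + 1] * np.eye(n)     # M_k = A M_{k-1} + c_{n-k+1} I
        c[n - k] = -np.trace(A @ M) / k          # c_{n-k} = -tr(A M_k) / k
    return c

A = np.array([[2.0, 1.0], [-1.0, 0.0]])
print(char_poly_coeffs(A))        # [1. -2. 1.], the coefficients of t^0, t^1, t^2

# Cayley-Hamilton check: p_A(A) = A^2 - 2A + I should be the zero matrix
print(A @ A - 2 * A + np.eye(2))
</syntaxhighlight>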
==Characteristic polynomial of a product of two matrices==
If <math>A</math> and <math>B</math> are two square <math>n \times n</math> matrices then the characteristic polynomials of <math>AB</math> and <math>BA</math> coincide:
<math display=block>p_{AB}(t)=p_{BA}(t).</math>

When <math>A</math> is [[Non-singular matrix|non-singular]] this result follows from the fact that <math>AB</math> and <math>BA</math> are [[Similar matrices|similar]]:
<math display=block>BA = A^{-1} (AB) A.</math>

For the case where both <math>A</math> and <math>B</math> are singular, the desired identity is an equality between polynomials in <math>t</math> and in the entries of the matrices. Thus, to prove this equality, it suffices to prove that it is verified on a non-empty [[open subset]] (for the usual [[Topological space|topology]], or, more generally, for the [[Zariski topology]]) of the space of all the coefficients. As the non-singular matrices form such an open subset of the space of all matrices, this proves the result.

More generally, if <math>A</math> is a matrix of order <math>m \times n</math> and <math>B</math> is a matrix of order <math>n \times m,</math> then <math>AB</math> is an <math>m \times m</math> matrix and <math>BA</math> is an <math>n \times n</math> matrix, and one has
<math display=block>p_{BA}(t) = t^{n-m} p_{AB}(t).</math>

To prove this, one may suppose <math>n > m,</math> by exchanging, if needed, <math>A</math> and <math>B.</math> Then, by bordering <math>A</math> on the bottom by <math>n - m</math> rows of zeros, and <math>B</math> on the right by <math>n - m</math> columns of zeros, one gets two <math>n \times n</math> matrices <math>A^{\prime}</math> and <math>B^{\prime}</math> such that <math>B^{\prime}A^{\prime} = BA</math> and <math>A^{\prime}B^{\prime}</math> is equal to <math>AB</math> bordered by <math>n - m</math> rows and columns of zeros. The result follows from the case of square matrices, by comparing the characteristic polynomials of <math>A^{\prime}B^{\prime}</math> and <math>AB.</math>
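A quick numerical check of the rectangular case is sketched below in Python (the sizes <math>m = 2,</math> <math>n = 3</math> and the random matrices are arbitrary choices); multiplying a polynomial by <math>t</math> corresponds to appending a zero to its highest-degree-first coefficient list.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))   # m x n
B = rng.standard_normal((3, 2))   # n x m

p_AB = np.poly(A @ B)             # degree m = 2, coefficients highest first
p_BA = np.poly(B @ A)             # degree n = 3

print(p_BA)                       # should match t^(n-m) * p_AB(t) ...
print(np.append(p_AB, 0.0))       # ... i.e. p_AB's coefficients with one zero appended
</syntaxhighlight>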
==Characteristic polynomial of ''A''<sup>''k''</sup>==
If <math>\lambda</math> is an eigenvalue of a square matrix <math>A</math> with eigenvector <math>\mathbf{v},</math> then <math>\lambda^k</math> is an eigenvalue of <math>A^k</math> because
<math display=block>A^k \textbf{v} = A^{k-1} A \textbf{v} = \lambda A^{k-1} \textbf{v} = \dots = \lambda^k \textbf{v}.</math>

The multiplicities can be shown to agree as well, and this generalizes to any polynomial in place of <math>t^k</math>:<ref>{{Cite book |last1=Horn |first1=Roger A. |last2=Johnson |first2=Charles R. |title=Matrix Analysis |publisher=[[Cambridge University Press]] |isbn=978-0-521-54823-6 |year=2013 |edition=2nd |at=pp. 108–109, Section 2.4.2}}</ref>

{{math theorem | name = Theorem | math_statement = Let <math>A</math> be a square <math>n \times n</math> matrix and let <math>f(t)</math> be a polynomial. If the characteristic polynomial of <math>A</math> has a factorization
<math display=block>p_A(t) = (t - \lambda_1) (t - \lambda_2) \cdots (t-\lambda_n)</math>
then the characteristic polynomial of the matrix <math>f(A)</math> is given by
<math display=block>p_{f(A)}(t) = (t - f(\lambda_1)) (t - f(\lambda_2)) \cdots (t-f(\lambda_n)).</math>}}

That is, the algebraic multiplicity of <math>\lambda</math> in <math>f(A)</math> equals the sum of algebraic multiplicities of <math>\lambda'</math> in <math>A</math> over <math>\lambda'</math> such that <math>f(\lambda') = \lambda.</math> In particular, <math>\operatorname{tr}(f(A)) = \textstyle\sum_{i=1}^n f(\lambda_i)</math> and <math>\operatorname{det}(f(A)) = \textstyle\prod_{i=1}^n f(\lambda_i).</math> Here a polynomial <math>f(t) = t^3+1,</math> for example, is evaluated on a matrix <math>A</math> simply as <math>f(A) = A^3+I.</math>

The theorem applies to matrices and polynomials over any field or [[commutative ring]].<ref>{{Cite book |last=Lang |first=Serge |url=https://www.worldcat.org/oclc/852792828 |title=Algebra |publisher=Springer |year=1993 |isbn=978-1-4613-0041-0 |location=New York |oclc=852792828 |at=p. 567, Theorem 3.10}}</ref> However, the assumed factorization of <math>p_A(t)</math> into linear factors does not always exist, unless the matrix is over an [[algebraically closed field]] such as the complex numbers.

{{math proof|proof=
This proof only applies to matrices and polynomials over complex numbers (or any algebraically closed field). In that case, the characteristic polynomial of any square matrix can always be factored as
<math display=block>p_A(t) = \left(t - \lambda_1\right) \left(t - \lambda_2\right) \cdots \left(t - \lambda_n\right)</math>
where <math>\lambda_1, \lambda_2, \ldots, \lambda_n</math> are the eigenvalues of <math>A,</math> possibly repeated. Moreover, the [[Jordan normal form|Jordan decomposition theorem]] guarantees that any square matrix <math>A</math> can be decomposed as <math>A = S^{-1} U S,</math> where <math>S</math> is an [[invertible matrix]] and <math>U</math> is [[upper triangular]] with <math>\lambda_1, \ldots, \lambda_n</math> on the diagonal (with each eigenvalue repeated according to its algebraic multiplicity). (The Jordan normal form has stronger properties, but these are sufficient; alternatively, the [[Schur decomposition]] can be used, which is less popular but somewhat easier to prove.)

Let <math display="inline">f(t) = \sum_i \alpha_i t^i.</math> Then
<math display=block>f(A) = \textstyle\sum \alpha_i (S^{-1} U S)^i = \textstyle\sum \alpha_i S^{-1} U S S^{-1} U S \cdots S^{-1} U S = \textstyle\sum \alpha_i S^{-1} U^i S = S^{-1} \left(\textstyle\sum \alpha_i U^i\right) S = S^{-1} f(U) S.</math>

For an upper triangular matrix <math>U</math> with diagonal <math>\lambda_1, \dots, \lambda_n,</math> the matrix <math>U^i</math> is upper triangular with diagonal <math>\lambda_1^i, \dots, \lambda_n^i,</math> and hence <math>f(U)</math> is upper triangular with diagonal <math>f\left(\lambda_1\right), \dots, f\left(\lambda_n\right).</math> Therefore, the eigenvalues of <math>f(U)</math> are <math>f(\lambda_1), \dots, f(\lambda_n).</math> Since <math>f(A) = S^{-1} f(U) S</math> is [[Similar matrix|similar]] to <math>f(U),</math> it has the same eigenvalues, with the same algebraic multiplicities.
}}
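The theorem can also be checked numerically; the following Python sketch uses the running <math>2 \times 2</math> example and <math>f(t) = t^3 + 1,</math> with agreement expected only up to floating-point rounding.

<syntaxhighlight lang="python">
import numpy as np

A = np.array([[2.0, 1.0], [-1.0, 0.0]])

def f(M):
    """f(t) = t^3 + 1 evaluated on a matrix: M^3 + I."""
    return np.linalg.matrix_power(M, 3) + np.eye(M.shape[0])

lam = np.linalg.eigvals(A)                 # eigenvalues of A: approximately [1, 1]
print(np.sort(lam**3 + 1))                 # f applied to the eigenvalues: approximately [2, 2]
print(np.sort(np.linalg.eigvals(f(A))))    # eigenvalues of f(A): also approximately [2, 2]
</syntaxhighlight>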
==Secular function and secular equation==

===Secular function===
The term '''secular function''' has been used for what is now called ''characteristic polynomial'' (in some literature the term secular function is still used). The term comes from the fact that the characteristic polynomial was used to calculate [[Secular phenomena|secular perturbations]] (on a time scale of a century, that is, slow compared to annual motion) of planetary orbits, according to [[Joseph Louis Lagrange|Lagrange]]'s theory of oscillations.

===Secular equation===
''Secular equation'' may have several meanings.
* In [[linear algebra]] it is sometimes used in place of characteristic equation.
* In [[astronomy]] it is the algebraic or numerical expression of the magnitude of the inequalities in a planet's motion that remain after the inequalities of a short period have been allowed for.<ref>{{cite web |url=http://dict.die.net/secular%20equation/ |title=secular equation |access-date=January 21, 2010}}</ref>
* In [[molecular orbital]] calculations relating to the energy of the electron and its wave function it is also used instead of the characteristic equation.

==For general associative algebras==
The above definition of the characteristic polynomial of a matrix <math>A \in M_n(F)</math> with entries in a field <math>F</math> generalizes without any changes to the case when <math>F</math> is just a [[commutative ring]]. {{harvtxt|Garibaldi|2004}} defines the characteristic polynomial for elements of an arbitrary finite-dimensional ([[associative algebra|associative]], but not necessarily commutative) algebra over a field <math>F</math> and proves the standard properties of the characteristic polynomial in this generality.

==See also==
* [[Characteristic equation (disambiguation)]]
* [[Invariants of tensors]]
* [[Companion matrix]]
* [[Faddeev–LeVerrier algorithm]]
* [[Cayley–Hamilton theorem]]
* [[Samuelson–Berkowitz algorithm]]

==References==
{{reflist}}
* T.S. Blyth & E.F. Robertson (1998) ''Basic Linear Algebra'', p. 149, Springer {{ISBN|3-540-76122-5}}.
* John B. Fraleigh & Raymond A. Beauregard (1990) ''Linear Algebra'' 2nd edition, p. 246, [[Addison-Wesley]] {{ISBN|0-201-11949-8}}.
* {{Citation |last=Garibaldi |first=Skip |author-link=Skip Garibaldi |title=The characteristic polynomial and determinant are not ad hoc constructions |journal=American Mathematical Monthly |volume=111 |year=2004 |issue=9 |pages=761–778 |doi=10.2307/4145188 |jstor=4145188 |mr=2104048 |arxiv=math/0203276}}
* Werner Greub (1974) ''Linear Algebra'' 4th edition, pp. 120–125, Springer {{ISBN|0-387-90110-8}}.
* Paul C. Shields (1980) ''Elementary Linear Algebra'' 3rd edition, p. 274, [[Worth Publishers]] {{ISBN|0-87901-121-1}}.
* [[Gilbert Strang]] (1988) ''Linear Algebra and Its Applications'' 3rd edition, p. 246, [[Brooks/Cole]] {{ISBN|0-15-551005-3}}.

[[Category:Polynomials]]
[[Category:Linear algebra]]
[[Category:Tensors]]