Companion matrix
{{Short description|Square matrix constructed from a monic polynomial}} In [[linear algebra]], the [[Ferdinand Georg Frobenius|Frobenius]] '''companion matrix''' of the [[monic polynomial]] <math display="block"> p(x)=c_0 + c_1 x + \cdots + c_{n-1}x^{n-1} + x^n </math> is the [[square matrix]] defined as <math display="block">C(p)=\begin{bmatrix} 0 & 0 & \dots & 0 & -c_0 \\ 1 & 0 & \dots & 0 & -c_1 \\ 0 & 1 & \dots & 0 & -c_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \dots & 1 & -c_{n-1} \end{bmatrix}.</math> Some authors use the [[transpose]] of this matrix, <math> C(p)^T </math>, which is more convenient for some purposes such as linear [[recurrence relation]]s ([[#Linear recursive sequences|see below]]). <math> C(p) </math> is defined from the coefficients of <math>p(x)</math>, while the [[characteristic polynomial]] as well as the [[minimal polynomial (linear algebra)|minimal polynomial]] of <math> C(p) </math> are equal to <math>p(x) </math>.<ref> {{Cite book |last=Horn |first=Roger A. |url=https://books.google.com/books?id=f6_r93Of544C&dq=%22companion+matrix%22&pg=PA147 |title=Matrix Analysis |author2=Charles R. Johnson |publisher=Cambridge University Press |year=1985 |isbn=0-521-30586-1 |location=Cambridge, UK |pages=146–147 |accessdate=2010-02-10}}</ref> In this sense, the matrix <math> C(p) </math> and the polynomial <math>p(x)</math> are "companions". ==Similarity to companion matrix== Any matrix {{mvar|A}} with entries in a [[field (mathematics)|field]] {{mvar|F}} has characteristic polynomial <math> p(x) = \det(xI - A) </math>, which in turn has companion matrix <math> C(p) </math>. These matrices are related as follows. The following statements are equivalent: * ''A'' is [[similar (linear algebra)|similar]] over ''F'' to <math> C(p) </math>, i.e. 
''A'' can be conjugated to its companion matrix by matrices in GL''<sub>n</sub>''(''F''); * the characteristic polynomial <math> p(x) </math> coincides with the minimal polynomial of ''A'', i.e. the minimal polynomial has degree ''n''; * the linear mapping <math>A:F^n\to F^n</math> makes <math>F^n</math> a [[Cyclic module|cyclic]] <math>F[A]</math>-module, having a basis of the form <math>\{v,Av,\ldots,A^{n-1}v\} </math>; or equivalently <math>F^n \cong F[x]/(p(x))</math> as <math>F[A]</math>-modules. If the above hold, one says that ''A'' is ''non-derogatory''. Not every square matrix is similar to a companion matrix, but every square matrix is similar to a [[block diagonal]] matrix made of companion matrices. If we also demand that the polynomial of each diagonal block divides the next one, they are uniquely determined by ''A'', and this gives the [[Frobenius normal form|rational canonical form]] of ''A''. ==Diagonalizability== The roots of the characteristic polynomial <math>p(x)</math> are the [[eigenvalue]]s of <math> C(p) </math>. If there are ''n'' distinct eigenvalues <math>\lambda_1,\ldots,\lambda_n</math>, then <math> C(p) </math> is [[diagonalizable]] as <math> C(p) = V^{-1}\!D V</math>, where ''D'' is the diagonal matrix and ''V'' is the [[Vandermonde matrix]] corresponding to the {{mvar|λ}}'s: <math display="block"> D = \begin{bmatrix} \lambda_1 & 0 & \!\!\!\cdots\!\!\! & 0\\ 0 & \lambda_2 & \!\!\!\cdots\!\!\! & 0\\ 0 & 0 & \!\!\!\cdots\!\!\! & \lambda_n \end{bmatrix}, \qquad V = \begin{bmatrix} 1 & \lambda_1 & \lambda_1^2 & \!\!\!\cdots\!\!\! & \lambda_1^{n-1}\\ 1 & \lambda_2 & \lambda_2^2 & \!\!\!\cdots\!\!\! & \lambda_2^{n-1}\\[-1em] \vdots & \vdots & \vdots & \!\!\!\ddots\!\!\! &\vdots \\ 1 & \lambda_n & \lambda_n^2 & \!\!\!\cdots\!\!\! & \lambda_n^{n-1} \end{bmatrix}. 
</math> Indeed, a straightforward computation shows that the transpose <math> C(p)^T </math> has eigenvectors <math> v_i=(1,\lambda_i,\ldots,\lambda_i ^{n-1}) </math> with <math> C(p)^T\!(v_i) = \lambda_i v_i </math>, which follows from <math> p(\lambda_i)=c_0+c_1\lambda_i+\cdots+c_{n-1}\lambda_i^{n-1}+\lambda_i ^n = 0 </math>. Thus, its diagonalizing [[change of basis]] matrix is <math> V^T = [v_1^T \ldots v_n^T] </math>, meaning <math> C(p)^T = V^T D\, (V^T) ^{-1} </math>, and taking the transpose of both sides gives <math> C(p) = V^{-1}\!D V</math>. We can read the eigenvectors of <math> C(p) </math> with <math> C(p)(w_i) = \lambda_i w_i </math> from the equation <math> C(p) = V^{-1}\!D V</math>: they are the column vectors of the [[Vandermonde matrix#Inverse Vandermonde Matrix|inverse Vandermonde matrix]] <math> V^{-1} = [w_1^T \cdots w_n^T] </math>. This matrix is known explicitly, giving the eigenvectors <math> w_i = (L_{0i}, \ldots, L_{(n-1)i}) </math>, with coordinates equal to the coefficients of the [[Lagrange polynomial|Lagrange polynomials]] <math display="block"> L_i(x) = L_{0i} + L_{1i}x + \cdots + L_{(n-1)i}x^{n-1} = \prod_{j\neq i} \frac{x-\lambda_j}{\lambda_i-\lambda_j} = \frac{p(x)}{(x-\lambda_i)\,p'(\lambda_i)}. </math> Alternatively, the scaled eigenvectors <math> \tilde w_i = p'\!(\lambda_i)\,w_i </math> have simpler coefficients. If <math> p(x) </math> has multiple roots, then <math> C(p) </math> is not diagonalizable. Rather, the [[Jordan canonical form]] of <math> C(p) </math> contains one [[Jordan block]] for each distinct root; if the multiplicity of the root is ''m'', then the block is an ''m'' × ''m'' matrix with <math> \lambda </math> on the diagonal and 1 in the entries just above the diagonal. In this case, ''V'' becomes a [[Vandermonde matrix#Confluent Vandermonde matrices|confluent Vandermonde matrix]].<ref> {{cite book | last1 = Turnbull | first1 = H. W. | last2 = Aitken | first2 = A. C. 
| date = 1961 | title = An Introduction to the Theory of Canonical Matrices | location = New York | publisher = Dover | page = 60 | isbn = 978-0486441689 }} </ref> ==Linear recursive sequences== A [[linear recursive sequence]] defined by <math>a_{k+n} = - c_0 a_k - c_1 a_{k+1} - \cdots - c_{n-1} a_{k+n-1}</math> for <math>k \geq 0</math> has the characteristic polynomial <math>p(x)=c_0 + c_1 x + \cdots + c_{n-1}x^{n-1} + x^n </math>, whose transpose companion matrix <math> C(p)^T </math> generates the sequence: <math display="block">\begin{bmatrix}a_{k+1}\\ a_{k+2}\\ \vdots \\ a_{k+n-1}\\ a_{k+n} \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0\\ 0 & 0 & 1 & \cdots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1\\ -c_0 & -c_1 & -c_2 & \cdots & -c_{n-1} \end{bmatrix} \begin{bmatrix}a_k\\ a_{k+1}\\ \vdots \\ a_{k+n-2}\\ a_{k+n-1} \end{bmatrix} .</math> The vector <math>v=(1,\lambda,\lambda^2,\ldots,\lambda^{n-1})</math> is an eigenvector of this matrix, where the eigenvalue <math>\lambda</math> is a root of <math>p(x)</math>. Setting the initial values of the sequence equal to this vector produces a geometric sequence <math>a_k = \lambda^k</math> which satisfies the recurrence. In the case of ''n'' distinct eigenvalues, an arbitrary solution <math>a_k </math> can be written as a linear combination of such geometric solutions, and the eigenvalues of largest modulus give an [[asymptotic approximation]]. ==From linear ODE to first-order linear ODE system== Similarly to the above case of linear recursions, consider a homogeneous [[linear ODE]] of order ''n'' for the scalar function <math>y = y(t)</math>: <math display="block"> y^{(n)} + c_{n-1}y^{(n-1)} + \dots + c_{1}y^{(1)} + c_0 y = 0 . 
</math> This can be equivalently described as a coupled system of homogeneous linear ODEs of order 1 for the vector function <math>z(t) = (y(t),y'(t),\ldots,y^{(n-1)}(t))</math>: <math display="block">z' = C(p)^T z</math> where <math>C(p)^T</math> is the transpose companion matrix for the characteristic polynomial <math display="block">p(x)=x^n + c_{n-1}x^{n-1} + \cdots + c_1 x + c_0 .</math> Here the coefficients <math>c_i = c_i(t)</math> may also be functions, not just constants. If <math>C(p)^T</math> is diagonalizable, then a diagonalizing change of basis will transform this into a decoupled system equivalent to one scalar homogeneous first-order linear ODE in each coordinate. An inhomogeneous equation <math display="block"> y^{(n)} + c_{n-1}y^{(n-1)} + \dots + c_{1}y^{(1)} + c_0 y = f(t) </math> is equivalent to the system: <math display="block"> z' = C(p)^T z + F(t) </math> with the inhomogeneity term <math>F(t) = (0,\ldots,0,f(t)) </math>. Again, a diagonalizing change of basis will transform this into a decoupled system of scalar inhomogeneous first-order linear ODEs. == Cyclic shift matrix == In the case of <math>p(x)=x^n-1</math>, when the eigenvalues are the complex [[Root of unity|roots of unity]], the companion matrix and its transpose both reduce to Sylvester's cyclic [[Generalizations of Pauli matrices#Construction: The clock and shift matrices|shift matrix]], a [[circulant matrix]]. == Multiplication map on a simple field extension == Consider a polynomial <math>p(x)=x^n + c_{n-1}x^{n-1} + \cdots + c_1 x + c_0 </math> with coefficients in a [[Field (mathematics)|field]] <math>F</math>, and suppose <math>p(x)</math> is [[Irreducible polynomial|irreducible]] in the [[polynomial ring]] <math>F[x]</math>. 
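The recurrence construction of the earlier sections can be checked with a short numerical sketch (an illustrative Python/NumPy example, not part of the article): the transpose companion matrix of <math>p(x)=x^2-x-1</math> advances the Fibonacci recurrence <math>a_{k+2}=a_{k+1}+a_k</math> one step per multiplication, and its eigenvalue of largest modulus is the golden ratio governing the asymptotic growth.

```python
# Illustrative sketch: the transpose companion matrix of p(x) = x^2 - x - 1
# (so c0 = c1 = -1) generates the Fibonacci sequence.
import numpy as np

CT = np.array([[0, 1],    # first row shifts the state window
               [1, 1]])   # last row is (-c0, -c1) = (1, 1), the recurrence

state = np.array([0, 1])  # initial values (a_0, a_1)
seq = []
for _ in range(8):
    seq.append(int(state[0]))
    state = CT @ state    # (a_k, a_{k+1}) -> (a_{k+1}, a_{k+2})
print(seq)                # [0, 1, 1, 2, 3, 5, 8, 13]

# The eigenvalue of largest modulus is the golden ratio (1 + sqrt(5))/2,
# the root of p(x) that dominates the asymptotic growth of the sequence.
phi = max(np.linalg.eigvals(CT).real)
assert abs(phi - (1 + 5**0.5) / 2) < 1e-12
```

The same loop with the n × n transpose companion matrix advances any order-''n'' linear recurrence with constant coefficients.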
Then [[Simple extension|adjoining a root]] <math>\lambda</math> of <math>p(x)</math> produces a [[field extension]] <math>K=F(\lambda) \cong F[x]/(p(x))</math>, which is also a vector space over <math>F</math> with standard basis <math>\{1,\lambda,\lambda^2,\ldots,\lambda^{n-1}\} </math>. Then the <math>F</math>-linear multiplication mapping {{ block indent | em = 1.5 | text = <math>m_{\lambda}:K\to K</math> defined by <math>m_\lambda(\alpha) = \lambda\alpha</math> }} has an ''n'' × ''n'' matrix <math>[m_\lambda]</math> with respect to the standard basis. Since <math>m_\lambda(\lambda^i) = \lambda^{i+1}</math> and <math>m_\lambda(\lambda^{n-1}) = \lambda^n = -c_0-\cdots-c_{n-1}\lambda^{n-1}</math>, this is the companion matrix of <math>p(x)</math>: <math display="block">[m_\lambda] = C(p).</math> Assuming this extension is [[Separable extension|separable]] (for example if <math>F</math> has [[characteristic zero]] or is a [[finite field]]), <math>p(x)</math> has distinct roots <math>\lambda_1,\ldots,\lambda_n </math> with <math>\lambda_1 = \lambda</math>, so that <math display="block">p(x)=(x-\lambda_1)\cdots (x-\lambda_n),</math> and it has [[splitting field]] <math>L = F(\lambda_1,\ldots,\lambda_n)</math>. Now <math>m_\lambda</math> is not diagonalizable over <math>F</math>; rather, we must [[Extension of scalars|extend]] it to an <math>L</math>-linear map on <math>L^n \cong L\otimes_F K</math>, a vector space over <math>L</math> with standard basis <math>\{1{\otimes} 1, \, 1{\otimes}\lambda, \, 1{\otimes}\lambda^2,\ldots,1{\otimes}\lambda^{n-1}\} </math>, containing vectors <math>w = (\beta_1,\ldots,\beta_n) = \beta_1{\otimes} 1+\cdots+\beta_n{\otimes} \lambda^{n-1} </math>. The extended mapping is defined by <math>m_\lambda(\beta\otimes\alpha) = \beta\otimes(\lambda\alpha)</math>. 
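As a concrete numerical illustration (a Python/NumPy sketch, not part of the article's formalism): for <math>p(x)=x^2-2</math>, irreducible over <math>\mathbb{Q}</math>, the matrix of multiplication by <math>\lambda=\sqrt{2}</math> in the basis <math>\{1,\lambda\}</math> equals <math>C(p)</math>, and over the splitting field (here realized numerically in <math>\mathbb{R}</math>) it is diagonalized by the Vandermonde matrix of the roots <math>\pm\sqrt{2}</math>.

```python
# Illustrative sketch for p(x) = x^2 - 2, lam = sqrt(2).
import numpy as np

lam = np.sqrt(2.0)
# Column j of [m_lam] holds the coordinates of lam * lam**j in the basis {1, lam}:
#   m(1)   = lam        = 0*1 + 1*lam
#   m(lam) = lam**2 = 2 = 2*1 + 0*lam
M = np.array([[0.0, 2.0],
              [1.0, 0.0]])

# Companion matrix of p(x) = x^2 + 0*x - 2, i.e. c0 = -2, c1 = 0:
C = np.array([[0.0, -(-2.0)],
              [1.0, -0.0]])
print(np.array_equal(M, C))   # True: [m_lam] = C(p)

# Diagonalization C(p) = V^{-1} D V by the Vandermonde matrix of the roots:
roots = np.array([lam, -lam])
V = np.vander(roots, increasing=True)   # rows (1, lam_i)
assert np.allclose(C, np.linalg.inv(V) @ np.diag(roots) @ V)
```

The same check works for any monic polynomial with distinct roots, with `np.vander(roots, increasing=True)` supplying the matrix ''V'' of the Diagonalizability section.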
The matrix <math>[m_\lambda] = C(p)</math> is unchanged, but as above, it can be diagonalized by matrices with entries in <math>L</math>: <math display="block">[m_\lambda]=C(p)= V^{-1}\! D V,</math> for the diagonal matrix <math>D=\operatorname{diag}(\lambda_1,\ldots,\lambda_n)</math> and the [[Vandermonde matrix]] ''V'' corresponding to <math>\lambda_1,\ldots,\lambda_n\in L </math>. The explicit formula for the eigenvectors (the scaled column vectors of the [[Vandermonde matrix#Inverse Vandermonde matrix|inverse Vandermonde matrix]] <math>V^{-1}</math>) can be written as: <math display="block">\tilde w_i = \beta_{0i} {\otimes} 1+\beta_{1i} {\otimes} \lambda +\cdots + \beta_{(n-1)i} {\otimes} \lambda^{n-1} = \prod_{j\neq i} (1{\otimes}\lambda - \lambda_j{\otimes} 1) </math> where <math>\beta_{ij}\in L </math> are the coefficients of the scaled Lagrange polynomial <math display="block">\frac{p(x)}{x-\lambda_i} = \prod_{j\neq i} (x - \lambda_j) = \beta_{0i} + \beta_{1i} x + \cdots + \beta_{(n-1)i} x^{n-1}.</math> ==See also== *[[Frobenius endomorphism]] *[[Cayley–Hamilton theorem]] *[[Krylov subspace]] ==Notes== {{reflist}} {{Matrix classes}} {{DEFAULTSORT:Companion Matrix}} [[Category:Matrices (mathematics)]] [[Category:Matrix theory]] [[Category:Matrix normal forms]]