==Proof methods==

A common thread in many proofs is the [[Brouwer fixed point theorem]]. Another popular method is that of Wielandt (1950), who used the [[Lothar Collatz|Collatz]]–Wielandt formula described above to extend and clarify Frobenius's work.<ref>{{harvnb|Gantmacher|2000|p=[https://books.google.com/books?id=cyX32q8ZP5cC&dq=Applications%20of%20the%20theory%20of%20matrices&pg=PA54 section XIII.2.2 page 54]}}</ref> Another proof is based on [[spectral theory]],<ref name="Smith"/> from which part of the arguments below are borrowed.

===Perron root is strictly maximal eigenvalue for positive (and primitive) matrices===

If ''A'' is a positive (or more generally primitive) matrix, then there exists a real positive eigenvalue ''r'' (the Perron–Frobenius eigenvalue or Perron root) which is strictly greater in absolute value than all other eigenvalues; hence ''r'' is the [[spectral radius]] of ''A''. This statement does not hold for general non-negative irreducible matrices, which have ''h'' eigenvalues with the same absolute value as ''r'', where ''h'' is the period of ''A''.

====Proof for positive matrices====

Let ''A'' be a positive matrix and assume that its spectral radius ρ(''A'') = 1 (otherwise consider ''A''/ρ(''A'')). Then there exists an eigenvalue λ on the unit circle, and all the other eigenvalues are less than or equal to 1 in absolute value. Suppose that some eigenvalue λ ≠ 1 also falls on the unit circle. Then there exists a positive integer ''m'' such that ''A<sup>m</sup>'' is a positive matrix and the real part of λ<sup>''m''</sup> is negative. Let ε be half the smallest diagonal entry of ''A<sup>m</sup>'' and set ''T'' = ''A<sup>m</sup>'' − ε''I'', which is yet another positive matrix. Moreover, if ''Ax'' = λ''x'' then ''A<sup>m</sup>x'' = λ<sup>''m''</sup>''x'', thus λ<sup>''m''</sup> − ε is an eigenvalue of ''T''. Because of the choice of ''m'' this point lies outside the unit disk, and consequently ρ(''T'') > 1. On the other hand, all the entries in ''T'' are positive and less than or equal to those in ''A<sup>m</sup>'', so by [[spectral radius|Gelfand's formula]] ρ(''T'') ≤ ρ(''A<sup>m</sup>'') ≤ ρ(''A'')<sup>''m''</sup> = 1. This contradiction means that λ = 1 and that there can be no other eigenvalues on the unit circle.

Exactly the same arguments can be applied to the case of primitive matrices; one only needs the following simple lemma, which clarifies the properties of primitive matrices.

====Lemma====

Given a non-negative matrix ''A'', assume there exists ''m'' such that ''A<sup>m</sup>'' is positive; then ''A''<sup>''m''+1</sup>, ''A''<sup>''m''+2</sup>, ''A''<sup>''m''+3</sup>, ... are all positive.

Indeed, ''A''<sup>''m''+1</sup> = ''AA''<sup>''m''</sup>, so it can have a zero entry only if some row of ''A'' is entirely zero; but in that case the same row of ''A<sup>m</sup>'' would be zero, contradicting the positivity of ''A<sup>m</sup>''. Applying the same arguments as above to primitive matrices proves the main claim.

===Power method and the positive eigenpair===

For a positive (or more generally irreducible non-negative) matrix ''A'' the dominant [[eigenvector]] is real and strictly positive (for non-negative ''A'', respectively, non-negative). This can be established using the [[power method]], which states that for a sufficiently generic (in the sense below) matrix ''A'' the sequence of vectors ''b''<sub>''k''+1</sub> = ''Ab''<sub>''k''</sub> / |''Ab''<sub>''k''</sub>| converges to the [[eigenvector]] with the maximum [[eigenvalue]]. (The initial vector ''b''<sub>0</sub> can be chosen arbitrarily, except for some set of measure zero.) Starting with a non-negative vector ''b''<sub>0</sub> produces a sequence of non-negative vectors ''b<sub>k</sub>'', hence the limiting vector is also non-negative. By the power method this limiting vector is the dominant eigenvector for ''A'', proving the assertion. The corresponding eigenvalue is non-negative.

The proof requires two additional arguments. First, the power method converges for matrices which do not have several eigenvalues of the same absolute value as the maximal one; the previous section's argument guarantees this. Second, strict positivity of all components of the eigenvector must be ensured in the case of irreducible matrices. This follows from the next fact, which is of independent interest:

:Lemma: given a positive (or more generally irreducible non-negative) matrix ''A'' and a non-negative eigenvector ''v'' for ''A'', the eigenvector is necessarily strictly positive and the corresponding eigenvalue is also strictly positive.

Proof. One of the definitions of irreducibility for non-negative matrices is that for all indexes ''i'', ''j'' there exists ''m'' such that (''A''<sup>''m''</sup>)<sub>''ij''</sub> is strictly positive. Given a non-negative eigenvector ''v'' with eigenvalue ''r'', at least one of its components, say the ''i''-th, is strictly positive. First, the eigenvalue is strictly positive: taking ''n'' such that (''A''<sup>''n''</sup>)<sub>''ii''</sub> > 0 gives ''r''<sup>''n''</sup>''v''<sub>''i''</sub> = (''A''<sup>''n''</sup>''v'')<sub>''i''</sub> ≥ (''A''<sup>''n''</sup>)<sub>''ii''</sub>''v''<sub>''i''</sub> > 0, hence ''r'' > 0. Next, the eigenvector is strictly positive: for any index ''j'', take ''m'' such that (''A''<sup>''m''</sup>)<sub>''ji''</sub> > 0; then ''r''<sup>''m''</sup>''v''<sub>''j''</sub> = (''A''<sup>''m''</sup>''v'')<sub>''j''</sub> ≥ (''A''<sup>''m''</sup>)<sub>''ji''</sub>''v''<sub>''i''</sub> > 0, hence ''v''<sub>''j''</sub> > 0.
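For illustration, the following is a minimal numerical sketch of the power iteration described above, written in Python with NumPy; the example matrix, tolerance, and iteration cap are arbitrary choices made for the demonstration, not part of the theorem.

<syntaxhighlight lang="python">
import numpy as np

def power_method(A, tol=1e-12, max_iter=10_000):
    """Iterate b_{k+1} = A b_k / |A b_k| from a non-negative start vector."""
    b = np.ones(A.shape[0])              # non-negative initial vector b_0
    for _ in range(max_iter):
        Ab = A @ b
        b_next = Ab / np.linalg.norm(Ab)
        if np.linalg.norm(b_next - b) < tol:
            break
        b = b_next
    r = b @ A @ b / (b @ b)              # Rayleigh quotient approximates r
    return r, b

A = np.array([[2.0, 1.0], [1.0, 3.0]])   # an arbitrary positive matrix
r, v = power_method(A)
print(r)   # Perron root, here (5 + sqrt(5))/2 ≈ 3.618
print(v)   # the limiting eigenvector is strictly positive
</syntaxhighlight>

As the lemma predicts, every component of the computed eigenvector is strictly positive.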
===Multiplicity one===

This section proves that the Perron–Frobenius eigenvalue is a simple root of the characteristic polynomial of the matrix. Hence the eigenspace associated to the Perron–Frobenius eigenvalue ''r'' is one-dimensional. The arguments here are close to those in Meyer.<ref name="Meyer">{{harvnb|Meyer|2000|pp=[http://www.matrixanalysis.com/Chapter8.pdf chapter 8 page 665] {{cite web |url=http://www.matrixanalysis.com/Chapter8.pdf |title=Archived copy |access-date=2010-03-07 |url-status=dead |archive-url=https://web.archive.org/web/20100307021652/http://www.matrixanalysis.com/Chapter8.pdf |archive-date=March 7, 2010 }}}}</ref>

Given a strictly positive eigenvector ''v'' corresponding to ''r'', suppose ''w'' is another eigenvector with the same eigenvalue. (The vectors ''v'' and ''w'' can be chosen to be real, because ''A'' and ''r'' are both real, so the null space of ''A'' − ''rI'' has a basis consisting of real vectors.) Assume at least one of the components of ''w'' is positive (otherwise multiply ''w'' by −1). Given the maximal possible α such that ''u'' = ''v'' − α''w'' is non-negative, one of the components of ''u'' is zero, otherwise α would not be maximal. The vector ''u'' is an eigenvector. It is non-negative, hence by the lemma described in the [[#Power method and the positive eigenpair|previous section]] it is strictly positive. On the other hand, as noted, at least one component of ''u'' is zero. The contradiction implies that ''w'' does not exist.

Case: there are no Jordan blocks corresponding to the Perron–Frobenius eigenvalue ''r'' or to the other eigenvalues of the same absolute value. If there were a Jordan block, the [[Matrix norm#Induced norm|infinity norm]] ‖(''A''/''r'')<sup>''k''</sup>‖<sub>∞</sub> would tend to infinity for ''k'' → ∞, but that contradicts the existence of the positive eigenvector. Assume ''r'' = 1 (otherwise consider ''A''/''r''). Letting ''v'' be a Perron–Frobenius strictly positive eigenvector, so ''Av'' = ''v'', then:

:<math> \|v\|_{\infty}= \|A^k v\|_{\infty} \ge \|A^k\|_{\infty} \min_i (v_i), ~~\Rightarrow~~ \|A^k\|_{\infty} \le \|v\|_{\infty}/\min_i (v_i). </math>

So ‖''A''<sup>''k''</sup>‖<sub>∞</sub> is bounded for all ''k''. This gives another proof that there are no eigenvalues of greater absolute value than the Perron–Frobenius one. It also rules out the existence of a Jordan block for any eigenvalue of absolute value equal to 1 (in particular for the Perron–Frobenius one), because the existence of a Jordan block implies that ‖''A''<sup>''k''</sup>‖<sub>∞</sub> is unbounded. For a two-by-two Jordan block:

:<math> J^k= \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix} ^k = \begin{pmatrix} \lambda^k & k\lambda^{k-1} \\ 0 & \lambda^k \end{pmatrix}, </math>

hence ‖''J''<sup>''k''</sup>‖<sub>∞</sub> = ''k'' + 1 (for |λ| = 1), which tends to infinity with ''k''. Since ''J''<sup>''k''</sup> = ''C''<sup>−1</sup>''A''<sup>''k''</sup>''C'', one has ‖''A''<sup>''k''</sup>‖<sub>∞</sub> ≥ ‖''J''<sup>''k''</sup>‖<sub>∞</sub>/(‖''C''<sup>−1</sup>‖<sub>∞</sub>‖''C''‖<sub>∞</sub>), so ‖''A''<sup>''k''</sup>‖<sub>∞</sub> also tends to infinity. The resulting contradiction implies that there are no Jordan blocks for the corresponding eigenvalues.

Combining the two claims above reveals that the Perron–Frobenius eigenvalue ''r'' is a simple root of the characteristic polynomial. In the case of non-primitive matrices there exist other eigenvalues with the same absolute value as ''r''. The same claim is true for them, but requires more work.
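The divergence of ‖''J''<sup>''k''</sup>‖<sub>∞</sub>, contrasted with the boundedness of ‖(''A''/''r'')<sup>''k''</sup>‖<sub>∞</sub>, can be observed numerically; the following Python sketch uses an arbitrary positive example matrix.

<syntaxhighlight lang="python">
import numpy as np

J = np.array([[1.0, 1.0], [0.0, 1.0]])   # Jordan block with lambda = 1
A = np.array([[2.0, 1.0], [1.0, 3.0]])   # arbitrary positive matrix
r = max(abs(np.linalg.eigvals(A)))       # its Perron root

for k in (1, 10, 100):
    Jk = np.linalg.matrix_power(J, k)
    Mk = np.linalg.matrix_power(A / r, k)
    # infinity norm = maximum absolute row sum
    print(k, np.linalg.norm(Jk, np.inf), np.linalg.norm(Mk, np.inf))
# ||J^k|| grows like k + 1, while ||(A/r)^k|| stays bounded.
</syntaxhighlight>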
===No other non-negative eigenvectors===

Given a positive (or more generally irreducible non-negative) matrix ''A'', the Perron–Frobenius eigenvector is the only (up to multiplication by a constant) non-negative eigenvector for ''A''. Other eigenvectors must contain negative or complex components: eigenvectors for different eigenvalues are orthogonal in some sense, and two positive eigenvectors cannot be orthogonal, so they would have to correspond to the same eigenvalue; but the eigenspace for the Perron–Frobenius eigenvalue is one-dimensional.

More formally, assume there exists an eigenpair (λ, ''y'') for ''A'' such that the vector ''y'' is positive, and let (''r'', ''x'') be the left Perron–Frobenius eigenpair for ''A'' (i.e. ''x'' is an eigenvector for ''A<sup>T</sup>''). Then ''rx''<sup>''T''</sup>''y'' = (''x''<sup>''T''</sup>''A'')''y'' = ''x''<sup>''T''</sup>(''Ay'') = λ''x''<sup>''T''</sup>''y''; since ''x''<sup>''T''</sup>''y'' > 0, one has ''r'' = λ. Since the eigenspace for the Perron–Frobenius eigenvalue ''r'' is one-dimensional, the non-negative eigenvector ''y'' is a multiple of the Perron–Frobenius one.<ref>{{harvnb|Meyer|2000|pp=[http://www.matrixanalysis.com/Chapter8.pdf chapter 8 claim 8.2.10 page 666] {{cite web |url=http://www.matrixanalysis.com/Chapter8.pdf |title=Archived copy |access-date=2010-03-07 |url-status=dead |archive-url=https://web.archive.org/web/20100307021652/http://www.matrixanalysis.com/Chapter8.pdf |archive-date=March 7, 2010 }}}}</ref>

===Collatz–Wielandt formula===

Given a positive (or more generally irreducible non-negative) matrix ''A'', one defines the function ''f'' on the set of all non-negative non-zero vectors ''x'' by letting ''f''(''x'') be the minimum value of [''Ax'']<sub>''i''</sub> / ''x''<sub>''i''</sub> taken over all those ''i'' such that ''x<sub>i</sub>'' ≠ 0. Then ''f'' is a real-valued function whose [[maximum]] is the Perron–Frobenius eigenvalue ''r''.

For the proof, denote the maximum of ''f'' by ''R''; the proof requires showing that ''R'' = ''r''. Inserting the Perron–Frobenius eigenvector ''v'' into ''f'', we obtain ''f''(''v'') = ''r'' and conclude ''r'' ≤ ''R''. For the opposite inequality, consider an arbitrary non-negative vector ''x'' and let ξ = ''f''(''x''). The definition of ''f'' gives 0 ≤ ξ''x'' ≤ ''Ax'' (componentwise). Now use the positive left eigenvector ''w'' for ''A'' for the Perron–Frobenius eigenvalue ''r''; then ξ''w''<sup>''T''</sup>''x'' = ''w''<sup>''T''</sup>(ξ''x'') ≤ ''w''<sup>''T''</sup>(''Ax'') = (''w''<sup>''T''</sup>''A'')''x'' = ''rw''<sup>''T''</sup>''x''. Hence ''f''(''x'') = ξ ≤ ''r'', which implies ''R'' ≤ ''r''.<ref>{{harvnb|Meyer|2000|pp=[http://www.matrixanalysis.com/Chapter8.pdf chapter 8 page 666] {{cite web |url=http://www.matrixanalysis.com/Chapter8.pdf |title=Archived copy |access-date=2010-03-07 |url-status=dead |archive-url=https://web.archive.org/web/20100307021652/http://www.matrixanalysis.com/Chapter8.pdf |archive-date=March 7, 2010 }}}}</ref>
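The formula is easy to test numerically; the following Python sketch (with an arbitrary positive matrix and random sample vectors, both illustrative choices) checks that ''f'' never exceeds ''r'' and attains it at the Perron eigenvector.

<syntaxhighlight lang="python">
import numpy as np

def f(A, x):
    """Collatz–Wielandt function: min of (Ax)_i / x_i over the support of x."""
    Ax = A @ x
    support = x > 0
    return np.min(Ax[support] / x[support])

A = np.array([[2.0, 1.0], [1.0, 3.0]])    # arbitrary positive matrix
r = max(abs(np.linalg.eigvals(A)))        # Perron root

rng = np.random.default_rng(0)
samples = [f(A, rng.random(2)) for _ in range(1000)]
print(max(samples) <= r + 1e-9)           # True: f(x) <= r everywhere

eigvals, V = np.linalg.eig(A)
v = np.abs(V[:, np.argmax(abs(eigvals))]) # Perron eigenvector (positive)
print(np.isclose(f(A, v), r))             # True: the maximum is attained at v
</syntaxhighlight>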
===Perron projection as a limit: ''A''<sup>''k''</sup>/''r''<sup>''k''</sup>===

Let ''A'' be a positive (or more generally, primitive) matrix, and let ''r'' be its Perron–Frobenius eigenvalue.
# The limit of ''A<sup>k</sup>/r<sup>k</sup>'' for ''k'' → ∞ exists; denote it by ''P''.
# ''P'' is a [[Projection (linear algebra)|projection operator]]: ''P''<sup>2</sup> = ''P'', which commutes with ''A'': ''AP'' = ''PA''.
# The image of ''P'' is one-dimensional and spanned by the Perron–Frobenius eigenvector ''v'' (respectively for ''P<sup>T</sup>'' — by the Perron–Frobenius eigenvector ''w'' for ''A<sup>T</sup>'').
# ''P'' = ''vw''<sup>''T''</sup>, where ''v'', ''w'' are normalized such that ''w''<sup>''T''</sup>''v'' = 1.
# Hence ''P'' is a positive operator.

Hence ''P'' is a [[spectral projection]] for the Perron–Frobenius eigenvalue ''r'', and is called the Perron projection. The above assertion is not true for general non-negative irreducible matrices. Actually the claims above (except claim 5) are valid for any matrix ''M'' possessing an eigenvalue ''r'' which is strictly greater than the other eigenvalues in absolute value and is a simple root of the characteristic [[polynomial]]. (These requirements hold for primitive matrices as above.)

If ''M'' is diagonalizable, then ''M'' is conjugate to a diagonal matrix with eigenvalues ''r''<sub>1</sub>, ..., ''r''<sub>''n''</sub> on the diagonal (denote ''r''<sub>1</sub> = ''r''). The matrix ''M''<sup>''k''</sup>/''r''<sup>''k''</sup> will be conjugate to the diagonal matrix with entries 1, (''r''<sub>2</sub>/''r'')<sup>''k''</sup>, ..., (''r''<sub>''n''</sub>/''r'')<sup>''k''</sup>, which tends to the diagonal matrix diag(1, 0, 0, ..., 0) for ''k'' → ∞, so the limit exists. The same method works for general ''M'' (without assuming that ''M'' is diagonalizable).

The projection and commutativity properties are elementary corollaries of the definition: ''MM''<sup>''k''</sup>/''r''<sup>''k''</sup> = ''M''<sup>''k''</sup>/''r''<sup>''k''</sup> ''M''; ''P''<sup>2</sup> = lim ''M''<sup>2''k''</sup>/''r''<sup>2''k''</sup> = ''P''. The third fact is also elementary: ''M''(''Pu'') = ''M'' lim ''M''<sup>''k''</sup>/''r''<sup>''k''</sup>''u'' = lim ''rM''<sup>''k''+1</sup>/''r''<sup>''k''+1</sup>''u'' = ''r''(''Pu''), so the image of ''P'' lies in the ''r''-eigenspace for ''M'', which is one-dimensional by the assumptions.

Denote by ''v'' the ''r''-eigenvector for ''M'' (and by ''w'' the one for ''M<sup>T</sup>''). The columns of ''P'' are multiples of ''v'', because the image of ''P'' is spanned by it; respectively, the rows of ''P'' are multiples of ''w''<sup>''T''</sup>. So ''P'' takes the form ''a''(''vw''<sup>''T''</sup>) for some scalar ''a'', and hence its trace equals ''a''(''w''<sup>''T''</sup>''v''). The trace of a projection equals the dimension of its image; it was proved above that the image is not more than one-dimensional, and from the definition one sees that ''P'' acts identically on the ''r''-eigenvector for ''M'', so the image is exactly one-dimensional. Choosing the normalization ''w''<sup>''T''</sup>''v'' = 1 then implies ''P'' = ''vw''<sup>''T''</sup>.
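A minimal Python sketch of claims 1–4, using an arbitrary non-symmetric positive matrix so that the left and right eigenvectors differ:

<syntaxhighlight lang="python">
import numpy as np

A = np.array([[2.0, 1.0], [4.0, 3.0]])   # arbitrary positive matrix

# Right and left Perron eigenvectors, made positive.
eigvals, V = np.linalg.eig(A)
r = eigvals[np.argmax(abs(eigvals))].real
v = np.abs(V[:, np.argmax(abs(eigvals))].real)
wvals, W = np.linalg.eig(A.T)
w = np.abs(W[:, np.argmax(abs(wvals))].real)

v = v / (w @ v)                          # normalize so that w^T v = 1
P = np.outer(v, w)                       # claim 4: P = v w^T

k = 50                                   # claim 1: A^k / r^k tends to P
print(np.allclose(np.linalg.matrix_power(A, k) / r**k, P))   # True
print(np.allclose(P @ P, P))                                 # claim 2: P^2 = P
print(np.allclose(A @ P, P @ A))                             # claim 2: AP = PA
</syntaxhighlight>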
===Inequalities for Perron–Frobenius eigenvalue===

For any non-negative matrix ''A'' its Perron–Frobenius eigenvalue ''r'' satisfies the inequality:

:<math> r \; \le \; \max_i \sum_j a_{ij}.</math>

This is not specific to non-negative matrices: for any matrix ''A'' with an eigenvalue <math>\scriptstyle\lambda</math> it is true that <math>\scriptstyle |\lambda| \; \le \; \max_i \sum_j |a_{ij}|</math>. This is an immediate corollary of the [[Gershgorin circle theorem]]. However another proof is more direct: any [[Matrix norm#Induced norm|matrix induced norm]] satisfies the inequality <math>\scriptstyle\|A\| \ge |\lambda|</math> for any eigenvalue <math>\scriptstyle\lambda</math>, because if <math>\scriptstyle x</math> is a corresponding eigenvector then <math>\scriptstyle\|A\| \ge |Ax|/|x| = |\lambda x|/|x| = |\lambda|</math>. The [[Matrix norm#Induced norm|infinity norm]] of a matrix is the maximum of row sums: <math>\scriptstyle \left \| A \right \| _\infty = \max \limits _{1 \leq i \leq m} \sum _{j=1} ^n | a_{ij} |. </math> Hence the desired inequality is exactly <math>\scriptstyle\|A\|_\infty \ge |\lambda|</math> applied to the non-negative matrix ''A''.

Another inequality is:

:<math>\min_i \sum_j a_{ij} \; \le \; r .</math>

This fact is specific to non-negative matrices; for general matrices there is nothing similar. Given that ''A'' is positive (not just non-negative), there exists a positive eigenvector ''w'' such that ''Aw'' = ''rw'' and the smallest component of ''w'' (say ''w<sub>i</sub>'') is 1. Then ''r'' = (''Aw'')<sub>''i''</sub> ≥ the sum of the entries in row ''i'' of ''A''. Thus the minimum row sum gives a lower bound for ''r'', and this observation can be extended to all non-negative matrices by continuity. Another way to argue it is via the [[Lothar Collatz|Collatz]]–Wielandt formula: taking the vector ''x'' = (1, 1, ..., 1) immediately yields the inequality.
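Both bounds are easy to check numerically; below is a small Python experiment over random non-negative matrices (the size and sample count are arbitrary choices):

<syntaxhighlight lang="python">
import numpy as np

# Check min row sum <= r <= max row sum on random non-negative matrices.
rng = np.random.default_rng(0)
for _ in range(1000):
    A = rng.random((4, 4))                 # non-negative entries
    r = max(abs(np.linalg.eigvals(A)))     # Perron root = spectral radius
    row_sums = A.sum(axis=1)
    assert row_sums.min() - 1e-9 <= r <= row_sums.max() + 1e-9
print("bounds hold")
</syntaxhighlight>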
===Further proofs===

====Perron projection====

The proof now proceeds using [[Eigendecomposition of a matrix|spectral decomposition]]. The trick here is to split the Perron root from the other eigenvalues. The spectral projection associated with the Perron root is called the Perron projection, and it enjoys the following property: the Perron projection of an irreducible non-negative square matrix is a positive matrix. Perron's findings and also (1)–(5) of the theorem are corollaries of this result. The key point is that a positive projection always has rank one. This means that if ''A'' is an irreducible non-negative square matrix then the algebraic and geometric multiplicities of its Perron root are both one. Also if ''P'' is its Perron projection then ''AP'' = ''PA'' = ρ(''A'')''P'', so every column of ''P'' is a positive right eigenvector of ''A'' and every row is a positive left eigenvector. Moreover, if ''Ax'' = λ''x'' then ''PAx'' = λ''Px'' = ρ(''A'')''Px'', which means ''Px'' = 0 if λ ≠ ρ(''A''). Thus the only positive eigenvectors are those associated with ρ(''A'').

If ''A'' is a primitive matrix with ρ(''A'') = 1 then it can be decomposed as ''P'' ⊕ (1 − ''P'')''A'' so that ''A<sup>n</sup>'' = ''P'' + (1 − ''P'')''A''<sup>''n''</sup>. As ''n'' increases the second of these terms decays to zero, leaving ''P'' as the limit of ''A<sup>n</sup>'' as ''n'' → ∞. The power method is a convenient way to compute the Perron projection of a primitive matrix: if ''v'' and ''w'' are the positive row and column vectors that it generates, then the Perron projection is just ''wv''/''vw''. The spectral projections aren't neatly blocked as in the Jordan form: here they are overlaid and each generally has complex entries extending to all four corners of the square matrix. Nevertheless they retain their mutual orthogonality, which is what facilitates the decomposition.

====Peripheral projection====

The analysis when ''A'' is irreducible and non-negative is broadly similar. The Perron projection is still positive, but there may now be other eigenvalues of modulus ρ(''A'') that rule out use of the power method and prevent the powers of (1 − ''P'')''A'' decaying as in the primitive case whenever ρ(''A'') = 1. So we consider the '''peripheral projection''', which is the spectral projection of ''A'' corresponding to all the eigenvalues that have modulus ρ(''A''). It may then be shown that the peripheral projection of an irreducible non-negative square matrix is a non-negative matrix with a positive diagonal.
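As a numerical sketch, the peripheral projection of a hypothetical period-2 example (an irreducible non-negative matrix scaled so that ρ(''A'') = 1, chosen only for illustration) can be computed by summing the spectral projections over the peripheral eigenvalues:

<syntaxhighlight lang="python">
import numpy as np

# Irreducible non-negative matrix of period h = 2, scaled so rho(A) = 1.
A = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]]) / np.sqrt(2)

# For a diagonalizable M = V D V^{-1}, the spectral projection at the i-th
# eigenvalue is the outer product of column i of V with row i of V^{-1}.
eigvals, V = np.linalg.eig(A)
Vinv = np.linalg.inv(V)
P = sum(np.outer(V[:, i], Vinv[i, :])
        for i, lam in enumerate(eigvals) if np.isclose(abs(lam), 1)).real

print(np.round(P, 6))            # non-negative, with a positive diagonal
print(np.allclose(P @ P, P))     # True: P is a projection
</syntaxhighlight>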
====Cyclicity====

Suppose in addition that ρ(''A'') = 1 and ''A'' has ''h'' eigenvalues on the unit circle. If ''P'' is the peripheral projection then the matrix ''R'' = ''AP'' = ''PA'' is non-negative and irreducible, ''R<sup>h</sup>'' = ''P'', and the cyclic group ''P'', ''R'', ''R''<sup>2</sup>, ..., ''R''<sup>''h''−1</sup> represents the harmonics of ''A''. The spectral projection of ''A'' at the eigenvalue λ on the unit circle is given by the formula <math>\scriptstyle h^{-1}\sum^h_{k=1}\lambda^{-k}R^k</math>. All of these projections (including the Perron projection) have the same positive diagonal; moreover, choosing any one of them and then taking the modulus of every entry invariably yields the Perron projection. Some donkey work is still needed in order to establish the cyclic properties (6)–(8), but it's essentially just a matter of turning the handle. The spectral decomposition of ''A'' is given by ''A'' = ''R'' ⊕ (1 − ''P'')''A'', so the difference between ''A<sup>n</sup>'' and ''R<sup>n</sup>'' is ''A<sup>n</sup>'' − ''R<sup>n</sup>'' = (1 − ''P'')''A''<sup>''n''</sup>, representing the transients of ''A<sup>n</sup>'', which eventually decay to zero. ''P'' may be computed as the limit of ''A<sup>nh</sup>'' as ''n'' → ∞.
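Continuing the hypothetical period-2 example from the previous sketch, the relations ''R''<sup>''h''</sup> = ''P'' and ''P'' = lim ''A''<sup>''nh''</sup> can be checked directly:

<syntaxhighlight lang="python">
import numpy as np

# Same period-2 example with rho(A) = 1.
A = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]]) / np.sqrt(2)
h = 2

P = np.linalg.matrix_power(A, 20 * h)    # P as the limit of A^{nh}
R = A @ P                                # R = AP

print(np.allclose(np.linalg.matrix_power(R, h), P))   # True: R^h = P
print(np.allclose(A @ P, P @ A))                      # True: AP = PA
print(np.round(P, 6))                    # matches the peripheral projection
</syntaxhighlight>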