==Definition and properties==

A stochastic matrix describes a [[Markov chain]] {{math|'''''X'''''<sub>''t''</sub>}} over a [[finite set|finite]] [[Probability space|state space]] {{mvar|S}} with [[cardinality]] {{mvar|α}}. If the [[probability]] of moving from {{mvar|i}} to {{mvar|j}} in one time step is {{math|1=Pr(''j''{{!}}''i'') = ''P''<sub>''i'',''j''</sub>}}, the stochastic matrix {{mvar|P}} has {{math|''P''<sub>''i'',''j''</sub>}} as its {{mvar|i}}-th row and {{mvar|j}}-th column element, i.e.,
<math display="block">P=\left[\begin{matrix}
P_{1,1} & P_{1,2} & \dots & P_{1,j} & \dots & P_{1,\alpha} \\
P_{2,1} & P_{2,2} & \dots & P_{2,j} & \dots & P_{2,\alpha} \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
P_{i,1} & P_{i,2} & \dots & P_{i,j} & \dots & P_{i,\alpha} \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
P_{\alpha,1} & P_{\alpha,2} & \dots & P_{\alpha,j} & \dots & P_{\alpha,\alpha}
\end{matrix}\right].</math>

Since the transition probabilities out of each state {{mvar|i}} must sum to 1,
<math display="block">\forall i \in \{1, \ldots, \alpha\},\quad \sum_{j=1}^\alpha P_{i,j}=1,</math>
so this matrix is a right stochastic matrix.

The above elementwise sum across each row {{mvar|i}} of {{mvar|P}} may be written more concisely as {{math|1=''P'''''1''' = '''1'''}}, where {{math|'''1'''}} is the {{mvar|α}}-dimensional column vector of all ones. Using this, it can be seen that the product of two right stochastic matrices {{math|''P''′}} and {{math|''P''′′}} is also right stochastic: {{math|1=''P''′ ''P''′′ '''1''' = ''P''′ (''P''′′ '''1''') = ''P''′ '''1''' = '''1'''}}. In general, the {{mvar|k}}-th power {{math|''P<sup>k</sup>''}} of a right stochastic matrix {{mvar|P}} is also right stochastic. The probability of transitioning from {{mvar|i}} to {{mvar|j}} in two steps is then given by the {{math|(''i'', ''j'')}}-th element of the square of {{mvar|P}}:
<math display="block">\left(P^2\right)_{i,j}.</math>
In general, the probability of transitioning from any state to any other state of a finite Markov chain given by the matrix {{mvar|P}} in {{mvar|k}} steps is given by the corresponding entry of {{math|''P<sup>k</sup>''}}.

An initial probability distribution of states, specifying where the system might be initially and with what probabilities, is given as a [[row vector]]. A ''stationary'' [[probability vector]] {{mvar|'''π'''}} is defined as a distribution, written as a row vector, that does not change under application of the transition matrix; that is, it is a probability distribution on the set {{math|{{(}}1, …, ''α''{{)}}}} which is also a [[left eigenvector]] of the probability matrix, associated with [[eigenvalue]] 1:
<math display="block">\boldsymbol{\pi}P=\boldsymbol{\pi}.</math>
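As a concrete illustration (the numbers here are chosen purely for exposition, not taken from a cited source), consider the right stochastic matrix
<math display="block">P=\begin{bmatrix}0.9 & 0.1\\ 0.5 & 0.5\end{bmatrix},</math>
each of whose rows sums to 1. Its square,
<math display="block">P^2=\begin{bmatrix}0.86 & 0.14\\ 0.70 & 0.30\end{bmatrix},</math>
is again right stochastic, and its {{math|(1, 2)}} entry, 0.14, is the probability of moving from state 1 to state 2 in exactly two steps. The row vector <math>\boldsymbol{\pi}=\left(\tfrac{5}{6}, \tfrac{1}{6}\right)</math> satisfies <math>\boldsymbol{\pi}P=\boldsymbol{\pi}</math>, so it is a stationary probability vector of this chain.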
[[File:Karpelevich regions.svg|thumb|right|Karpelevič regions for ''n'' = 3 and ''n'' = 4.]]
It can be shown that the [[spectral radius]] of any stochastic matrix is one. By the [[Gershgorin circle theorem]], all of the eigenvalues of a stochastic matrix have absolute values less than or equal to one. More precisely, the eigenvalues of <math>n</math>-by-<math>n</math> stochastic matrices are restricted to lie within a subset of the complex unit disk, known as Karpelevič regions.<ref>{{cite journal |last1=Munger |first1=Devon |last2=Nickerson |first2=Andrew |last3=Paparella |first3=Pietro |title=Demystifying the Karpelevič theorem |journal=Linear Algebra and Its Applications |date=2024 |volume=702 |pages=46–62 |doi=10.1016/j.laa.2024.08.006 |arxiv=2309.03849}}</ref> This result was originally obtained by [[Fridrikh Karpelevich]],<ref>{{cite journal |last1=Karpelevič |first1=Fridrikh |title=On the characteristic roots of matrices with nonnegative elements |journal=Izv. Math. |date=1951 |volume=15 |issue=4}}</ref> following a question originally posed by Kolmogorov<ref>{{cite journal |last1=Kolmogorov |first1=Andrei |title=Markov chains with a countable number of possible states |journal=Bull. Mosk. Gos. Univ. Math. Mekh |date=1937 |volume=1 |issue=3 |pages=1–15}}</ref> and partially addressed by [[Nikolay Dmitriyev]] and [[Eugene Dynkin]].<ref>{{cite journal |last1=Dmitriev |first1=Nikolai |last2=Dynkin |first2=Eugene |title=On characteristic roots of stochastic matrices |journal=Izvestiya Rossiiskoi Akademii Nauk. Seriya Matematicheskaya |date=1946 |volume=10 |issue=2 |pages=167–184}}</ref>

Additionally, every right stochastic matrix has an "obvious" column eigenvector associated with the eigenvalue 1: the vector {{math|'''1'''}} used above, whose coordinates are all equal to 1. As the left and right eigenvalues of a square matrix are the same, every stochastic matrix has at least one [[left eigenvector]] associated with the [[eigenvalue]] 1, and the largest absolute value of all its eigenvalues is also 1. Finally, the [[Brouwer fixed-point theorem]] (applied to the compact convex set of all probability distributions on the finite set {{math|{{(}}1, …, ''α''{{)}}}}) implies that there is some left eigenvector which is also a stationary probability vector.

The [[Perron–Frobenius theorem]] likewise ensures that every [[Irreducibility (mathematics)|irreducible]] stochastic matrix has such a stationary vector, and that the largest absolute value of an eigenvalue is always 1. However, that theorem cannot be applied directly to arbitrary stochastic matrices, because they need not be irreducible.

In general, there may be several such vectors. However, for a matrix with strictly positive entries (or, more generally, for an irreducible aperiodic stochastic matrix), this vector is unique and can be computed by observing that for any {{mvar|i}} we have the following limit,
<math display="block">\lim_{k\rightarrow\infty}\left(P^k\right)_{i,j}=\boldsymbol{\pi}_j,</math>
where {{math|'''''π'''''<sub>''j''</sub>}} is the {{mvar|j}}-th element of the row vector {{mvar|'''π'''}}. Among other things, this says that the long-term probability of being in a state {{mvar|j}} is independent of the initial state {{mvar|i}}. That both of these computations give the same stationary vector is a form of an [[ergodic theorem]], which is generally true in a wide variety of [[dissipative dynamical system]]s: the system evolves, over time, to a [[stationary state]].
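For the illustrative matrix above, whose entries are strictly positive, the powers converge to a matrix whose rows are all equal to the stationary vector:
<math display="block">\lim_{k\rightarrow\infty}P^k=\begin{bmatrix}5/6 & 1/6\\ 5/6 & 1/6\end{bmatrix},</math>
so the long-term probability of being in state 1 is 5/6 regardless of whether the chain starts in state 1 or state 2.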
Intuitively, a stochastic matrix represents a Markov chain; the application of the stochastic matrix to a probability distribution redistributes the probability mass of the original distribution while preserving its total mass. If this process is applied repeatedly, the distribution converges to a stationary distribution for the Markov chain.<ref name=":1" />{{Rp|pages=14–17}}<ref name="Kardar2007">{{cite book |first=Mehran |last=Kardar |author-link=Mehran Kardar |title=Statistical Physics of Fields |title-link=Statistical Physics of Fields |year=2007 |publisher=[[Cambridge University Press]] |isbn=978-0-521-87341-3 |oclc=920137477}}</ref>{{Rp|page=116}}

Stochastic matrices and their products form a [[category (mathematics)|category]], which is a subcategory both of the [[category of matrices]] and of the category of [[Markov kernel]]s.
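This repeated application can be checked numerically. The following short Python sketch (an illustration only; it reuses the arbitrary example matrix from above, and the choice of 50 iterations is ad hoc) applies the transition matrix to an initial distribution until it is close to the stationary vector:

<syntaxhighlight lang="python">
import numpy as np

# Right stochastic matrix from the example above: each row sums to 1.
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

# An arbitrary initial probability distribution, written as a row vector.
pi = np.array([1.0, 0.0])

# Repeatedly applying P redistributes the probability mass while
# preserving its total: every iterate still sums to 1.
for _ in range(50):
    pi = pi @ P

print(pi)  # approximately [0.8333 0.1667], i.e. (5/6, 1/6)
</syntaxhighlight>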