Editing Permutation matrix

{{Short description|Matrix with exactly one 1 per row and column}}
{{Use American English|date = January 2019}}

In [[mathematics]], particularly in [[Matrix (mathematics)|matrix theory]], a '''permutation matrix''' is a square [[binary matrix]] that has exactly one entry of 1 in each row and each column with all other entries 0.<ref name="Artin Algebra">{{cite book |author1-link=Michael Artin |last1=Artin |first1=Michael |date=1991 |title=Algebra  |publisher=Prentice Hall |pages=24–26,118,259,322 |isbn=0-13-004763-5 |oclc=24364036}}</ref>{{rp|page=26}} An {{math|''n'' × ''n''}} permutation matrix can represent a [[permutation]] of {{mvar|n}} elements. Pre-[[matrix multiplication|multiplying]] an {{mvar|n}}-row matrix {{mvar|M}} by a permutation matrix {{mvar|P}}, forming {{mvar|PM}}, results in permuting the rows of {{mvar|M}}, while post-multiplying an {{mvar|n}}-column matrix {{mvar|M}}, forming {{mvar|MP}}, permutes the columns of {{mvar|M}}.

Every permutation matrix ''P'' is [[orthogonal matrix|orthogonal]], with its [[invertible matrix|inverse]] equal to its [[transpose]]: <math>P^{-1}=P^\mathsf{T}</math>.<ref name="Artin Algebra" />{{rp|page=26}}  Indeed, permutation matrices can be [[Characterization (mathematics)|characterized]] as the orthogonal matrices whose entries are all [[non-negative]].<ref>{{cite journal |last1=Zavlanos |first1=Michael M. |last2=Pappas |first2=George J. |date=November 2008 |title=A dynamical systems approach to weighted graph matching |url=https://www.sciencedirect.com/science/article/abs/pii/S0005109808002616 |journal=Automatica |volume=44 |issue=11 |pages=2817–2824 |doi=10.1016/j.automatica.2008.04.009 |s2cid=834305 |access-date=21 August 2022 |quote=Let <math>O_n</math> denote the set of <math>n \times n</math> orthogonal matrices and <math>N_n</math> denote the set of <math>n \times n</math> element-wise non-negative matrices. Then, <math>P_n = O_n \cap N_n</math>, where <math>P_n</math> is the set of <math>n \times n</math> permutation matrices.|citeseerx=10.1.1.128.6870 }}</ref>
== The two permutation/matrix correspondences ==

There are two natural one-to-one correspondences between permutations and permutation matrices, one of which works along the rows of the matrix, the other along its columns.  Here is an example, starting with a permutation {{pi}} in two-line form at the upper left:
:<math>\begin{matrix} 
\pi\colon\begin{pmatrix}1&2&3&4\\3&2&4&1\end{pmatrix}
& \longleftrightarrow &
R_\pi\colon\begin{pmatrix}
0&0&1&0\\
0&1&0&0\\
0&0&0&1\\
1&0&0&0\end{pmatrix}\\[5pt]
\Big\updownarrow && \Big\updownarrow\\[5pt]
C_\pi\colon\begin{pmatrix}
0&0&0&1\\
0&1&0&0\\
1&0&0&0\\
0&0&1&0\end{pmatrix}
& \longleftrightarrow &
\pi^{-1}\colon\begin{pmatrix}1&2&3&4\\4&2&1&3\end{pmatrix}\end{matrix}</math>
The row-based correspondence takes the permutation {{pi}} to the matrix <math>R_\pi</math> at the upper right.  The first row of <math>R_\pi</math> has its 1 in the third column because <math>\pi(1)=3</math>.  More generally, we have <math>R_\pi=(r_{ij})</math> where <math>r_{ij}=1</math> when <math>j=\pi(i)</math> and <math>r_{ij}=0</math> otherwise.

The column-based correspondence takes {{pi}} to the matrix <math>C_\pi</math> at the lower left.  The first column of <math>C_\pi</math> has its 1 in the third row because <math>\pi(1)=3</math>.  More generally, we have <math>C_\pi=(c_{ij})</math> where <math>c_{ij}</math> is 1 when <math>i=\pi(j)</math> and 0 otherwise.  Since the two recipes differ only by swapping ''i'' with ''j'', the matrix <math>C_\pi</math> is the transpose of <math>R_\pi</math>; and, since <math>R_\pi</math> is a permutation matrix, we have <math>C_\pi=R_\pi^\mathsf{T}=R_\pi^{-1}</math>.  Tracing the other two sides of the big square, we have <math>R_{\pi^{-1}}=C_\pi=R_\pi^{-1}</math> and <math>C_{\pi^{-1}}=R_\pi</math>.<ref>This terminology is not standard. Most authors use just one of the two correspondences, choosing which to be consistent with their other conventions.  For example, Artin uses the column-based correspondence.  We have here invented two names in order to discuss both options.</ref>

== Permutation matrices permute rows or columns ==
Multiplying a matrix ''M'' by either <math>R_\pi</math> or <math>C_\pi</math> on either the left or the right will permute either the rows or columns of ''M'' by either {{pi}} or {{pi}}<sup>−1</sup>.  The details are a bit tricky.

To begin with, when we permute the entries of a vector <math>(v_1,\ldots,v_n)</math> by some permutation {{pi}}, we move the <math>i^\text{th}</math> entry <math>v_i</math> of the input vector into the <math>\pi(i)^\text{th}</math> slot of the output vector.  Which entry then ends up in, say, the first slot of the output?  Answer: The entry <math>v_j</math> for which <math>\pi(j)=1</math>, and hence <math>j=\pi^{-1}(1)</math>.  Arguing similarly about each of the slots, we find that the output vector is
:<math> \big(v_{\pi^{-1}(1)},v_{\pi^{-1}(2)},\ldots,v_{\pi^{-1}(n)}\big),</math>
even though we are permuting by <math>\pi</math>, not by <math>\pi^{-1}</math>. Thus, in order to permute the entries by <math>\pi</math>, we must permute the indices by <math>\pi^{-1}</math>.<ref name="Artin Algebra" />{{rp|page=25}}  (Permuting the entries by <math>\pi</math> is sometimes called taking the ''alibi viewpoint'', while permuting the indices by <math>\pi</math> would take the ''alias viewpoint''.<ref>{{cite book |author1-link=John H. Conway |last1=Conway |first1=John H. |last2=Burgiel |first2=Heidi |author3-link=Chaim Goodman-Strauss |last3=Goodman-Strauss |first3=Chaim |date=2008 |title=The Symmetries of Things |publisher=A K Peters/CRC Press |doi=10.1201/b21368 |isbn=978-0-429-06306-0 |oclc=946786108 |page=179 |quote=A permutation—say, of the names of a number of people—can be thought of as moving either the names or the people.  The alias viewpoint regards the permutation as assigning a new name or ''alias'' to each person (from the Latin ''alias'' = otherwise).  Alternatively, from the alibi viewoint we move the people to the places corresponding to their new names (from the Latin ''alibi'' = in another place.) }}</ref>)

Now, suppose that we pre-multiply some ''n''-row matrix <math>M=(m_{i,j})</math> by the permutation matrix <math>C_\pi</math>.  By the rule for [[matrix multiplication]], the <math>(i,j)^\text{th}</math> entry in the product <math>C_\pi M</math> is
:<math display=block>\sum_{k=1}^n c_{i,k}m_{k,j},</math>
where <math>c_{i,k}</math> is 0 except when <math>i=\pi(k)</math>, when it is 1.  Thus, the only term in the sum that survives is the term in which <math>k=\pi^{-1}(i)</math>, and the sum reduces to <math>m_{\pi^{-1}(i),j}</math>.  Since we have permuted the row index by <math>\pi^{-1}</math>, we have permuted the rows of ''M'' themselves by {{pi}}.<ref name="Artin Algebra" />{{rp|page=25}}  A similar argument shows that post-multiplying an ''n''-column matrix ''M'' by <math>R_\pi</math> permutes its columns by {{pi}}.

The other two options are pre-multiplying by <math>R_\pi</math> or post-multiplying by <math>C_\pi</math>, and they permute the rows or columns respectively by {{pi}}<sup>−1</sup>, instead of by {{pi}}.

== The transpose is also the inverse ==
A related argument proves that, as we claimed above, the transpose of any permutation matrix ''P'' also acts as its inverse, which implies that ''P'' is invertible. (Artin leaves that proof as an exercise,<ref name="Artin Algebra" />{{rp|page=26}} which we here solve.)  If <math>P=(p_{i,j})</math>, then the <math>(i,j)^\text{th}</math> entry of its transpose <math>P^\mathsf{T}</math> is <math>p_{j,i}</math>.  The <math>(i,j)^\text{th}</math> entry of the product <math>PP^\mathsf{T}</math> is then
:<math display=block>\sum_{k=1}^n p_{i,k}p_{j,k}.</math>
Whenever <math>i\ne j</math>, the <math>k^\text{th}</math> term in this sum is the product of two different entries in the <math>k^\text{th}</math> column of ''P''; so all terms are 0, and the sum is 0.  When <math>i=j</math>, we are summing the squares of the entries in the <math>i^\text{th}</math> row of ''P'', so the sum is 1.  The product <math>PP^\mathsf{T}</math> is thus the identity matrix.  A symmetric argument shows the same for <math>P^\mathsf{T}P</math>, implying that ''P'' is invertible with <math>P^{-1}=P^\mathsf{T}</math>.

== Multiplying permutation matrices ==
Given two permutations of {{math|''n''}} elements {{sigma}} and {{tau}}, the product of the corresponding column-based permutation matrices {{math|''C''<sub>σ</sub>}} and {{math|''C''<sub>τ</sub>}} is given,<ref name="Artin Algebra" />{{rp|page=25}} as you might expect, by
<math display=block>C_{\sigma} C_{\tau} = C_{\sigma\,\circ\,\tau}, </math>
where the composed permutation <math>\sigma\circ\tau</math> applies first {{tau}} and then {{sigma}}, working from right to left:
<math display=block>(\sigma\circ\tau) (k) = \sigma \left(\tau (k) \right).</math>
This follows because pre-multiplying some matrix by {{math|''C''<sub>τ</sub>}} and then pre-multiplying the resulting product by {{math|''C''<sub>σ</sub>}} gives the same result as pre-multiplying just once by the combined <math>C_{\sigma\,\circ\,\tau}</math>.

For the row-based matrices, there is a twist:  The product of {{math|''R''<sub>σ</sub>}} and {{math|''R''<sub>τ</sub>}} is given by
:<math>R_{\sigma} R_{\tau} = R_{\tau\,\circ\,\sigma}, </math>
with {{sigma}} applied before {{tau}} in the composed permutation.  This happens because we must post-multiply to avoid inversions under the row-based option, so we would post-multiply first by {{math|''R''<sub>σ</sub>}} and then by {{math|''R''<sub>τ</sub>}}.

Some people, when applying a function to an argument, write the function after the argument ([[Reverse Polish notation|postfix notation]]), rather than before it.  When doing linear algebra, they work with linear spaces of row vectors, and they apply a linear map to an argument by using the map's matrix to post-multiply the argument's row vector.  They often use a left-to-right composition operator, which we here denote using a semicolon; so the composition <math> \sigma\,;\,\tau</math> is defined either by
:<math>(\sigma\,;\,\tau)(k) = \tau\left(\sigma(k)\right),</math>
or, more elegantly, by
:<math>(k)(\sigma\,;\,\tau) = \left((k)\sigma \right)\tau,</math>
with {{sigma}} applied first.  That notation gives us a simpler rule for multiplying row-based permutation matrices:
:<math>R_{\sigma} R_{\tau} = R_{\sigma\,;\,\tau}. </math>

== Matrix group ==
When {{pi}} is the identity permutation, which has <math>\pi(i)=i</math> for all ''i'', both {{math|''C''<sub>{{pi}}</sub>}} and {{math|''R''<sub>{{pi}}</sub>}} are the [[identity matrix]].

There are {{math|''n''!}} permutation matrices, since there are {{math|''n''!}} permutations and the map <math>C\colon\pi\mapsto C_\pi</math> is a one-to-one correspondence between permutations and permutation matrices.  (The map <math>R</math> is another such correspondence.)  By the formulas above, those {{math|''n'' × ''n''}} permutation matrices form a [[Group (mathematics)|group]] of order {{math|''n''!}} under matrix multiplication, with the identity matrix as its [[identity element]], a group that we denote <math>\mathcal{P}_n</math>.  The group <math>\mathcal{P}_n</math> is a subgroup of the [[general linear group]] <math>GL_n(\mathbb{R})</math> of invertible {{math|''n'' × ''n''}} matrices of real numbers.  Indeed, for any [[Field (mathematics)|field]] ''F'', the group  <math>\mathcal{P}_n</math> is also a subgroup of the group <math>GL_n(F)</math>, where the matrix entries belong to ''F''. (Every field contains 0 and 1 with <math>0+0=0,</math> <math>0+1=1,</math> <math>0*0=0,</math> <math>0*1=0,</math> and <math>1*1=1;</math> and that's all we need to multiply permutation matrices.  Different fields disagree about whether <math>1+1=0</math>, but that sum doesn't arise.)

Let <math>S_n^\leftarrow</math> denote the [[symmetric group]], or [[permutation group|group of permutations]], on {1,2,...,{{math|''n''}}} where the group operation is the standard, right-to-left composition "<math>\circ</math>"; and let <math>S_n^\rightarrow</math> denote the [[opposite group]], which uses the left-to-right composition "<math>\,;\,</math>".  The map <math>C\colon S_n^\leftarrow\to GL_n(\mathbb{R})</math> that takes {{pi}} to its column-based matrix <math>C_\pi</math> is a [[faithful representation]], and similarly for the map <math>R\colon S_n^\rightarrow\to GL_n(\mathbb{R})</math> that takes {{pi}} to <math>R_\pi</math>.

==Doubly stochastic matrices==

Every permutation matrix is [[doubly stochastic matrix|doubly stochastic]].  The set of all doubly stochastic matrices is called the [[Birkhoff polytope]], and the permutation matrices play a special role in that polytope.  The [[Birkhoff–von Neumann theorem]] says that every doubly stochastic real matrix is a [[convex combination]] of permutation matrices of the same order, with the permutation matrices being precisely the [[extreme point]]s (the vertices) of the Birkhoff polytope. The Birkhoff polytope is thus the [[convex hull]] of the permutation matrices.<ref name=Bru19>{{harvnb|Brualdi|2006|p=19}}</ref>

==Linear-algebraic properties==

Just as each permutation is associated with two permutation matrices, each permutation matrix is associated with two permutations, as we can see by relabeling the example in the big square above starting with the matrix ''P'' at the upper right:
:<math>\begin{matrix} 
\rho_P\colon\begin{pmatrix}1&2&3&4\\3&2&4&1\end{pmatrix}
& \longleftrightarrow &
P\colon\begin{pmatrix}
0&0&1&0\\
0&1&0&0\\
0&0&0&1\\
1&0&0&0\end{pmatrix}\\[5pt]
\Big\updownarrow && \Big\updownarrow\\[5pt]
P^{-1}\colon\begin{pmatrix}
0&0&0&1\\
0&1&0&0\\
1&0&0&0\\
0&0&1&0\end{pmatrix}
& \longleftrightarrow &
\kappa_P\colon\begin{pmatrix}1&2&3&4\\4&2&1&3\end{pmatrix}\end{matrix}</math>
So we are here denoting the inverse of ''C'' as <math>\kappa</math> and the inverse of ''R'' as <math>\rho</math>.  We can then compute the linear-algebraic properties of ''P'' from some combinatorial properties that are shared by the two permutations <math>\kappa_P</math> and <math>\rho_P=\kappa_P^{-1}</math>.

A point is [[Fixed point (mathematics)|fixed]] by <math>\kappa_P</math> just when it is fixed by <math>\rho_P</math>, and the [[Trace (linear algebra)|trace]] of ''P'' is the number of such shared fixed points.<ref name="Artin Algebra" />{{rp|page=322}}  If the integer ''k'' is one of them, then the [[standard basis]] vector {{math|'''''e'''''<sub>''k''</sub>}} is an [[eigenvector]] of ''P''.<ref name="Artin Algebra" />{{rp|page=118}}

To calculate the complex [[eigenvalue]]s of ''P'', write the permutation <math>\kappa_P</math> as a composition of [[cyclic permutation|disjoint cycles]], say <math>\kappa_P= c_{1}c_{2} \cdots c_{t}</math>.  (Permutations of disjoint subsets commute, so it doesn't matter here whether we are composing right-to-left or left-to-right.) For <math>1 \le i \le t</math>, let the length of the cycle <math>c_i</math> be <math>\ell_i</math>, and let <math>L_{i}</math> be the set of complex solutions of <math>x^{\ell_{i}}=1</math>, those solutions being the <math>\ell_i^{\,\text{th}}</math> [[root of unity|roots of unity]]. The [[multiset]] union of the <math>L_{i}</math> is then the multiset of eigenvalues of ''P''.  Since writing <math>\rho_P</math> as a product of cycles would give the same number of cycles of the same lengths, analyzing <math>\rho_p</math> would give the same result.  The [[Eigenvalues and eigenvectors#Eigenvalues and eigenvectors of matrices|multiplicity]] of any eigenvalue ''v'' is the number of ''i'' for which <math>L_i</math> contains ''v''.<ref name=J_Najnudel2010_4>{{harvnb|Najnudel|Nikeghbali|2013|p=4}}</ref> (Since any permutation matrix is [[Normal matrices|normal]] and any normal matrix is [[diagonalizable matrix|diagonalizable]] over the complex numbers,<ref name="Artin Algebra" />{{rp|page=259}} the algebraic and geometric multiplicities of an eigenvalue ''v'' are the same.)
	
From [[group theory]] we know that any permutation may be written as a composition of [[transposition (mathematics)|transposition]]s. Therefore, any permutation matrix factors as a product of row-switching [[elementary matrix|elementary matrices]], each of which has [[determinant]] &minus;1. Thus, the determinant of the permutation matrix ''P'' is the [[parity of a permutation|sign]] of the permutation <math>\kappa_P</math>, which is also the sign of <math>\rho_P</math>.

== Restricted forms ==

* [[Costas array]], a permutation matrix in which the displacement vectors between the entries are all distinct
* [[Eight queens puzzle|n-queens puzzle]], a permutation matrix in which there is at most one entry in each diagonal and antidiagonal

== See also ==
* [[Alternating sign matrix]]
* [[Exchange matrix]]
* [[Generalized permutation matrix]]
*[[Rook polynomial]]
*[[Permanent (mathematics)|Permanent]]

==References==
{{Reflist}}
{{refbegin}}
* {{cite book | last=Brualdi | first=Richard A. | title=Combinatorial matrix classes | series=Encyclopedia of Mathematics and Its Applications | volume=108 | location=Cambridge | publisher=[[Cambridge University Press]] | year=2006 | isbn=0-521-86565-4 | zbl=1106.05001 | url-access=registration | url=https://archive.org/details/combinatorialmat0000brua }}
*{{Citation
| first1=Joseph
| last1=Najnudel
| first2=Ashkan
| last2=Nikeghbali|author2-link=Ashkan Nikeghbali
| title=The Distribution of Eigenvalues of Randomized Permutation Matrices
| orig-year=2010 | arxiv=1005.0402
| bibcode=2010arXiv1005.0402N |date=2013
| journal=Annales de l'Institut Fourier |volume=63 |issue=3 |pages=773–838
| doi=10.5802/aif.2777
| url=http://www.numdam.org/item/AIF_2013__63_3_773_0/
}}
{{refend}}

{{Matrix classes}}
{{Authority control}}

[[Category:Matrices (mathematics)]]
[[Category:Permutations]]
[[Category:Sparse matrices]]