In linear algebra, a diagonal matrix is a matrix in which the entries outside the main diagonal are all zero; the term usually refers to square matrices. Elements of the main diagonal can either be zero or nonzero. An example of a 2×2 diagonal matrix is <math>\left[\begin{smallmatrix} 3 & 0 \\ 0 & 2 \end{smallmatrix}\right]</math>, while an example of a 3×3 diagonal matrix is <math>\left[\begin{smallmatrix} 6 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 4 \end{smallmatrix}\right]</math>. An identity matrix of any size, or any multiple of it, is a diagonal matrix called a scalar matrix; for example, <math>\left[\begin{smallmatrix} 0.5 & 0 \\ 0 & 0.5 \end{smallmatrix}\right]</math>. In geometry, a diagonal matrix may be used as a scaling matrix, since matrix multiplication with it results in changing scale (size) and possibly also shape; only a scalar matrix results in uniform change in scale.
Definition
As stated above, a diagonal matrix is a matrix in which all off-diagonal entries are zero. That is, the matrix <math>\mathbf{D} = (d_{i,j})</math> with <math>n</math> columns and <math>n</math> rows is diagonal if <math display="block">\forall i,j \in \{1, 2, \ldots, n\}, i \ne j \implies d_{i,j} = 0.</math>
However, the main diagonal entries are unrestricted.
The term diagonal matrix may sometimes refer to a rectangular diagonal matrix, which is an <math>m</math>-by-<math>n</math> matrix with all the entries not of the form <math>d_{i,i}</math> being zero. For example: <math display=block>\begin{bmatrix}
1 & 0 & 0\\ 0 & 4 & 0\\ 0 & 0 & -3\\ 0 & 0 & 0\\
\end{bmatrix} \quad \text{or} \quad \begin{bmatrix}
1 & 0 & 0 & 0 & 0\\ 0 & 4 & 0& 0 & 0\\ 0 & 0 & -3& 0 & 0
\end{bmatrix}</math>
More often, however, diagonal matrix refers to square matrices, which can be specified explicitly as a square diagonal matrix. A square diagonal matrix is a symmetric matrix, so this can also be called a symmetric diagonal matrix.
The following matrix is a square diagonal matrix: <math display="block">\begin{bmatrix} 1 & 0 & 0\\ 0 & 4 & 0\\ 0 & 0 & -2 \end{bmatrix}</math>
If the entries are real numbers or complex numbers, then it is a normal matrix as well.
In the remainder of this article we will consider only square diagonal matrices, and refer to them simply as "diagonal matrices".
Vector-to-matrix diag operator
A diagonal matrix <math>\mathbf{D}</math> can be constructed from a vector <math>\mathbf{a} = \begin{bmatrix}a_1 & \dots & a_n\end{bmatrix}^\textsf{T}</math> using the <math>\operatorname{diag}</math> operator: <math display="block">
\mathbf{D} = \operatorname{diag}(a_1, \dots, a_n).
</math>
This may be written more compactly as <math>\mathbf{D} = \operatorname{diag}(\mathbf{a})</math>.
The same operator is also used to represent block diagonal matrices as <math>\mathbf{A} = \operatorname{diag}(\mathbf A_1, \dots, \mathbf A_n)</math>, where each argument <math>\mathbf{A}_i</math> is a matrix.
The <math>\operatorname{diag}</math> operator may be written as <math display="block">
\operatorname{diag}(\mathbf{a}) = \left(\mathbf{a} \mathbf{1}^\textsf{T}\right) \circ \mathbf{I},
</math> where <math>\circ</math> represents the Hadamard product, and <math>\mathbf{1}</math> is a constant vector with elements 1.
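As a concrete illustration, here is a minimal NumPy sketch of the vector-to-matrix <math>\operatorname{diag}</math> operator and the outer-product identity above; the array values are arbitrary examples.

```python
import numpy as np

# Build a diagonal matrix from a vector.
a = np.array([1.0, 4.0, -3.0])
D = np.diag(a)  # 3x3 matrix with a on the main diagonal

# The identity diag(a) = (a 1^T) ∘ I, where ∘ is the Hadamard product:
ones = np.ones_like(a)
D_alt = np.outer(a, ones) * np.eye(len(a))

assert np.allclose(D, D_alt)
```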
Matrix-to-vector diag operator
The inverse matrix-to-vector <math>\operatorname{diag}</math> operator is sometimes denoted by the identically named <math>\operatorname{diag}(\mathbf{D}) = \begin{bmatrix}a_1 & \dots & a_n\end{bmatrix}^\textsf{T},</math> where the argument is now a matrix, and the result is a vector of its diagonal entries.
The following property holds: <math display="block">
\operatorname{diag}(\mathbf{A}\mathbf{B}) = \sum_j \left(\mathbf{A} \circ \mathbf{B}^\textsf{T}\right)_{ij} = \left( \mathbf{A} \circ \mathbf{B}^\textsf{T} \right) \mathbf{1}.
</math>
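The identity follows from <math>(\mathbf{AB})_{ii} = \sum_j a_{ij} b_{ji}</math>, the <math>i</math>-th row sum of <math>\mathbf{A} \circ \mathbf{B}^\textsf{T}</math>. A quick numerical spot-check, as a NumPy sketch with arbitrary random matrices:

```python
import numpy as np

# Verify diag(AB) = (A ∘ B^T) 1 numerically.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

lhs = np.diag(A @ B)          # diagonal of the matrix product
rhs = (A * B.T) @ np.ones(4)  # Hadamard product with B^T, then row sums

assert np.allclose(lhs, rhs)
```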
Scalar matrix
A diagonal matrix with equal diagonal entries is a scalar matrix; that is, a scalar multiple <math>\lambda</math> of the identity matrix <math>\mathbf{I}</math>. Its effect on a vector is scalar multiplication by <math>\lambda</math>. For example, a 3×3 scalar matrix has the form: <math display="block">
\begin{bmatrix} \lambda & 0 & 0 \\ 0 & \lambda & 0 \\ 0 & 0 & \lambda \end{bmatrix} \equiv \lambda \boldsymbol{I}_3
</math>
The scalar matrices are the center of the algebra of matrices: that is, they are precisely the matrices that commute with all other square matrices of the same size. By contrast, over a field (like the real numbers), a diagonal matrix with all diagonal elements distinct only commutes with diagonal matrices (its centralizer is the set of diagonal matrices). That is because if a diagonal matrix <math>\mathbf{D} = \operatorname{diag}(a_1, \dots, a_n)</math> has <math>a_i \neq a_j,</math> then given a matrix <math>\mathbf{M}</math> with <math>m_{ij} \neq 0,</math> the <math>(i, j)</math> terms of the products are <math>(\mathbf{DM})_{ij} = a_i m_{ij}</math> and <math>(\mathbf{MD})_{ij} = m_{ij} a_j,</math> and <math>a_i m_{ij} \neq m_{ij} a_j</math> (since one can divide by <math>m_{ij}</math>), so they do not commute unless the off-diagonal terms are zero. Diagonal matrices where the diagonal entries are not all equal or all distinct have centralizers intermediate between the whole space and only diagonal matrices.
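These commutation facts can be spot-checked numerically; the following NumPy sketch uses arbitrary example values.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3))

# A scalar matrix λI commutes with any square matrix of the same size.
S = 2.0 * np.eye(3)
assert np.allclose(S @ M, M @ S)

# For D with distinct diagonal entries, the commutator entry (i, j) is
# (a_i - a_j) m_ij, so it vanishes off the diagonal only where M does.
D = np.diag([1.0, 2.0, 3.0])
C = D @ M - M @ D
assert np.allclose(np.diag(C), 0.0)  # the diagonal of the commutator is always zero
```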
For an abstract vector space <math>V</math> (rather than the concrete vector space <math>K^n</math>), the analog of scalar matrices are scalar transformations. This is true more generally for a module <math>M</math> over a ring <math>R</math>, with the endomorphism algebra <math>\operatorname{End}(M)</math> (algebra of linear operators on <math>M</math>) replacing the algebra of matrices. Formally, scalar multiplication is a linear map, inducing a map <math>R \to \operatorname{End}(M)</math> (from a scalar <math>\lambda</math> to its corresponding scalar transformation, multiplication by <math>\lambda</math>) exhibiting <math>\operatorname{End}(M)</math> as an <math>R</math>-algebra. For vector spaces, the scalar transforms are exactly the center of the endomorphism algebra, and, similarly, scalar invertible transforms are the center of the general linear group <math>\operatorname{GL}(V)</math>. The former is more generally true for free modules <math>M \cong R^n,</math> for which the endomorphism algebra is isomorphic to a matrix algebra.
Vector operations
Multiplying a vector by a diagonal matrix multiplies each of the terms by the corresponding diagonal entry. Given a diagonal matrix <math>\mathbf{D} = \operatorname{diag}(a_1, \dots, a_n)</math> and a vector <math>\mathbf{v} = \begin{bmatrix} x_1 & \dots & x_n \end{bmatrix}^\textsf{T}</math>, the product is: <math display="block">\mathbf{D}\mathbf{v} = \operatorname{diag}(a_1, \dots, a_n)\begin{bmatrix}x_1 \\ \vdots \\ x_n\end{bmatrix} =
\begin{bmatrix} a_1 \\ & \ddots \\ & & a_n \end{bmatrix} \begin{bmatrix}x_1 \\ \vdots \\ x_n\end{bmatrix} = \begin{bmatrix}a_1 x_1 \\ \vdots \\ a_n x_n\end{bmatrix}.
</math>
This can be expressed more compactly by using a vector instead of a diagonal matrix, <math>\mathbf{d} = \begin{bmatrix} a_1 & \dots & a_n \end{bmatrix}^\textsf{T}</math>, and taking the Hadamard product of the vectors (entrywise product), denoted <math>\mathbf{d} \circ \mathbf{v}</math>:
<math display="block">\mathbf{D}\mathbf{v} = \mathbf{d} \circ \mathbf{v} =
\begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} \circ \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} a_1 x_1 \\ \vdots \\ a_n x_n \end{bmatrix}.
</math>
This is mathematically equivalent, but avoids storing all the zero terms of this sparse matrix. This product is thus used in machine learning, such as computing products of derivatives in backpropagation or multiplying IDF weights in TF-IDF, since some BLAS frameworks, which multiply matrices efficiently, do not include Hadamard product capability directly.
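For instance, a minimal NumPy sketch contrasting the dense product <math>\mathbf{D}\mathbf{v}</math> with the equivalent Hadamard product (the values are illustrative):

```python
import numpy as np

# Scale a vector by a diagonal matrix without materializing the matrix.
d = np.array([0.5, 2.0, -1.0])
v = np.array([10.0, 20.0, 30.0])

dense = np.diag(d) @ v  # stores n*n entries, mostly zeros
hadamard = d * v        # n multiplications, no extra storage

assert np.allclose(dense, hadamard)
```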
Matrix operations
The operations of matrix addition and matrix multiplication are especially simple for diagonal matrices. Write <math>\operatorname{diag}(a_1, \dots, a_n)</math> for a diagonal matrix whose diagonal entries starting in the upper left corner are <math>a_1, \dots, a_n</math>. Then, for addition, we have
<math display=block>
\operatorname{diag}(a_1,\, \ldots,\, a_n) + \operatorname{diag}(b_1,\, \ldots,\, b_n) = \operatorname{diag}(a_1 + b_1,\, \ldots,\, a_n + b_n)</math>
and for matrix multiplication,
<math display=block>\operatorname{diag}(a_1,\, \ldots,\, a_n) \operatorname{diag}(b_1,\, \ldots,\, b_n) = \operatorname{diag}(a_1 b_1,\, \ldots,\, a_n b_n).</math>
The diagonal matrix <math>\operatorname{diag}(a_1, \dots, a_n)</math> is invertible if and only if the entries <math>a_1, \dots, a_n</math> are all nonzero. In this case, we have
<math display=block>\operatorname{diag}(a_1,\, \ldots,\, a_n)^{-1} = \operatorname{diag}(a_1^{-1},\, \ldots,\, a_n^{-1}).</math>
In particular, the diagonal matrices form a subring of the ring of all <math>n</math>-by-<math>n</math> matrices.
Multiplying an <math>m</math>-by-<math>n</math> matrix <math>\mathbf{A}</math> from the left with <math>\operatorname{diag}(a_1, \dots, a_m)</math> amounts to multiplying the <math>i</math>-th row of <math>\mathbf{A}</math> by <math>a_i</math> for all <math>i</math>; multiplying <math>\mathbf{A}</math> from the right with <math>\operatorname{diag}(a_1, \dots, a_n)</math> amounts to multiplying the <math>j</math>-th column of <math>\mathbf{A}</math> by <math>a_j</math> for all <math>j</math>.
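These rules reduce diagonal-matrix arithmetic to entrywise vector operations, as this NumPy sketch illustrates with arbitrary example entries:

```python
import numpy as np

a = np.array([1.0, 2.0, 4.0])
b = np.array([3.0, 5.0, 7.0])

assert np.allclose(np.diag(a) + np.diag(b), np.diag(a + b))    # addition
assert np.allclose(np.diag(a) @ np.diag(b), np.diag(a * b))    # multiplication
assert np.allclose(np.linalg.inv(np.diag(a)), np.diag(1 / a))  # inverse (all a_i nonzero)

# Left multiplication scales rows; right multiplication scales columns.
M = np.arange(9, dtype=float).reshape(3, 3)
assert np.allclose(np.diag(a) @ M, a[:, None] * M)  # row i scaled by a_i
assert np.allclose(M @ np.diag(a), M * a[None, :])  # column j scaled by a_j
```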
Operator matrix in eigenbasis
As explained in determining coefficients of operator matrix, there is a special basis, <math>\mathbf e_1, \dots, \mathbf e_n</math>, for which the matrix <math>\mathbf{A}</math> takes the diagonal form. Hence, in the defining equation <math display="inline">\mathbf{Ae}_j = \sum_i a_{i,j} \mathbf e_i</math>, all coefficients <math>a_{i,j}</math> with <math>i \ne j</math> are zero, leaving only one term per sum. The surviving diagonal elements, <math>a_{i,i}</math>, are known as eigenvalues and designated with <math>\lambda_i</math> in the equation, which reduces to <math>\mathbf{Ae}_i = \lambda_i \mathbf e_i.</math> The resulting equation is known as the eigenvalue equation and is used to derive the characteristic polynomial and, further, eigenvalues and eigenvectors.
In other words, the eigenvalues of <math>\operatorname{diag}(\lambda_1, \dots, \lambda_n)</math> are <math>\lambda_1, \dots, \lambda_n</math> with associated eigenvectors <math>\mathbf e_1, \dots, \mathbf e_n</math>.
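A brief NumPy check of this, with arbitrarily chosen eigenvalues:

```python
import numpy as np

# The eigenvalues of diag(λ_1, ..., λ_n) are the diagonal entries,
# with the standard basis vectors as eigenvectors.
lam = np.array([2.0, 5.0, 9.0])
D = np.diag(lam)

eigvals, eigvecs = np.linalg.eig(D)
assert np.allclose(np.sort(eigvals), np.sort(lam))
assert np.allclose(np.abs(eigvecs), np.eye(3))  # columns are (signed) e_1, e_2, e_3
```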
Properties
- The determinant of <math>\operatorname{diag}(a_1, \dots, a_n)</math> is the product <math>a_1 \cdots a_n</math>.
- The adjugate of a diagonal matrix is again diagonal.
- Where all matrices are square,
  - A matrix is diagonal if and only if it is triangular and normal.
  - A matrix is diagonal if and only if it is both upper- and lower-triangular.
  - A diagonal matrix is symmetric.
- The identity matrix <math>\mathbf{I}_n</math> and zero matrix are diagonal.
- A 1×1 matrix is always diagonal.
- The square of a 2×2 matrix with zero trace is always diagonal.
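The last property follows from the Cayley–Hamilton theorem: a 2×2 matrix <math>\mathbf{A}</math> with zero trace satisfies <math>\mathbf{A}^2 = -\det(\mathbf{A})\,\mathbf{I}</math>, a scalar (hence diagonal) matrix. A minimal NumPy sketch with random entries:

```python
import numpy as np

# A = [[a, b], [c, -a]] has trace zero by construction.
rng = np.random.default_rng(2)
a, b, c = rng.standard_normal(3)
A = np.array([[a, b], [c, -a]])

# Its square is the scalar matrix -det(A) I = (a^2 + bc) I.
assert np.allclose(A @ A, -np.linalg.det(A) * np.eye(2))
```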
Applications
Diagonal matrices occur in many areas of linear algebra. Because of the simple description of matrix operations and eigenvalues/eigenvectors given above, it is typically desirable to represent a given matrix or linear map by a diagonal matrix.
In fact, a given <math>n</math>-by-<math>n</math> matrix <math>\mathbf{A}</math> is similar to a diagonal matrix (meaning that there is a matrix <math>\mathbf{X}</math> such that <math>\mathbf{X}^{-1}\mathbf{A}\mathbf{X}</math> is diagonal) if and only if it has <math>n</math> linearly independent eigenvectors. Such matrices are said to be diagonalizable.
Over the field of real or complex numbers, more is true. The spectral theorem says that every normal matrix is unitarily similar to a diagonal matrix (if <math>\mathbf{A}\mathbf{A}^* = \mathbf{A}^*\mathbf{A}</math> then there exists a unitary matrix <math>\mathbf{U}</math> such that <math>\mathbf{U}\mathbf{A}\mathbf{U}^*</math> is diagonal). Furthermore, the singular value decomposition implies that for any matrix <math>\mathbf{A}</math>, there exist unitary matrices <math>\mathbf{U}</math> and <math>\mathbf{V}</math> such that <math>\mathbf{U}^*\mathbf{A}\mathbf{V}</math> is diagonal with nonnegative entries.
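Both decompositions can be illustrated with a short NumPy sketch; the matrix here is an arbitrary random example, symmetrized for the spectral-theorem part:

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((3, 3))

# Spectral theorem (real symmetric case): Q is orthogonal, Q^T H Q is diagonal.
H = B + B.T
w, Q = np.linalg.eigh(H)
assert np.allclose(Q.T @ H @ Q, np.diag(w))

# SVD: U^T B V is diagonal with nonnegative entries (the singular values).
U, s, Vt = np.linalg.svd(B)
assert np.allclose(U.T @ B @ Vt.T, np.diag(s))
```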
Operator theory
In operator theory, particularly the study of PDEs, operators are particularly easy to understand and PDEs easy to solve if the operator is diagonal with respect to the basis with which one is working; this corresponds to a separable partial differential equation. Therefore, a key technique to understanding operators is a change of coordinates (in the language of operators, an integral transform) which changes the basis to an eigenbasis of eigenfunctions, making the equation separable. An important example of this is the Fourier transform, which diagonalizes constant coefficient differentiation operators (or more generally translation invariant operators), such as the Laplacian operator, say, in the heat equation.
Especially easy are multiplication operators, which are defined as multiplication by (the values of) a fixed function; the values of the function at each point correspond to the diagonal entries of a matrix.
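As an illustration, the following NumPy sketch advances the periodic heat equation <math>u_t = u_{xx}</math> by one time step in the Fourier basis, where the second-derivative operator acts diagonally (as multiplication by <math>-k^2</math>); the grid size, time step, and initial profile are arbitrary choices:

```python
import numpy as np

n, dt = 128, 1e-3
x = np.linspace(0, 2 * np.pi, n, endpoint=False)
u = np.exp(-10 * (x - np.pi) ** 2)  # initial temperature profile

# Integer angular wavenumbers; the Fourier basis diagonalizes d^2/dx^2,
# whose eigenvalue at wavenumber k is -k^2.
k = np.fft.fftfreq(n, d=2 * np.pi / n) * 2 * np.pi
symbol = -k**2

# One exact time step: multiply each Fourier coefficient by exp(-k^2 dt).
u_hat = np.fft.fft(u)
u_next = np.fft.ifft(np.exp(symbol * dt) * u_hat).real
```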
See also
- Anti-diagonal matrix
- Banded matrix
- Bidiagonal matrix
- Diagonally dominant matrix
- Diagonalizable matrix
- Jordan normal form
- Multiplication operator
- Tridiagonal matrix
- Toeplitz matrix
- Toral Lie algebra
- Circulant matrix