
{{Short description|Sum of elements on the main diagonal}}
{{more citations needed|date=November 2023}}
In [[linear algebra]], the '''trace''' of a [[square matrix]] {{math|'''A'''}}, denoted {{math|tr('''A''')}},<ref name=":1">{{Cite web|title=Rank, trace, determinant, transpose, and inverse of matrices|url=http://fourier.eng.hmc.edu/e161/lectures/algebra/node2.html|access-date=2020-09-09|website=fourier.eng.hmc.edu}}</ref> is the sum of the elements on its [[main diagonal]], <math>a_{11} + a_{22} + \dots + a_{nn}</math>. It is only defined for a square matrix ({{math|''n'' × ''n''}}). 

The trace of a matrix is the sum of its [[eigenvalue]]s (counted with multiplicities). Also, {{math|tr('''AB''') {{=}} tr('''BA''')}} for any square matrices {{math|'''A'''}} and {{math|'''B'''}} of the same size. Thus, [[Matrix similarity|similar matrices]] have the same trace. As a consequence, one can define the trace of a [[linear operator]] mapping a finite-dimensional [[vector space]] into itself, since all matrices describing such an operator with respect to a basis are similar.

The trace is related to the derivative of the [[determinant]] (see [[Jacobi's formula]]).

== Definition ==
The '''trace''' of an {{math|''n'' × ''n''}} [[square matrix]] {{math|'''A'''}} is defined as<ref name=":1"/><ref name=":2">{{cite encyclopedia |title=Trace (matrix) |last1=Weisstein |first1=Eric W. |author1-link=Eric W. Weisstein |editor1-first=Eric W. |editor1-last=Weisstein |encyclopedia=[[CRC Concise Encyclopedia of Mathematics]] |edition=2nd |orig-date=1999 |year=2003 |publisher=[[Chapman & Hall]] |location=Boca Raton, FL |isbn=1-58488-347-2|mr=1944431 |url=https://mathworld.wolfram.com/MatrixTrace.html|access-date=2020-09-09|zbl=1079.00009|doi=10.1201/9781420035223|url-access=subscription }}
</ref><ref name=LipschutzLipson>{{cite book |first1=Seymour |last1=Lipschutz |first2=Marc |last2=Lipson |date=September 2005 |title=Theory and Problems of Linear Algebra |series=Schaum's Outline |publisher=McGraw-Hill |isbn=9780070605022 }}</ref>{{rp|34}}
<math display="block">\operatorname{tr}(\mathbf{A}) = \sum_{i=1}^n a_{ii} = a_{11} + a_{22} + \dots + a_{nn}</math>
where {{math|''a<sub>ii</sub>''}} denotes the entry on the {{nobr|{{mvar|i}}&thinsp;th}} row and {{nobr|{{mvar|i}}&thinsp;th}} column of {{math|'''A'''}}. The entries of {{math|'''A'''}} can be [[real number]]s, [[complex numbers]], or more generally elements of a [[field (mathematics)|field]] {{mvar|F}}. The trace is not defined for non-square matrices.

== Example ==
Let {{math|'''A'''}} be a matrix, with
<math display="block">\mathbf{A} =
\begin{pmatrix}
  a_{11} & a_{12} & a_{13} \\
  a_{21} & a_{22} & a_{23} \\
  a_{31} & a_{32} & a_{33}
\end{pmatrix} =
\begin{pmatrix}
   1 &  0 &  3 \\
  11 &  5 &  2 \\
   6 & 12 & -5
\end{pmatrix}
</math>

Then
<math display="block">\operatorname{tr}(\mathbf{A}) = \sum_{i=1}^{3} a_{ii} = a_{11} + a_{22} + a_{33} = 1 + 5 + (-5) = 1</math>
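
The same computation can be sketched in code. The following is a minimal Python illustration using plain nested lists; the helper name <code>trace</code> is chosen here for illustration only (libraries such as NumPy provide an equivalent <code>numpy.trace</code>).

```python
def trace(matrix):
    """Return the sum of the main-diagonal entries of a square matrix."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("the trace is only defined for square matrices")
    return sum(matrix[i][i] for i in range(n))


# The 3 x 3 matrix from the example above.
A = [
    [1, 0, 3],
    [11, 5, 2],
    [6, 12, -5],
]

result = trace(A)
print(result)  # 1 + 5 + (-5) = 1
```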

== Properties ==

=== Basic properties ===
The trace is a [[linear operator|linear mapping]]. That is,<ref name=":1" /><ref name=":2" />
<math display="block">\begin{align}
\operatorname{tr}(\mathbf{A} + \mathbf{B}) &= \operatorname{tr}(\mathbf{A}) + \operatorname{tr}(\mathbf{B}) \\
\operatorname{tr}(c\mathbf{A}) &= c \operatorname{tr}(\mathbf{A})
\end{align}</math>
for all square matrices {{math|'''A'''}} and {{math|'''B'''}}, and all [[scalar (mathematics)|scalar]]s {{mvar|c}}.<ref name="LipschutzLipson"/>{{rp|34}}

A matrix and its [[transpose]] have the same trace:<ref name=":1" /><ref name=":2" /><ref name="LipschutzLipson"/>{{rp|34}}
<math display="block">\operatorname{tr}(\mathbf{A}) = \operatorname{tr}\left(\mathbf{A}^\mathsf{T}\right).</math>

This follows immediately from the fact that transposing a square matrix does not affect elements along the main diagonal.
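
These identities are easy to check numerically. The sketch below is only an illustration on two arbitrarily chosen integer matrices; all helper names are assumptions made for this example.

```python
def trace(m):
    """Sum of the main-diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


def add(a, b):
    """Entry-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def scale(c, a):
    """Multiply every entry of a by the scalar c."""
    return [[c * x for x in row] for row in a]


def transpose(a):
    """Swap rows and columns."""
    return [list(col) for col in zip(*a)]


A = [[1, 2], [3, 4]]
B = [[0, -1], [5, 7]]
c = 3

additive = trace(add(A, B)) == trace(A) + trace(B)
homogeneous = trace(scale(c, A)) == c * trace(A)
transpose_invariant = trace(transpose(A)) == trace(A)
print(additive, homogeneous, transpose_invariant)  # True True True
```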

=== Trace of a product ===
The trace of a square matrix which is the product of two matrices can be rewritten as the sum of entry-wise products of their elements, i.e. as the sum of all elements of their [[Hadamard product (matrices)|Hadamard product]]. Phrased directly, if {{math|'''A'''}} and {{math|'''B'''}} are two {{math|''m'' × ''n''}} matrices, then:
<math display="block">
\operatorname{tr}\left(\mathbf{A}^\mathsf{T}\mathbf{B}\right) =
\operatorname{tr}\left(\mathbf{A}\mathbf{B}^\mathsf{T}\right) =
\operatorname{tr}\left(\mathbf{B}^\mathsf{T}\mathbf{A}\right) =
\operatorname{tr}\left(\mathbf{B}\mathbf{A}^\mathsf{T}\right) =
\sum_{i=1}^m \sum_{j=1}^n a_{ij}b_{ij} \; .
</math>

If one views any real {{math|''m'' × ''n''}} matrix as a vector of length {{mvar|mn}} (an operation called [[Vectorization (mathematics)|vectorization]]) then the above operation on {{math|'''A'''}} and {{math|'''B'''}} coincides with the standard [[dot product]]. According to the above expression, {{math|tr('''A'''<sup>⊤</sup>'''A''')}} is a sum of squares and hence is nonnegative, equal to zero if and only if {{math|'''A'''}} is zero.<ref name="HornJohnson">{{cite book |title=Matrix Analysis |edition=2nd |first1=Roger A. |last1=Horn |first2=Charles R. |last2=Johnson |isbn=9780521839402 |publisher=Cambridge University Press|year=2013}}</ref>{{rp|7}} Furthermore, as noted in the above formula, {{math|tr('''A'''<sup>⊤</sup>'''B''') {{=}} tr('''B'''<sup>⊤</sup>'''A''')}}. These demonstrate the positive-definiteness and symmetry required of an [[inner product]]; it is common to call {{math|tr('''A'''<sup>⊤</sup>'''B''')}} the [[Frobenius inner product]] of {{math|'''A'''}} and {{math|'''B'''}}. This is a natural inner product on the [[vector space]] of all real matrices of fixed dimensions. The [[norm (mathematics)|norm]] derived from this inner product is called the [[Frobenius norm]], and it satisfies a submultiplicative property, as can be proven with the [[Cauchy–Schwarz inequality]]:
<math display="block">0 \leq
\left[\operatorname{tr}(\mathbf{A} \mathbf{B})\right]^2 \leq
\operatorname{tr}\left(\mathbf{A}^\mathsf{T} \mathbf{A}\right) \operatorname{tr}\left(\mathbf{B}^\mathsf{T} \mathbf{B}\right) ,</math>
if {{math|'''A'''}} and {{math|'''B'''}} are real matrices such that {{math|'''A''' '''B'''}} is a square matrix. The Frobenius inner product and norm arise frequently in [[matrix calculus]] and [[statistics]].
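
As a hedged illustration of the Frobenius inner product, the following Python sketch compares {{math|tr('''A'''<sup>⊤</sup>'''B''')}} with the entry-wise sum of products for two arbitrarily chosen 2 × 3 matrices, and checks the Cauchy–Schwarz inequality for that inner product. The helper names are assumptions made for this example.

```python
def matmul(a, b):
    """Product of an m x n and an n x p matrix given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def frobenius_inner(a, b):
    """Sum of entry-wise products of two matrices of the same shape."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


# Two 2 x 3 matrices.
A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 0, -1], [2, 1, 0]]

via_trace = trace(matmul(transpose(A), B))  # tr(A^T B), trace of a 3 x 3 product
via_entries = frobenius_inner(A, B)         # sum of a_ij * b_ij
print(via_trace, via_entries)  # 11 11

# Cauchy-Schwarz: <A, B>^2 <= <A, A> <B, B>
cauchy_schwarz_holds = via_entries ** 2 <= frobenius_inner(A, A) * frobenius_inner(B, B)
```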

The Frobenius inner product may be extended to a [[hermitian inner product]] on the [[complex vector space]] of all complex matrices of a fixed size, by replacing {{math|'''B'''}} by its [[complex conjugate]].

The symmetry of the Frobenius inner product may be phrased more directly as follows: the matrices in the trace of a product can be switched without changing the result. If {{math|'''A'''}} and {{math|'''B'''}} are {{math|''m'' × ''n''}} and {{math|''n'' × ''m''}} real or complex matrices, respectively, then<ref name=":1" /><ref name=":2" /><ref name="LipschutzLipson"/>{{rp|34}}<ref group="note">This is immediate from the definition of the [[matrix product]]:
<math display="block">\operatorname{tr}(\mathbf{A}\mathbf{B}) = \sum_{i=1}^m \left(\mathbf{A}\mathbf{B}\right)_{ii} = \sum_{i=1}^m \sum_{j=1}^n a_{ij} b_{ji} = \sum_{j=1}^n \sum_{i=1}^m b_{ji} a_{ij} = \sum_{j=1}^n \left(\mathbf{B}\mathbf{A}\right)_{jj} = \operatorname{tr}(\mathbf{B}\mathbf{A}).</math>
</ref>

{{Equation box 1
|indent=:
|title=
|equation = <math>\operatorname{tr}(\mathbf{A}\mathbf{B}) = \operatorname{tr}(\mathbf{B}\mathbf{A})</math>
|cellpadding= 6
|border
|border colour = #0073CF
|background colour=#F5FFFA
}}

This is notable both for the fact that {{math|'''AB'''}} does not usually equal {{math|'''BA'''}}, and also since the trace of either does not usually equal {{math|tr('''A''')tr('''B''')}}.<ref group="note">For example, if
<math display="block">
\mathbf{A} = \begin{pmatrix}
  0 & 1 \\
  0 & 0
\end{pmatrix},\quad
\mathbf{B} = \begin{pmatrix}
  0 & 0 \\
  1 & 0
\end{pmatrix},
</math>

then the product is
<math display="block">\mathbf{AB} = \begin{pmatrix}
  1 & 0 \\
  0 & 0
\end{pmatrix},</math>
and the traces are {{math|tr('''AB''') {{=}} 1 ≠ 0 ⋅ 0 {{=}} tr('''A''')tr('''B''')}}.</ref> The [[similarity invariance|similarity-invariance]] of the trace, meaning that {{math|tr('''A''') {{=}} tr('''P'''<sup>−1</sup>'''AP''')}} for any square matrix {{math|'''A'''}} and any invertible matrix {{math|'''P'''}} of the same dimensions, is a fundamental consequence. This is proved by
<math display="block">
\operatorname{tr}\left(\mathbf{P}^{-1}(\mathbf{A}\mathbf{P})\right) =
\operatorname{tr}\left((\mathbf{A} \mathbf{P})\mathbf{P}^{-1}\right) =
\operatorname{tr}(\mathbf{A}).
</math>
Similarity invariance is the crucial property that makes it possible to define the trace of a [[linear transformation]], as discussed below.
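
Both facts can be illustrated on small integer matrices. In the sketch below (helper names and matrices are assumptions chosen for this example), {{math|'''A'''}} is 2 × 3 and {{math|'''B'''}} is 3 × 2, so the two products have different sizes but the same trace; the matrix <code>P</code> has determinant 1, so its inverse has integer entries.

```python
def matmul(a, b):
    """Product of an m x n and an n x p matrix given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


# tr(AB) = tr(BA): AB is 2 x 2 while BA is 3 x 3.
A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 0], [0, 1], [2, -1]]
tr_ab = trace(matmul(A, B))
tr_ba = trace(matmul(B, A))
print(tr_ab, tr_ba)  # 6 6

# Similarity invariance: tr(P^-1 M P) = tr(M).
M = [[1, 2], [3, 4]]
P = [[2, 1], [1, 1]]
P_inv = [[1, -1], [-1, 2]]  # inverse of P, since det(P) = 1
tr_similar = trace(matmul(P_inv, matmul(M, P)))
print(tr_similar, trace(M))  # 5 5
```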

Additionally, for real column vectors <math>\mathbf{a}\in\mathbb{R}^n</math> and <math>\mathbf{b}\in\mathbb{R}^n</math>, the trace of the outer product is equivalent to the inner product:
{{Equation box 1
|indent=:
|title=
|equation = <math>\operatorname{tr}\left(\mathbf{b}\mathbf{a}^\textsf{T}\right) = \mathbf{a}^\textsf{T}\mathbf{b}</math>
|cellpadding= 6
|border
|border colour = #0073CF
|background colour=#F5FFFA
}}
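
As a concrete check, with vectors chosen here only for illustration, take <math>\mathbf{a} = (1, 2, 3)^\textsf{T}</math> and <math>\mathbf{b} = (4, 5, 6)^\textsf{T}</math>. Then
<math display="block">\mathbf{b}\mathbf{a}^\textsf{T} = \begin{pmatrix}
  4 &  8 & 12 \\
  5 & 10 & 15 \\
  6 & 12 & 18
\end{pmatrix}, \qquad
\operatorname{tr}\left(\mathbf{b}\mathbf{a}^\textsf{T}\right) = 4 + 10 + 18 = 32 = 1 \cdot 4 + 2 \cdot 5 + 3 \cdot 6 = \mathbf{a}^\textsf{T}\mathbf{b}.</math>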

=== Cyclic property ===
More generally, the trace is ''invariant under [[circular shift]]s'', that is,

{{Equation box 1
|indent=:
|title=
|equation = <math>\operatorname{tr}(\mathbf{A}\mathbf{B}\mathbf{C}\mathbf{D}) = \operatorname{tr}(\mathbf{B}\mathbf{C}\mathbf{D}\mathbf{A}) = \operatorname{tr}(\mathbf{C}\mathbf{D}\mathbf{A}\mathbf{B}) = \operatorname{tr}(\mathbf{D}\mathbf{A}\mathbf{B}\mathbf{C}).</math>
|cellpadding= 6
|border
|border colour = #0073CF
|background colour=#F5FFFA}}

This is known as the ''cyclic property''.

Arbitrary permutations are not allowed: in general,
<math display="block">\operatorname{tr}(\mathbf{A}\mathbf{B}\mathbf{C}\mathbf{D}) \ne \operatorname{tr}(\mathbf{A}\mathbf{C}\mathbf{B}\mathbf{D}) ~.</math>

However, if products of ''three'' [[symmetric matrix|symmetric]] matrices are considered, any permutation is allowed, since:
<math display="block">\operatorname{tr}(\mathbf{A}\mathbf{B}\mathbf{C}) = \operatorname{tr}\left(\left(\mathbf{A}\mathbf{B}\mathbf{C}\right)^{\mathsf T}\right) = \operatorname{tr}(\mathbf{C}\mathbf{B}\mathbf{A}) = \operatorname{tr}(\mathbf{A}\mathbf{C}\mathbf{B}),</math>
where the first equality is because the traces of a matrix and its transpose are equal. Note that this is not true in general for more than three factors.
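
A small numerical illustration, using three arbitrarily chosen non-symmetric 2 × 2 matrices, shows that cyclic shifts preserve the trace while a non-cyclic reordering can change it. The helper <code>trace_of_product</code> is a name introduced for this sketch.

```python
def matmul(a, b):
    """Product of an m x n and an n x p matrix given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def trace_of_product(*factors):
    """Trace of the product of the given matrices, multiplied left to right."""
    product = factors[0]
    for factor in factors[1:]:
        product = matmul(product, factor)
    return trace(product)


A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]
C = [[1, 0], [0, 0]]

cyclic = [trace_of_product(A, B, C),
          trace_of_product(B, C, A),
          trace_of_product(C, A, B)]
swapped = trace_of_product(A, C, B)  # not a cyclic shift of (A, B, C)
print(cyclic, swapped)  # [1, 1, 1] 0
```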

=== Trace of a Kronecker product ===
The trace of the [[Kronecker product]] of two matrices is the product of their traces:
<math display="block">\operatorname{tr}(\mathbf{A} \otimes \mathbf{B}) = \operatorname{tr}(\mathbf{A})\operatorname{tr}(\mathbf{B}).</math>
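
The identity can be checked on a small example. The sketch below builds the Kronecker product from nested lists; the function name <code>kron</code> and the sample matrices are assumptions made for illustration.

```python
def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def kron(a, b):
    """Kronecker product: block (i, j) of the result is a[i][j] * b."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


A = [[1, 2], [3, 4]]                   # trace 5
B = [[0, 5, 1], [6, 7, 2], [1, 1, 3]]  # trace 10

K = kron(A, B)                         # a 6 x 6 matrix
print(trace(K), trace(A) * trace(B))   # 50 50
```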

===Characterization of the trace===
The following three properties:
<math display="block">\begin{align}
\operatorname{tr}(\mathbf{A} + \mathbf{B}) &= \operatorname{tr}(\mathbf{A}) + \operatorname{tr}(\mathbf{B}), \\
\operatorname{tr}(c\mathbf{A}) &= c \operatorname{tr}(\mathbf{A}), \\
\operatorname{tr}(\mathbf{A}\mathbf{B}) &= \operatorname{tr}(\mathbf{B}\mathbf{A}),
\end{align}</math>
characterize the trace [[up to]] a scalar multiple in the following sense: If <math>f</math> is a [[linear functional]] on the space of square matrices that satisfies <math>f(xy) = f(yx),</math> then <math>f</math> and <math>\operatorname{tr}</math> are proportional.<ref group="note">Proof: Let <math>e_{ij}</math> be the standard basis and note that <math>f\left(e_{ij}\right) = f\left(e_{i} e_{j}^\top\right) = f\left(e_i e_1^\top e_1 e_j^\top\right) = f\left(e_1 e_j^\top e_i e_1^\top\right) = f\left(0\right) = 0</math> if <math>i \neq j</math> and <math>f\left(e_{jj}\right) = f\left(e_{11}\right)</math>, so that
<math display="block">f(\mathbf{A}) = \sum_{i, j} [\mathbf{A}]_{ij} f\left(e_{ij}\right) = \sum_i [\mathbf{A}]_{ii} f\left(e_{11}\right) = f\left(e_{11}\right) \operatorname{tr}(\mathbf{A}).</math>

More abstractly, this corresponds to the decomposition
<math display="block">\mathfrak{gl}_n = \mathfrak{sl}_n \oplus k,</math>
as <math>\operatorname{tr}(AB) = \operatorname{tr}(BA)</math> (equivalently, <math>\operatorname{tr}([A, B]) = 0</math>) defines the trace on <math>\mathfrak{sl}_n,</math> which has the scalar matrices as a complement, and leaves one degree of freedom: any such map is determined by its value on scalars, which is one scalar parameter, and hence all such maps are multiples of the trace, which is itself a nonzero such map.</ref>

For <math>n\times n</math> matrices, imposing the normalization <math>f(\mathbf{I}) = n</math> makes <math>f</math> equal to the trace.

===Trace as the sum of eigenvalues===
For any {{math|''n'' × ''n''}} matrix {{math|'''A'''}}, one has

{{Equation box 1
|indent=:
|title=
|equation = <math>\operatorname{tr}(\mathbf{A}) = \sum_{i=1}^n \lambda_i</math>
|cellpadding= 6
|border
|border colour = #0073CF
|background colour=#F5FFFA}}

where {{math|&lambda;<sub>1</sub>, ..., &lambda;<sub>''n''</sub>}} are the [[eigenvalue]]s of {{math|'''A'''}} counted with multiplicity. This holds true even if {{math|'''A'''}} is a real matrix and some (or all) of the eigenvalues are complex numbers. This may be regarded as a consequence of the existence of the [[Jordan canonical form]], together with the similarity-invariance of the trace discussed above.
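
For instance, taking a rotation by a quarter turn as an illustrative real matrix with non-real eigenvalues,
<math display="block">\mathbf{A} = \begin{pmatrix}
  0 & -1 \\
  1 &  0
\end{pmatrix}, \qquad
\det(\lambda\mathbf{I} - \mathbf{A}) = \lambda^2 + 1, \qquad
\lambda_{1,2} = \pm i, \qquad
\lambda_1 + \lambda_2 = 0 = \operatorname{tr}(\mathbf{A}).</math>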

===Trace of commutator===
When both {{math|'''A'''}} and {{math|'''B'''}} are {{math|''n'' × ''n''}} matrices, the trace of the (ring-theoretic) [[commutator]] of {{math|'''A'''}} and {{math|'''B'''}} vanishes: {{math|1=tr(['''A''', '''B''']) = 0}}, because {{math|1=tr('''AB''') = tr('''BA''')}} and {{math|tr}} is linear. One can state this as "the trace is a map of [[Lie algebras]] {{math|gl<sub>''n''</sub> → ''k''}} from operators to scalars", as the commutator of scalars is trivial (it is an [[Abelian Lie algebra]]). In particular, using similarity invariance, it follows that the identity matrix is never similar to the commutator of any pair of matrices.

Conversely, any square matrix with zero trace is a linear combination of the commutators of pairs of matrices.<ref group="note">Proof: <math>\mathfrak{sl}_n</math> is a [[semisimple Lie algebra]] and thus every element in it is a linear combination of commutators of some pairs of elements, otherwise the [[derived algebra]] would be a proper ideal.</ref> Moreover, any square matrix with zero trace is [[Unitary representation|unitarily equivalent]] to a square matrix with diagonal consisting of all zeros.

===Traces of special kinds of matrices===
{{bulleted list
| The trace of the {{math|''n'' × ''n''}} [[identity matrix]] is the dimension of the space, namely {{mvar|n}}.
<math display="block">\operatorname{tr}\left(\mathbf{I}_n\right) = n</math>
This leads to [[Dimension (vector space)#Trace|generalizations of dimension using trace]].
| The trace of a [[Hermitian matrix]] is real, because the elements on the diagonal are real.
| The trace of a [[permutation matrix]] is the number of [[Fixed point (mathematics)|fixed points]] of the corresponding permutation, because the diagonal term {{math|''a''<sub>''ii''</sub>}} is 1 if the {{math|''i''}}th point is fixed and 0 otherwise.
| The trace of a [[Projection_(linear_algebra)|projection matrix]] is the dimension of the target space.
<math display="block">\begin{align}
                                  \mathbf{P}_\mathbf{X} &= \mathbf{X}\left(\mathbf{X}^\mathsf{T} \mathbf{X}\right)^{-1} \mathbf{X}^\mathsf{T} \\[3pt]
\Longrightarrow
    \operatorname{tr}\left(\mathbf{P}_\mathbf{X}\right) &= \operatorname{rank}(\mathbf{X}).
\end{align}</math>
The matrix {{math|'''P<sub>X</sub>'''}} is idempotent.
| More generally, the trace of any [[idempotent matrix]], i.e. one with {{math|1='''A'''<sup>2</sup> = '''A'''}}, equals its own [[rank (linear algebra)|rank]].
| The trace of a [[nilpotent matrix]] is zero.
{{pb}}
When the characteristic of the base field is zero, the converse also holds: if {{math|1=tr('''A'''<sup>''k''</sup>) = 0}} for all positive integers {{mvar|k}}, then {{math|'''A'''}} is nilpotent.
{{pb}}
When the characteristic {{math|''n'' > 0}} is positive, the identity in {{mvar|n}} dimensions is a counterexample, as <math>\operatorname{tr}\left(\mathbf{I}_n^k\right) = \operatorname{tr}\left(\mathbf{I}_n\right) = n \equiv 0</math>, but the identity is not nilpotent.
}}
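
Two of the items above can be illustrated with a short Python sketch. The permutation, the projection, and all helper names are assumptions chosen for this example; exact rational arithmetic is used so that the idempotency check is not affected by rounding.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of an m x n and an n x p matrix given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def permutation_matrix(perm):
    """Matrix with a 1 in row i, column perm[i], and 0 elsewhere."""
    n = len(perm)
    return [[1 if perm[i] == j else 0 for j in range(n)] for i in range(n)]


# A permutation of {0, 1, 2, 3} that fixes 0 and 3 and swaps 1 and 2.
perm = [0, 2, 1, 3]
fixed_points = sum(1 for i, image in enumerate(perm) if i == image)
perm_trace = trace(permutation_matrix(perm))
print(perm_trace, fixed_points)  # 2 2

# Orthogonal projection onto the line spanned by x = (1, 1): P = x x^T / (x^T x).
half = Fraction(1, 2)
P = [[half, half], [half, half]]
is_idempotent = matmul(P, P) == P
print(is_idempotent, trace(P))  # True 1  (the rank of P is 1)
```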

=== Relationship to the characteristic polynomial ===

The trace of an <math>n \times n</math> matrix <math>A</math> is the coefficient of <math>t^{n-1}</math> in the [[characteristic polynomial]], up to a sign that depends on the convention used to define the characteristic polynomial.
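
As a worked case, assume the convention <math>\det(t\mathbf{I} - \mathbf{A})</math> and write a general 2 × 2 matrix with entries <math>a, b, c, d</math> (symbols introduced here for illustration). Then
<math display="block">\det(t\mathbf{I} - \mathbf{A}) = \det\begin{pmatrix}
  t - a & -b \\
  -c & t - d
\end{pmatrix} = t^2 - (a + d)\,t + (ad - bc) = t^2 - \operatorname{tr}(\mathbf{A})\,t + \det(\mathbf{A}),</math>
so under this convention the coefficient of <math>t^{n-1} = t</math> is <math>-\operatorname{tr}(\mathbf{A})</math>.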

== Relationship to eigenvalues ==
If {{math|'''A'''}} is a linear operator represented by a square matrix with [[real number|real]] or [[complex number|complex]] entries and if {{math|''λ''<sub>1</sub>, ..., ''λ<sub>n</sub>''}} are the [[eigenvalue]]s of {{math|'''A'''}} (listed according to their [[algebraic multiplicity|algebraic multiplicities]]), then

{{Equation box 1
|indent=:
|title=
|equation = <math>\operatorname{tr}(\mathbf{A}) = \sum_i \lambda_i</math>
|cellpadding= 6
|border
|border colour = #0073CF
|background colour=#F5FFFA}}

This follows from the fact that {{math|'''A'''}} is always [[similar matrix|similar]] to its [[Jordan form]], an upper [[triangular matrix]] having {{math|''λ''<sub>1</sub>, ..., ''λ<sub>n</sub>''}} on the main diagonal. In contrast, the [[determinant]] of {{math|'''A'''}} is the ''product'' of its eigenvalues; that is,
<math display="block">\det(\mathbf{A}) = \prod_i \lambda_i.</math>

Everything in the present section applies as well to any square matrix with coefficients in an [[algebraically closed field]].

=== Derivative relationships ===
If {{math|'''ΔA'''}} is a square matrix with small entries and {{math|'''I'''}} denotes the [[identity matrix]], then we have approximately

<math display="block">\det(\mathbf{I}+\mathbf{\Delta A})\approx 1 + \operatorname{tr}(\mathbf{\Delta A}).</math>

Precisely, this means that the trace is the [[derivative]] of the [[determinant]] function at the identity matrix. [[Jacobi's formula]]

<math display="block">d\det(\mathbf{A}) = \operatorname{tr} \big(\operatorname{adj}(\mathbf{A})\cdot d\mathbf{A}\big)</math>

is more general and describes the [[Differential (infinitesimal)|differential]] of the determinant at an arbitrary square matrix, in terms of the trace and the [[Adjugate matrix|adjugate]] of the matrix.

From this (or from the connection between the trace and the eigenvalues), one can derive a relation between the trace function, the [[matrix exponential]] function, and the determinant:<math display="block">\det(\exp(\mathbf{A})) = \exp(\operatorname{tr}(\mathbf{A})).</math>
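
Both the first-order approximation and the exponential identity can be checked numerically on a 2 × 2 example. The sketch below is an illustration only: the matrix exponential is approximated by a truncated power series in a helper named <code>expm_taylor</code> (a name introduced here), which is adequate for this small matrix; in practice a library routine such as <code>scipy.linalg.expm</code> would normally be used.

```python
import math


def matmul(a, b):
    """Product of an m x n and an n x p matrix given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def expm_taylor(a, terms=30):
    """Approximate exp(a) by the partial sum of a^k / k! for k < terms."""
    n = len(a)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result = [row[:] for row in identity]
    term = [row[:] for row in identity]  # holds a^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[r + t for r, t in zip(res_row, term_row)]
                  for res_row, term_row in zip(result, term)]
    return result


A = [[1.0, 2.0], [0.5, -0.5]]

# det(I + eps*A) differs from 1 + eps*tr(A) only at second order in eps.
eps = 1e-4
perturbed = [[1 + eps * A[0][0], eps * A[0][1]],
             [eps * A[1][0], 1 + eps * A[1][1]]]
first_order_error = abs(det2(perturbed) - (1 + eps * trace(A)))

# det(exp(A)) compared with exp(tr(A)).
lhs = det2(expm_taylor(A))
rhs = math.exp(trace(A))
```

For a 2 × 2 matrix the first-order error is exactly {{math|''ε''<sup>2</sup> det('''A''')}} in absolute value, up to rounding, which is why it shrinks quadratically as {{math|''ε''}} decreases.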

A related characterization of the trace applies to linear [[vector field]]s. Given a matrix {{math|'''A'''}}, define a vector field {{math|'''F'''}} on {{math|'''R'''<sup>''n''</sup>}} by {{math|1='''F'''('''x''') = '''Ax'''}}. The components of this vector field are linear functions (given by the rows of {{math|'''A'''}}). Its [[divergence]] {{math|div '''F'''}} is a constant function, whose value is equal to {{math|tr('''A''')}}.

By the [[divergence theorem]], one can interpret this in terms of flows: if {{math|'''F'''('''x''')}} represents the velocity of a fluid at location {{math|'''x'''}} and {{mvar|U}} is a region in {{math|'''R'''<sup>''n''</sup>}}, the [[flow network|net flow]] of the fluid out of {{mvar|U}} is given by {{math|tr('''A''') · vol(''U'')}}, where {{math|vol(''U'')}} is the [[volume]] of {{mvar|U}}.
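
The divergence claim can be verified coordinate-wise. Writing <math>F_i(\mathbf{x}) = \sum_j a_{ij} x_j</math> for the components of {{math|'''F'''}},
<math display="block">\operatorname{div}\mathbf{F} = \sum_{i=1}^n \frac{\partial F_i}{\partial x_i} = \sum_{i=1}^n \frac{\partial}{\partial x_i} \sum_{j=1}^n a_{ij} x_j = \sum_{i=1}^n a_{ii} = \operatorname{tr}(\mathbf{A}),</math>
and hence, for a region {{mvar|U}} to which the divergence theorem applies,
<math display="block">\oint_{\partial U} \mathbf{F}\cdot\mathbf{n}\, dS = \int_U \operatorname{div}\mathbf{F}\, dV = \operatorname{tr}(\mathbf{A})\operatorname{vol}(U).</math>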

The trace is a linear operator, hence it commutes with the derivative:
<math display="block">d \operatorname{tr} (\mathbf{X}) = \operatorname{tr}(d\mathbf{X}) .</math>

== Trace of a linear operator ==
In general, given some linear map {{math|''f'' : ''V'' → ''V''}} (where {{mvar|V}} is a finite-[[dimension (linear algebra)|dimensional]] [[vector space]]), we can define the trace of this map by considering the trace of a [[Representation theory|matrix representation]] of {{mvar|f}}, that is, choosing a [[basis (linear algebra)|basis]] for {{mvar|V}} and describing {{mvar|f}} as a matrix relative to this basis, and taking the trace of this square matrix. The result will not depend on the basis chosen, since different bases will give rise to [[matrix similarity|similar matrices]], allowing for the possibility of a basis-independent definition for the trace of a linear map.

Such a definition can be given using the [[natural isomorphism|canonical isomorphism]] between the space {{math|End(''V'')}} of linear maps on {{mvar|V}} and {{math|''V'' ⊗ ''V''*}}, where {{math|''V''*}} is the [[dual space]] of {{mvar|V}}. Let {{mvar|v}} be in {{mvar|V}} and let {{mvar|g}} be in {{mvar|''V''*}}. Then the trace of the indecomposable element {{math|''v'' ⊗ ''g''}} is defined to be {{math|''g''(''v'')}}; the trace of a general element is defined by linearity. The trace of  a linear map {{math|''f'' : ''V'' → ''V''}} can then be defined as the trace, in the above sense, of the element of {{math|''V'' ⊗ ''V''*}} corresponding to ''f'' under the above mentioned canonical isomorphism. Using an explicit basis for {{mvar|V}} and the corresponding dual basis for {{math|''V''*}}, one can show that this gives the same definition of the trace as given above.

== Numerical algorithms ==

=== Stochastic estimator ===
The trace can be estimated unbiasedly by "Hutchinson's trick":<ref>{{Cite journal |last=Hutchinson |first=M.F. |date=January 1989 |title=A Stochastic Estimator of the Trace of the Influence Matrix for Laplacian Smoothing Splines |url=http://www.tandfonline.com/doi/abs/10.1080/03610918908812806 |journal=Communications in Statistics - Simulation and Computation |language=en |volume=18 |issue=3 |pages=1059–1076 |doi=10.1080/03610918908812806 |issn=0361-0918|url-access=subscription }}</ref><blockquote>Given any matrix <math>\boldsymbol W\in \R^{n\times n}</math>, and any random <math>\boldsymbol u\in \R^n</math> with <math>\mathbb E[\boldsymbol u\boldsymbol u^\intercal] = \mathbf I</math>, we have <math>\mathbb E[\boldsymbol u^\intercal\boldsymbol W\boldsymbol u ] = \operatorname{tr}\boldsymbol W</math>. </blockquote>

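For a proof, expand the expectation directly, using the cyclic property and the linearity of the trace: <math>\mathbb E[\boldsymbol u^\intercal\boldsymbol W\boldsymbol u] = \mathbb E[\operatorname{tr}(\boldsymbol W\boldsymbol u\boldsymbol u^\intercal)] = \operatorname{tr}(\boldsymbol W\,\mathbb E[\boldsymbol u\boldsymbol u^\intercal]) = \operatorname{tr}\boldsymbol W</math>.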

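Usually, the random vector is sampled from <math>\operatorname N(\mathbf 0,\mathbf I)</math> (normal distribution) or uniformly from <math>\{\pm 1\}^n</math> ([[Rademacher distribution]]); both choices satisfy <math>\mathbb E[\boldsymbol u\boldsymbol u^\intercal] = \mathbf I</math>.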

More sophisticated stochastic estimators of trace have been developed.<ref>{{Cite journal |last1=Avron |first1=Haim |last2=Toledo |first2=Sivan |date=2011-04-11 |title=Randomized algorithms for estimating the trace of an implicit symmetric positive semi-definite matrix |url=https://doi.org/10.1145/1944345.1944349 |journal=Journal of the ACM |volume=58 |issue=2 |pages=8:1–8:34 |doi=10.1145/1944345.1944349 |s2cid=5827717 |issn=0004-5411}}</ref>
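
A minimal sketch of the basic estimator, assuming random sign vectors with independent entries ±1 (so that <math>\mathbb E[\boldsymbol u\boldsymbol u^\intercal] = \mathbf I</math>), is given below. The function name <code>hutchinson_trace</code>, the sample matrix, and the sample count are assumptions made for illustration and are not taken from the cited papers.

```python
import random


def hutchinson_trace(W, num_samples, rng):
    """Average u^T W u over random vectors u with independent +1/-1 entries."""
    n = len(W)
    total = 0.0
    for _ in range(num_samples):
        u = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        # Only the matrix-vector product W u is needed, never W's diagonal.
        Wu = [sum(W[i][j] * u[j] for j in range(n)) for i in range(n)]
        total += sum(u[i] * Wu[i] for i in range(n))
    return total / num_samples


W = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]

exact = sum(W[i][i] for i in range(len(W)))  # 9.0
estimate = hutchinson_trace(W, 20000, random.Random(0))
```

The estimate fluctuates around the exact trace, with an error that shrinks roughly like the inverse square root of the number of samples. The approach is mainly of interest when {{math|'''W'''}} is available only through matrix–vector products.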

== Applications ==
If a 2&nbsp;×&nbsp;2 real matrix has zero trace, its square is a [[diagonal matrix]].
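
This can be seen by direct computation. Writing a general traceless 2 × 2 matrix with entries <math>a, b, c</math> (symbols introduced here for illustration),
<math display="block">\mathbf{A} = \begin{pmatrix}
  a & b \\
  c & -a
\end{pmatrix}
\quad\Longrightarrow\quad
\mathbf{A}^2 = \begin{pmatrix}
  a^2 + bc & ab - ba \\
  ca - ac & cb + a^2
\end{pmatrix} = \left(a^2 + bc\right)\mathbf{I} = -\det(\mathbf{A})\,\mathbf{I},</math>
so the square is in fact a scalar multiple of the identity.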

The trace of a 2&nbsp;×&nbsp;2 [[complex matrix]] is used to classify [[Möbius transformation]]s. First, the matrix is normalized to make its [[determinant]] equal to one. Then, if the square of the trace is 4 (and the transformation is not the identity), the corresponding transformation is ''parabolic''. If the square is in the interval {{nowrap|[0,4)}}, it is ''elliptic''. Finally, if the square is not in the interval {{nowrap|[0,4]}} (for instance, if it is real and greater than 4), the transformation is ''loxodromic''. See [[Möbius transformation#Classification|classification of Möbius transformations]].

The trace is used to define [[character (mathematics)|characters]] of [[group representation]]s. Two finite-dimensional complex representations {{math|'''A''', '''B''' : ''G'' → ''GL''(''V'')}} of a finite group {{mvar|G}} are equivalent (up to change of basis on {{mvar|V}}) if and only if {{math|1=tr('''A'''(''g'')) = tr('''B'''(''g''))}} for all {{math|''g'' ∈ ''G''}}.

The trace also plays a central role in the distribution of [[Quadratic form (statistics)|quadratic forms]].

== Lie algebra ==
The trace is a map of Lie algebras <math>\operatorname{tr}:\mathfrak{gl}_n\to K</math> from the Lie algebra <math>\mathfrak{gl}_n</math> of linear operators on an {{mvar|n}}-dimensional space ({{math|''n'' × ''n''}} matrices with entries in <math>K</math>) to the Lie algebra {{mvar|K}} of scalars; as {{mvar|K}} is Abelian (the Lie bracket vanishes), the fact that this is a map of Lie algebras is exactly the statement that the trace of a bracket vanishes:
<math display="block">\operatorname{tr}([\mathbf{A}, \mathbf{B}]) = 0 \text{ for each }\mathbf A,\mathbf B\in\mathfrak{gl}_n.</math>

A matrix in the kernel of this map, that is, a matrix whose trace is [[0 (number)|zero]], is often said to be '''{{visible anchor|traceless}}''' or '''{{visible anchor|trace free}}''', and these matrices form the [[simple Lie algebra]] <math>\mathfrak{sl}_n</math>, which is the [[Lie algebra]] of the [[special linear group]] of matrices with determinant 1. The special linear group consists of the matrices which do not change volume, while the [[special linear Lie algebra]] consists of the matrices which do not alter the volume of ''infinitesimal'' sets.

In fact, there is an internal [[direct sum of Lie algebras|direct sum]] decomposition <math>\mathfrak{gl}_n = \mathfrak{sl}_n \oplus K</math> of operators/matrices into traceless operators/matrices and scalar operators/matrices. The projection map onto scalar operators can be expressed in terms of the trace, concretely as:
<math display="block">\mathbf{A} \mapsto \frac{1}{n}\operatorname{tr}(\mathbf{A})\mathbf{I}.</math>

Formally, one can compose the trace (the [[counit]] map) with the unit map <math>K\to\mathfrak{gl}_n</math> of "inclusion of [[scalar transformation|scalars]]" to obtain a map <math>\mathfrak{gl}_n\to\mathfrak{gl}_n</math> whose image is the scalars and which acts on scalars as multiplication by {{mvar|n}}. Dividing by {{mvar|n}} makes this a projection, yielding the formula above.
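
The decomposition into a scalar part and a traceless part can be sketched as follows, working over the rational numbers so that division by {{mvar|n}} is exact. The function name <code>split_scalar_and_traceless</code> and the sample matrix are assumptions made for this illustration.

```python
from fractions import Fraction


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def split_scalar_and_traceless(a):
    """Return (scalar_part, traceless_part) with a = scalar_part + traceless_part."""
    n = len(a)
    mean_diagonal = Fraction(trace(a), n)  # tr(a) / n
    scalar_part = [[mean_diagonal if i == j else Fraction(0) for j in range(n)]
                   for i in range(n)]
    traceless_part = [[a[i][j] - scalar_part[i][j] for j in range(n)]
                      for i in range(n)]
    return scalar_part, traceless_part


A = [[1, 2], [3, 4]]
scalar_part, traceless_part = split_scalar_and_traceless(A)

print(trace(traceless_part))  # 0
# Applying the projection to a scalar matrix returns it unchanged.
is_projection = split_scalar_and_traceless(scalar_part)[0] == scalar_part
```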

In terms of [[short exact sequence]]s, one has
<math display="block">0 \to \mathfrak{sl}_n \to \mathfrak{gl}_n \overset{\operatorname{tr}}{\to} K \to 0</math>
which is analogous to
<math display="block">1 \to \operatorname{SL}_n \to \operatorname{GL}_n \overset{\det}{\to} K^* \to 1</math>
(where <math>K^*=K\setminus\{0\}</math>) for [[Lie group]]s. However, the trace splits naturally (via <math>1/n</math> times scalars) so <math>\mathfrak{gl}_n=\mathfrak{sl}_n\oplus K</math>, but the splitting of the determinant would be as the {{mvar|n}}th root times scalars, and this does not in general define a function, so the determinant does not split and the [[general linear group]] does not decompose:
<math display="block">\operatorname{GL}_n \neq \operatorname{SL}_n \times K^*.</math>

=== Bilinear forms ===
The [[bilinear form]] (where {{math|'''X'''}}, {{math|'''Y'''}} are square matrices)
<math display="block">B(\mathbf{X}, \mathbf{Y}) = \operatorname{tr}(\operatorname{ad}(\mathbf{X})\operatorname{ad}(\mathbf{Y}))</math>
: where <math>\operatorname{ad}(\mathbf{X})\mathbf{Y} = [\mathbf{X}, \mathbf{Y}] = \mathbf{X}\mathbf{Y} - \mathbf{Y}\mathbf{X}</math>
: and for orientation, if <math>\operatorname{det} \mathbf{Y} \ne 0 </math>
:: then <math>\operatorname{ad}(\mathbf{X})\mathbf{Y} = \left(\mathbf{X} - \mathbf{Y}\mathbf{X}\mathbf{Y}^{-1}\right)\mathbf{Y} ~.</math>

<math> B(\mathbf{X}, \mathbf{Y})</math> is called the [[Killing form]]; it is used to classify [[Lie algebra]]s.

The trace defines a bilinear form:
<math display="block">(\mathbf{X}, \mathbf{Y}) \mapsto \operatorname{tr}(\mathbf{X}\mathbf{Y}) ~.</math>

The form is symmetric, non-degenerate<ref group=note>This follows from the fact that {{math|1=tr('''A'''*'''A''') = 0}} [[if and only if]] {{math|1='''A''' = '''0'''}}.</ref> and associative in the sense that:
<math display="block">\operatorname{tr}(\mathbf{X}[\mathbf{Y}, \mathbf{Z}]) = \operatorname{tr}([\mathbf{X}, \mathbf{Y}]\mathbf{Z}).</math>
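
The associativity identity follows from expanding the commutators and applying the cyclic property once:
<math display="block">\operatorname{tr}(\mathbf{X}[\mathbf{Y}, \mathbf{Z}]) = \operatorname{tr}(\mathbf{X}\mathbf{Y}\mathbf{Z}) - \operatorname{tr}(\mathbf{X}\mathbf{Z}\mathbf{Y}) = \operatorname{tr}(\mathbf{X}\mathbf{Y}\mathbf{Z}) - \operatorname{tr}(\mathbf{Y}\mathbf{X}\mathbf{Z}) = \operatorname{tr}([\mathbf{X}, \mathbf{Y}]\mathbf{Z}).</math>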

For a complex simple Lie algebra (such as {{math|<math>\mathfrak{sl}</math><sub>''n''</sub>}}), any two such bilinear forms are proportional to each other; in particular, each is proportional to the Killing form{{Citation needed|reason=Either a source or proof is needed|date=June 2022}}.

Two matrices {{math|'''X'''}} and {{math|'''Y'''}} are said to be ''trace orthogonal'' if
<math display="block">\operatorname{tr}(\mathbf{X}\mathbf{Y}) = 0.</math>

There is a generalization to a general representation <math>(\rho,\mathfrak{g},V)</math> of a Lie algebra <math>\mathfrak{g}</math>, such that <math>\rho</math> is a homomorphism of Lie algebras <math>\rho: \mathfrak{g} \rightarrow \text{End}(V).</math> The trace form <math>\text{tr}_V</math> on <math>\text{End}(V)</math> is defined as above. The bilinear form
<math display="block">\phi(\mathbf{X},\mathbf{Y}) = \text{tr}_V(\rho(\mathbf{X})\rho(\mathbf{Y}))</math>
is symmetric and invariant due to cyclicity.

== Generalizations ==
The concept of trace of a matrix is generalized to the [[trace class]] of [[compact operator]]s on [[Hilbert space]]s, and the analog of the [[Frobenius norm]] is called the [[Hilbert–Schmidt operator|Hilbert–Schmidt]] norm.

If <math>K</math> is a trace-class operator, then for any [[orthonormal basis]] <math>\{e_n\}_{n=1}</math>, the trace is given by
<math display="block">\operatorname{tr}(K) = \sum_n \left\langle e_n, Ke_n \right\rangle,</math>
and is finite and independent of the orthonormal basis.<ref>{{cite book | first=G. | last=Teschl | title=Mathematical Methods in Quantum Mechanics | series=Graduate Studies in Mathematics | volume=157 | date=30 October 2014 | publisher=American Mathematical Society | isbn=978-1470417048 | edition=2nd}}</ref>

The [[partial trace]] is another generalization of the trace that is operator-valued. The trace of a linear operator <math>Z</math> acting on a product space <math>A\otimes B</math> can be obtained by taking the partial traces over <math>A</math> and <math>B</math> in succession, in either order:
<math display="block">\operatorname{tr}(Z) = \operatorname{tr}_A \left(\operatorname{tr}_B(Z)\right) = \operatorname{tr}_B \left(\operatorname{tr}_A(Z)\right).</math>
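
In finite dimensions this can be illustrated with a short sketch. The code below assumes the usual Kronecker-product ordering of basis vectors, in which the pair of indices {{math|(''a'', ''b'')}} is stored at position <code>a * dim_b + b</code>; the function names and the sample matrix are assumptions made for this example.

```python
def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def partial_trace_B(Z, dim_a, dim_b):
    """Trace out the second factor, leaving a dim_a x dim_a matrix."""
    return [[sum(Z[a * dim_b + b][a2 * dim_b + b] for b in range(dim_b))
             for a2 in range(dim_a)]
            for a in range(dim_a)]


def partial_trace_A(Z, dim_a, dim_b):
    """Trace out the first factor, leaving a dim_b x dim_b matrix."""
    return [[sum(Z[a * dim_b + b][a * dim_b + b2] for a in range(dim_a))
             for b2 in range(dim_b)]
            for b in range(dim_b)]


# An operator on a 2 (x) 2 product space, written as a 4 x 4 matrix.
Z = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]

over_b_then_a = trace(partial_trace_B(Z, 2, 2))
over_a_then_b = trace(partial_trace_A(Z, 2, 2))
print(trace(Z), over_b_then_a, over_a_then_b)  # 34 34 34
```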

For more properties and a generalization of the partial trace, see [[Traced monoidal category|traced monoidal categories]].

If <math>A</math> is a general [[associative algebra]] over a field <math>k</math>, then a trace on <math>A</math> is often defined to be any [[linear functional|functional]] <math>\operatorname{tr}:A\to k</math> which vanishes on commutators; <math>\operatorname{tr}([a,b])=0</math> for all <math>a,b\in A</math>. Such a trace is not uniquely defined; it can always at least be modified by multiplication by a nonzero scalar.

A [[supertrace]] is the generalization of a trace to the setting of [[superalgebra]]s.

The operation of [[tensor contraction]] generalizes the trace to arbitrary tensors.

Gomme and Klein (2011) define a matrix trace operator <math>\operatorname{trm}</math> that operates on [[block matrix|block matrices]] and use it to compute second-order perturbation solutions to dynamic economic models without the need for [[tensor notation]].<ref>{{cite journal |author=P. Gomme, P. Klein |title=Second-order approximation of dynamic models without the use of tensors |journal=Journal of Economic Dynamics & Control |volume=35 |year=2011 |issue=4 |pages=604–615 |doi=10.1016/j.jedc.2010.10.006 }}</ref>

==Traces in the language of tensor products==
Given a vector space {{mvar|V}}, there is a natural bilinear map {{math|''V'' × ''V''<sup>∗</sup> → ''F''}} given by sending {{math|(''v'', &phi;)}} to the scalar {{math|&phi;(''v'')}}. The [[tensor product#Universal property|universal property]] of the [[tensor product]] {{math|''V'' ⊗ ''V''<sup>∗</sup>}} automatically implies that this bilinear map is induced by a linear functional on {{math|''V'' ⊗ ''V''<sup>∗</sup>}}.<ref name="kassel">{{cite book|last1=Kassel|first1=Christian|title=Quantum groups|series=[[Graduate Texts in Mathematics]]|volume=155|publisher=[[Springer-Verlag]]|location=New York|year=1995|isbn=0-387-94370-6|mr=1321145|doi=10.1007/978-1-4612-0783-2|zbl=0808.17003}}</ref>

Similarly, there is a natural bilinear map {{math|''V'' × ''V''<sup>∗</sup> → Hom(''V'', ''V'')}} given by sending {{math|(''v'', &phi;)}} to the linear map {{math|''w'' ↦ &phi;(''w'')''v''}}. The universal property of the tensor product, just as used previously, says that this bilinear map is induced by a linear map {{math|''V'' ⊗ ''V''<sup>∗</sup> → Hom(''V'', ''V'')}}. If {{mvar|V}} is finite-dimensional, then this linear map is a [[linear isomorphism]].<ref name="kassel" /> This fundamental fact is a straightforward consequence of the existence of a (finite) basis of {{mvar|V}}, and can also be phrased as saying that any linear map {{math|''V'' → ''V''}} can be written as the sum of (finitely many) rank-one linear maps. Composing the inverse of the isomorphism with the linear functional obtained above results in a linear functional on {{math|Hom(''V'', ''V'')}}. This linear functional is exactly the same as the trace.

Using the definition of trace as the sum of diagonal elements, the matrix formula {{math|tr('''AB''') {{=}} tr('''BA''')}} is straightforward to prove, and was given above. In the present perspective, one is considering linear maps {{mvar|S}} and {{mvar|T}}, and viewing them as sums of rank-one maps, so that there are linear functionals {{math|''&phi;''<sub>''i''</sub>}} and {{math|''&psi;''<sub>''j''</sub>}} and nonzero vectors {{math|''v''<sub>''i''</sub>}} and {{math|''w''<sub>''j''</sub>}} such that {{math|''S''({{mvar|u}}) {{=}} Σ''&phi;''<sub>''i''</sub>(''u'')''v''<sub>''i''</sub>}} and {{math|''T''({{mvar|u}}) {{=}} Σ''&psi;''<sub>''j''</sub>(''u'')''w''<sub>''j''</sub>}} for any {{mvar|u}} in {{mvar|V}}. Then
:<math>(S\circ T)(u)=\sum_i\varphi_i\left(\sum_j\psi_j(u)w_j\right)v_i=\sum_i\sum_j\psi_j(u)\varphi_i(w_j)v_i </math>
for any {{mvar|u}} in {{mvar|V}}. Each summand {{math|''u'' ↦ ''&psi;''<sub>''j''</sub>(''u'')''&phi;''<sub>''i''</sub>(''w''<sub>''j''</sub>)''v''<sub>''i''</sub>}} is a linear map of rank at most one with trace {{math|''&psi;''<sub>''j''</sub>(''v''<sub>''i''</sub>)''&phi;''<sub>''i''</sub>(''w''<sub>''j''</sub>)}}, and so, by linearity of the trace,
:<math>\operatorname{tr}(S\circ T)=\sum_i\sum_j\psi_j(v_i)\varphi_i(w_j)=\sum_j\sum_i\varphi_i(w_j)\psi_j(v_i).</math>
Following the same procedure with {{mvar|S}} and {{mvar|T}} reversed, one finds exactly the same formula, proving that {{math|tr(''S'' ∘ ''T'')}} equals {{math|tr(''T'' ∘ ''S'')}}.
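
The double-sum formula can be checked numerically. The following is a minimal sketch in plain Python, not a definitive implementation: the helper names (<code>apply</code>, <code>matrix_of</code>, <code>trace_of_composition</code>) and the sample data are illustrative assumptions. Each map is stored as a list of pairs (functional, vector), with functionals and vectors both written as tuples of coordinates in a fixed basis.

```python
def apply(functional, vector):
    """Evaluate a linear functional (tuple of coefficients) on a vector."""
    return sum(f * x for f, x in zip(functional, vector))

def matrix_of(terms, n):
    """Matrix of u -> sum_i phi_i(u) v_i, i.e. the sum of outer products v_i phi_i."""
    m = [[0] * n for _ in range(n)]
    for phi, v in terms:
        for r in range(n):
            for c in range(n):
                m[r][c] += v[r] * phi[c]
    return m

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def trace(m):
    """Sum of the diagonal entries."""
    return sum(m[i][i] for i in range(len(m)))

def trace_of_composition(s_terms, t_terms):
    """sum_i sum_j psi_j(v_i) * phi_i(w_j), the formula derived in the text."""
    return sum(apply(psi, v) * apply(phi, w)
               for phi, v in s_terms
               for psi, w in t_terms)

# Illustrative data: S(u) = sum_i phi_i(u) v_i and T(u) = sum_j psi_j(u) w_j,
# stored as (functional, vector) pairs on a 3-dimensional space.
S_terms = [((1, 2, 0), (1, 0, 1)), ((0, 1, -1), (2, 1, 0))]
T_terms = [((3, 0, 1), (0, 1, 1)), ((1, 1, 1), (1, -1, 2)), ((0, 2, 0), (1, 0, 0))]

S = matrix_of(S_terms, 3)
T = matrix_of(T_terms, 3)

tr_ST = trace(matmul(S, T))        # trace from diagonal entries of S∘T
tr_TS = trace(matmul(T, S))        # trace from diagonal entries of T∘S
formula_ST = trace_of_composition(S_terms, T_terms)
formula_TS = trace_of_composition(T_terms, S_terms)

print(tr_ST == tr_TS == formula_ST == formula_TS)
```

Integer coordinates are used so that the four quantities can be compared exactly; with floating-point data a tolerance would be needed instead.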

The above proof can be regarded as being based upon tensor products, given that the fundamental identity of {{math|End(''V'')}} with {{math|''V'' ⊗ ''V''<sup>∗</sup>}} is equivalent to the expressibility of any linear map as the sum of rank-one linear maps. As such, the proof may be written in the notation of tensor products. Then one may consider the multilinear map {{math|''V'' × ''V''<sup>∗</sup> × ''V'' × ''V''<sup>∗</sup> → ''V'' ⊗ ''V''<sup>∗</sup>}} given by sending {{math|(''v'', ''&phi;'', ''w'', ''&psi;'')}} to  {{math|''&phi;''(''w'')''v'' ⊗ ''&psi;''}}. Further composition with the trace map then results in {{math|''&phi;''(''w'')''&psi;''(''v'')}}, and this is unchanged if one were to have started with {{math|(''w'', ''&psi;'', ''v'', ''&phi;'')}} instead. One may also consider the bilinear map {{math|End(''V'') × End(''V'') → End(''V'')}} given by sending {{math|(''f'', ''g'')}} to the composition {{math|''f'' ∘ ''g''}}, which is then induced by a linear map {{math|End(''V'') ⊗ End(''V'') → End(''V'')}}. It can be seen that this coincides with the linear map {{math|''V'' ⊗ ''V''<sup>∗</sup> ⊗ ''V'' ⊗ ''V''<sup>∗</sup> → ''V'' ⊗ ''V''<sup>∗</sup>}}. The established symmetry upon composition with the trace map then establishes the equality of the two traces.<ref name="kassel" />

For any finite-dimensional vector space {{mvar|V}}, there is a natural linear map {{math|''F'' → ''V'' ⊗ ''V''<sup>∗</sup>}}; in the language of linear maps, it assigns to a scalar {{mvar|c}} the linear map {{math|''c''⋅id<sub>''V''</sub>}}. This map is sometimes called the ''coevaluation map'', and the trace {{math|''V'' ⊗ ''V''<sup>∗</sup> → ''F''}} is called the ''evaluation map''.<ref name="kassel" /> These structures can be axiomatized to define [[categorical trace]]s in the abstract setting of [[category theory]].
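
As a worked illustration, assume a basis {{math|''e''<sub>1</sub>, ..., ''e''<sub>''n''</sub>}} of {{mvar|V}} with dual basis {{math|''e''<sup>1</sup>, ..., ''e''<sup>''n''</sup>}} (notation chosen here for the example). The identity map corresponds to the tensor {{math|Σ ''e''<sub>''i''</sub> ⊗ ''e''<sup>''i''</sup>}}, so the coevaluation map is given by
:<math>c \mapsto c \sum_{i=1}^n e_i \otimes e^i .</math>
Composing it with the evaluation map sends {{mvar|c}} to
:<math>c \sum_{i=1}^n e^i(e_i) = c\,n,</math>
so that the composite {{math|''F'' → ''F''}} is multiplication by {{math|''n'' {{=}} dim ''V''}}, viewed as an element of {{mvar|F}}. This recovers the fact that the trace of the identity map is the dimension of the space.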

== See also ==
* [[Scalar curvature#Definition|Trace of a tensor with respect to a metric tensor]]
* [[Characteristic function (probability theory)#Matrix-valued random variables|Characteristic function]]
* [[Field trace]]
* [[Golden–Thompson inequality]]
* [[Singular trace]]
* [[Specht's theorem]]
* [[Trace class]]
* [[Trace identity]]
* [[Trace inequalities]]
* [[von Neumann's trace inequality]]

== Notes ==
{{reflist|group=note}}

== References ==
{{reflist|25em}}

{{refbegin|25em|small=yes}}

* {{cite book
 |last = Gantmacher  |first = F.R.  |author-link = Felix Gantmacher
 |year = 1959
 |title=The Theory of Matrices
 |translator-first = K.A. |translator-last = Hirsch |translator-link = Kurt Hirsch
 |location = New York, NY
 |publisher = [[Chelsea Publishing Company]]
 |mr=0107649
}}

* {{cite book
 |mr=2978290
 |last1 = Horn    |first1 = R.A.  |author1-link = Roger Horn
 |last2 = Johnson |first2 = C.R.  |author2-link = Charles Royal Johnson
 |year=2013  |orig-date = 1985
 |title=Matrix Analysis |edition=2nd
 |publisher = [[Cambridge University Press]]
 |location = Cambridge, UK
 |isbn = 978-0-521-54823-6
}}

* {{cite book
 |last = Strang  |first = G.  |author-link = Gilbert Strang
 |year = 2004  |orig-date = 1976  |edition=4th
 |title = Linear Algebra and its Applications
 |publisher = [[Cengage Learning]]
 |isbn = 978-0-03-010567-8
}}

{{refend}}

== External links ==
* {{springer|title=Trace of a square matrix|id=p/t093550}}

{{DEFAULTSORT:Trace (Linear Algebra)}}
[[Category:Linear algebra]]
[[Category:Matrix theory]]
[[Category:Trace theory]]