Min-max theorem


In linear algebra and functional analysis, the min-max theorem, or variational theorem, or Courant–Fischer–Weyl min-max principle, is a result that gives a variational characterization of eigenvalues of compact Hermitian operators on Hilbert spaces. It can be viewed as the starting point of many results of similar nature.

This article first discusses the finite-dimensional case and its applications before considering compact operators on infinite-dimensional Hilbert spaces. For compact operators, the proof of the main theorem uses essentially the same idea as the finite-dimensional argument.

In the case that the operator is non-Hermitian, the theorem provides an equivalent characterization of the associated singular values. The min-max theorem can be extended to self-adjoint operators that are bounded below.

Matrices

Let A be an n × n Hermitian matrix. As with many other variational results on eigenvalues, one considers the Rayleigh–Ritz quotient <math>R_A : \mathbb{C}^n \setminus \{0\} \to \mathbb{R}</math> defined by

<math>R_A(x) = \frac{(Ax, x)}{(x, x)},</math>

where <math>(\cdot, \cdot)</math> denotes the Euclidean inner product on <math>\mathbb{C}^n</math>. Equivalently, the Rayleigh–Ritz quotient can be replaced by

<math>f(x) = (Ax, x), \quad \|x\| = 1.</math>

The Rayleigh quotient of an eigenvector <math>v</math> is its associated eigenvalue <math>\lambda</math> because <math>R_A(v) = (\lambda v, v)/(v, v) = \lambda</math>. For a Hermitian matrix A, the range of the continuous functions R_A(x) and f(x) is a compact interval [a, b] of the real line. The maximum b and the minimum a are the largest and smallest eigenvalues of A, respectively. The min-max theorem is a refinement of this fact.
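The claim about the range of the Rayleigh quotient can be checked numerically. The following is a minimal sketch, not part of the source: the 2 × 2 real symmetric matrix is an assumption chosen because its eigenvalues (1 and 3) are easy to verify by hand, and only real unit vectors on a finite grid are sampled.

```python
import math

def rayleigh_quotient(A, x):
    """Return (Ax, x) / (x, x) for a real symmetric matrix A given as nested lists."""
    Ax = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]
    return sum(p * q for p, q in zip(Ax, x)) / sum(q * q for q in x)

# Illustrative matrix (an assumption, not from the article); its eigenvalues are 1 and 3.
A = [[2.0, 1.0], [1.0, 2.0]]

# Sample the quotient on the real unit circle x = (cos t, sin t).
samples = [rayleigh_quotient(A, [math.cos(t), math.sin(t)])
           for t in (2 * math.pi * i / 3600 for i in range(3600))]

r_min, r_max = min(samples), max(samples)
print(r_min, r_max)  # close to the smallest and largest eigenvalue

# An eigenvector returns its eigenvalue: (1, 1) is an eigenvector for 3.
print(rayleigh_quotient(A, [1.0, 1.0]))
```

On this grid the sampled minimum and maximum agree with the extreme eigenvalues up to rounding, which is the fact the min-max theorem refines.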

Min-max theorem

Let <math display="inline">A</math> be Hermitian on an inner product space <math display="inline">V</math> with dimension <math display="inline">n</math>, with spectrum ordered in descending order <math display="inline">\lambda_1 \geq ... \geq \lambda_n</math>.

Let <math display="inline">v_1, ..., v_n</math> be the corresponding unit-length orthogonal eigenvectors.

Reverse the spectrum ordering, so that <math display="inline">\xi_1 = \lambda_n, ..., \xi_n = \lambda_1</math>.

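Theorem (Poincaré's inequality). Let <math display="inline">M</math> be a subspace of <math display="inline">V</math> with dimension <math display="inline">k</math>. Then there exist unit vectors <math display="inline">x, y \in M</math> such that <math display="inline">\langle x, Ax\rangle \leq \lambda_k</math> and <math display="inline">\langle y, Ay\rangle \geq \xi_k</math>.

Proof. The subspace <math display="inline">N = \operatorname{span}(v_k, \dots, v_n)</math> has dimension <math display="inline">n-k+1</math>, so <math display="inline">M \cap N</math> contains a unit vector <math display="inline">x = \sum_{i \geq k} a_i v_i</math>, and <math display="inline">\langle x, Ax\rangle = \sum_{i \geq k} \lambda_i |a_i|^2 \leq \lambda_k</math>. The statement for <math display="inline">y</math> follows by applying the same argument to <math display="inline">-A</math>.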

Theorem (Min-max). <math display="block">\begin{aligned} \lambda_k &= \max_{\substack{\mathcal{M} \subset V \\ \operatorname{dim}(\mathcal{M})=k}} \ \min_{\substack{x \in \mathcal{M} \\ \|x\|=1}} \langle x, A x\rangle \\ &= \min_{\substack{\mathcal{M} \subset V \\ \operatorname{dim}(\mathcal{M})=n-k+1}} \ \max_{\substack{x \in \mathcal{M} \\ \|x\|=1}} \langle x, A x\rangle \text{.} \end{aligned}</math>
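As a hedged illustration of the min-max characterization (the matrix is chosen for this example and is not from the source), take <math display="inline">A = \operatorname{diag}(3, 2, 1)</math> on <math display="inline">V = \mathbb{C}^3</math> with standard basis <math display="inline">e_1, e_2, e_3</math>, and <math display="inline">k = 2</math>, so that <math display="inline">\lambda_2 = 2</math>. For <math display="inline">\mathcal{M} = \operatorname{span}(e_1, e_2)</math>,

<math display="block">\min_{\substack{x \in \mathcal{M} \\ \|x\|=1}} \langle x, Ax \rangle = \min_{|a|^2 + |b|^2 = 1} \left( 3|a|^2 + 2|b|^2 \right) = 2.</math>

Any other two-dimensional subspace meets <math display="inline">\operatorname{span}(e_2, e_3)</math> in a nonzero vector, since <math display="inline">2 + 2 > 3</math>, and on <math display="inline">\operatorname{span}(e_2, e_3)</math> the quadratic form is at most 2. So no two-dimensional subspace has an inner minimum larger than 2, and the outer maximum equals <math display="inline">\lambda_2 = 2</math>. The same count with <math display="inline">n - k + 1 = 2</math> and <math display="inline">\mathcal{M} = \operatorname{span}(e_2, e_3)</math> gives the min-max side.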


For a subspace <math display="inline">V</math>, define the partial trace <math display="inline">\operatorname{tr}_V(A)</math> to be the trace of the projection of <math display="inline">A</math> to <math display="inline">V</math>. It is equal to <math display="inline">\sum_i u_i^*Au_i</math> for any orthonormal basis <math display="inline">(u_i)</math> of <math display="inline">V</math>.

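Theorem (Wielandt minimax formula). Let <math display="inline">1 \leq i_1 < \cdots < i_k \leq n</math> be integers. Define a partial flag to be a nested collection <math display="inline">V_1 \subset \cdots \subset V_k</math> of subspaces of <math display="inline">\mathbb{C}^n</math> such that <math display="inline">\dim(V_j) = i_j</math> for all <math display="inline">1 \leq j \leq k</math>. Define the associated Schubert variety <math display="inline">X(V_1, \dots, V_k)</math> to be the collection of all <math display="inline">k</math>-dimensional subspaces <math display="inline">W</math> such that <math display="inline">\dim(W \cap V_j) \geq j</math> for all <math display="inline">j</math>. Then <math display="block">\lambda_{i_1}(A) + \cdots + \lambda_{i_k}(A) = \sup_{V_1, \dots, V_k} \ \inf_{W \in X(V_1, \dots, V_k)} \operatorname{tr}_W(A).</math>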


Since <math display="inline">\dim(E) \leq n-1</math>, we may apply the induction hypothesis: there exists some <math display="inline">W \in X(W_1, \dots, W_k)</math> such that <math display="block">\lambda_{i_1 - (i_1-1)}(A|E)+\cdots+\lambda_{i_k- (i_1-1)}(A|E) \geq \operatorname{tr}_W(A).</math> Now <math display="inline">\lambda_{i_j - (i_1-1)}(A|E)</math> is the <math display="inline">(i_j-(i_1-1))</math>-th eigenvalue of <math display="inline">A</math> orthogonally projected down to <math display="inline">E</math>. By the Cauchy interlacing theorem, <math display="inline">\lambda_{i_j - (i_1-1)}(A|E) \leq \lambda_{i_j}(A)</math>. Since <math display="inline">X(W_1, \dots, W_k)\subset X(V_1, \dots, V_k)</math>, we are done.

If <math display="inline">i_1 = 1</math>, then we perform a similar construction. Let <math display="inline">E = \operatorname{span}(e_{2}, \dots, e_n)</math>. If <math display="inline">V_k \subset E</math>, then we can induct. Otherwise, we construct a partial flag sequence <math display="inline">W_2, \dots, W_k</math>. By induction, there exists some <math display="inline">W' \in X(W_2, \dots, W_k)\subset X(V_2, \dots, V_k)</math> such that <math display="block">\lambda_{i_2-1}(A|E)+\cdots+\lambda_{i_k-1}(A|E) \geq \operatorname{tr}_{W'}(A),</math> and thus
<math display="block">\lambda_{i_2}(A)+\cdots+\lambda_{i_k}(A) \geq \operatorname{tr}_{W'}(A).</math> It remains to find some <math display="inline">v</math> such that <math display="inline">W' \oplus v \in X(V_1, \dots, V_k)</math>.

If <math display="inline">V_1 \not\subset W'</math>, then any <math display="inline">v \in V_1 \setminus W'</math> would work. Otherwise, if <math display="inline">V_2 \not\subset W'</math>, then any <math display="inline">v \in V_2 \setminus W'</math> would work, and so on. If none of these work, then <math display="inline">V_k \subset E</math>, a contradiction.

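This has some corollaries:<ref name=":0" />

Corollary (Extremal partial trace). <math display="block">\lambda_1(A) + \cdots + \lambda_k(A) = \sup_{\dim(V) = k} \operatorname{tr}_V(A), \qquad \xi_1(A) + \cdots + \xi_k(A) = \inf_{\dim(V) = k} \operatorname{tr}_V(A).</math>

Corollary. The sum <math display="inline">\lambda_1(A) + \cdots + \lambda_k(A)</math> is a convex function of <math display="inline">A</math>, and <math display="inline">\xi_1(A) + \cdots + \xi_k(A)</math> is concave, because each is a supremum (respectively infimum) of the linear functions <math display="inline">A \mapsto \operatorname{tr}_V(A)</math>.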

Counterexample in the non-Hermitian case

Let N be the nilpotent matrix

<math>\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}.</math>

Define the Rayleigh quotient <math> R_N(x) </math> exactly as above in the Hermitian case. Then it is easy to see that the only eigenvalue of N is zero, while the maximum value of the Rayleigh quotient is <math>\tfrac{1}{2}</math>. That is, the maximum value of the Rayleigh quotient is larger than the maximum eigenvalue.
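The counterexample can be reproduced numerically. This is a small sketch under stated assumptions: only real unit vectors are sampled (for complex vectors the quotient of a non-Hermitian matrix need not be real), and the eigenvalues are read off from the characteristic polynomial of the 2 × 2 matrix.

```python
import math

# The nilpotent matrix from the text.
N = [[0.0, 1.0], [0.0, 0.0]]

def rayleigh_quotient(N, x):
    """Return (Nx, x) / (x, x) for a real matrix N and a real vector x."""
    Nx = [sum(n_ij * x_j for n_ij, x_j in zip(row, x)) for row in N]
    return sum(p * q for p, q in zip(Nx, x)) / sum(q * q for q in x)

# Eigenvalues of a 2 x 2 matrix from its trace and determinant.
trace = N[0][0] + N[1][1]
det = N[0][0] * N[1][1] - N[0][1] * N[1][0]
disc = math.sqrt(trace * trace - 4 * det)
eigenvalues = [(trace - disc) / 2, (trace + disc) / 2]

# Sample the quotient on the real unit circle x = (cos t, sin t).
samples = [rayleigh_quotient(N, [math.cos(t), math.sin(t)])
           for t in (2 * math.pi * i / 3600 for i in range(3600))]
r_max = max(samples)

print(eigenvalues, r_max)
```

Here the quotient equals cos t · sin t, whose maximum is attained at x = (1, 1)/√2, so the sampled maximum exceeds every eigenvalue of N.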

Applications

Min-max principle for singular values

The singular values {σ_k} of a square matrix M are the square roots of the eigenvalues of M*M (equivalently MM*). An immediate consequence of the first equality in the min-max theorem is:

<math>\sigma_k^{\downarrow} = \max_{S:\dim(S)=k} \min_{x \in S, \|x\| = 1} (M^* Mx, x)^{\frac{1}{2}}=\max_{S:\dim(S)=k} \min_{x \in S, \|x\| = 1} \| Mx \|.</math>

Similarly,

<math>\sigma_k^{\downarrow} = \min_{S:\dim(S)=n-k+1} \max_{x \in S, \|x\| = 1} \| Mx \|.</math>

Here <math>\sigma_k^{\downarrow}</math> denotes the k-th entry in the decreasing sequence of the singular values, so that <math>\sigma_1^{\downarrow} \geq \sigma_2^{\downarrow} \geq \cdots </math>.
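For a 2 × 2 matrix the two formulas reduce to statements about the unit circle: the largest singular value is the maximum of ‖Mx‖ over unit vectors, and the smallest is the minimum. The sketch below checks this on an illustrative real matrix (an assumption, not from the source), sampling real unit vectors only and computing the singular values from the eigenvalues of MᵀM in closed form.

```python
import math

# Illustrative matrix (an assumption made for this sketch).
M = [[3.0, 0.0], [4.0, 5.0]]

def norm_Mx(x):
    """Euclidean norm of M x for a vector x given as a list of two floats."""
    Mx = [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]
    return math.sqrt(sum(c * c for c in Mx))

# Singular values via the eigenvalues of the symmetric matrix G = M^T M.
G = [[sum(M[r][i] * M[r][j] for r in range(2)) for j in range(2)] for i in range(2)]
mean = (G[0][0] + G[1][1]) / 2
radius = math.sqrt(((G[0][0] - G[1][1]) / 2) ** 2 + G[0][1] ** 2)
sigma_1, sigma_2 = math.sqrt(mean + radius), math.sqrt(mean - radius)

# Sample ||Mx|| on the real unit circle.
norms = [norm_Mx([math.cos(t), math.sin(t)])
         for t in (2 * math.pi * i / 3600 for i in range(3600))]

# k = 1: best one-dimensional subspace; k = 2: the only subspace is the whole plane.
print(sigma_1, max(norms))
print(sigma_2, min(norms))
```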

Cauchy interlacing theorem

Let A be a symmetric n × n matrix. The m × m matrix B, where m ≤ n, is called a compression of A if there exists an orthogonal projection P onto a subspace of dimension m such that PAP* = B. The Cauchy interlacing theorem states:

Theorem. If the eigenvalues of A are <math>\alpha_1 \leq \cdots \leq \alpha_n</math>, and those of B are <math>\beta_1 \leq \cdots \leq \beta_j \leq \cdots \leq \beta_m</math>, then for all <math>j \leq m</math>,

<math>\alpha_j \leq \beta_j \leq \alpha_{n-m+j}.</math>

This can be proven using the min-max principle. Let β_i have corresponding eigenvector b_i and let S_j be the j-dimensional subspace <math>S_j = \operatorname{span}\{b_1, \ldots, b_j\}</math>. Then

<math>\beta_j = \max_{x \in S_j, \|x\| = 1} (Bx, x) = \max_{x \in S_j, \|x\| = 1} (PAP^*x, x) \geq \min_{S_j} \max_{x \in S_j, \|x\| = 1} (A(P^*x), P^*x) = \alpha_j.</math>

According to the first part of min-max, <math>\alpha_j \leq \beta_j</math>. On the other hand, if we define <math>S_{m-j+1} = \operatorname{span}\{b_j, \ldots, b_m\}</math>, then

<math>\beta_j = \min_{x \in S_{m-j+1}, \|x\| = 1} (Bx, x) = \min_{x \in S_{m-j+1}, \|x\| = 1} (PAP^*x, x)= \min_{x \in S_{m-j+1}, \|x\| = 1} (A(P^*x), P^*x) \leq \alpha_{n-m+j},</math>

where the last inequality is given by the second part of min-max.

When <math>n - m = 1</math>, we have <math>\alpha_j \leq \beta_j \leq \alpha_{j+1}</math>, hence the name interlacing theorem.
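The interlacing inequalities can be checked on a small case. The sketch below assumes an illustrative 3 × 3 symmetric tridiagonal matrix (not from the source) whose eigenvalues 2 − √2, 2, 2 + √2 are confirmed by evaluating the characteristic polynomial, and takes B to be the compression onto the span of the first two coordinate vectors, that is, the leading 2 × 2 block.

```python
import math

def det3(m):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def sym2_eigenvalues(s):
    """Eigenvalues of a real symmetric 2 x 2 matrix, in increasing order."""
    mean = (s[0][0] + s[1][1]) / 2
    radius = math.sqrt(((s[0][0] - s[1][1]) / 2) ** 2 + s[0][1] ** 2)
    return [mean - radius, mean + radius]

# Illustrative matrix (an assumption made for this sketch).
A = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]
alpha = [2 - math.sqrt(2), 2.0, 2 + math.sqrt(2)]

# Sanity check: each alpha is a root of det(A - t I).
residuals = [det3([[A[r][c] - (t if r == c else 0.0) for c in range(3)]
                   for r in range(3)]) for t in alpha]

# Compression of A onto span(e1, e2): the leading 2 x 2 block.
B = [row[:2] for row in A[:2]]
beta = sym2_eigenvalues(B)

n, m = 3, 2
interlaced = all(alpha[j] <= beta[j] <= alpha[n - m + j] for j in range(m))
print(alpha, beta, interlaced)
```

With zero-based indices in the code, `alpha[j] <= beta[j] <= alpha[n - m + j]` is the theorem's inequality for the (j + 1)-th eigenvalues.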

Lidskii's inequality

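Theorem (Lidskii inequality). Let A and B be n × n Hermitian matrices, with eigenvalues listed in decreasing order. If <math>1 \leq i_1 < \cdots < i_k \leq n</math>, then

<math>\lambda_{i_1}(A+B) + \cdots + \lambda_{i_k}(A+B) \leq \lambda_{i_1}(A) + \cdots + \lambda_{i_k}(A) + \lambda_1(B) + \cdots + \lambda_k(B).</math>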


Note that <math>\sum_i \lambda_i(A+B) = \operatorname{tr}(A+B) = \sum_i \left( \lambda_i(A) + \lambda_i(B) \right)</math>. In other words, <math>\lambda(A+B) - \lambda(A) \preceq \lambda(B)</math>, where <math>\preceq</math> means majorization. By the Schur convexity theorem, we then have

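Theorem (p-Wielandt–Hoffman inequality). <math>\|\lambda(A+B) - \lambda(A)\|_{\ell^p} \leq \|B\|_{S^p}</math>, where <math>\|\cdot\|_{S^p}</math> denotes the Schatten p-norm.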
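The trace identity and the majorization relation stated above can be verified on a 2 × 2 example. This is a sketch with illustrative matrices (assumptions, chosen so that A and B do not commute); eigenvalues of real symmetric 2 × 2 matrices are computed in closed form and listed in decreasing order.

```python
import math

def sym2_eigenvalues_desc(s):
    """Eigenvalues of a real symmetric 2 x 2 matrix, in decreasing order."""
    mean = (s[0][0] + s[1][1]) / 2
    radius = math.sqrt(((s[0][0] - s[1][1]) / 2) ** 2 + s[0][1] ** 2)
    return [mean + radius, mean - radius]

# Illustrative matrices (assumptions made for this sketch).
A = [[2.0, 0.0], [0.0, 1.0]]
B = [[0.0, 1.0], [1.0, 0.0]]
S = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

lam_A = sym2_eigenvalues_desc(A)   # [2, 1]
lam_B = sym2_eigenvalues_desc(B)   # [1, -1]
lam_S = sym2_eigenvalues_desc(S)   # eigenvalues of A + B

# Trace identity: the eigenvalue sums agree.
trace_gap = sum(lam_S) - (sum(lam_A) + sum(lam_B))

# Majorization: partial sums of the decreasingly sorted difference vector
# are bounded by the partial sums of lam_B (equal for the full sum).
diff = sorted((s - a for s, a in zip(lam_S, lam_A)), reverse=True)
tol = 1e-12
majorized = all(sum(diff[:k]) <= sum(lam_B[:k]) + tol for k in (1, 2))

print(diff, lam_B, trace_gap, majorized)
```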

Compact operators

Let A be a compact, Hermitian operator on a Hilbert space H. Recall that the spectrum of such an operator consists of real numbers, that every nonzero point of the spectrum is an eigenvalue of finite multiplicity, and that the only possible cluster point is zero. It is thus convenient to list the positive eigenvalues of A as

<math>\cdots \le \lambda_k \le \cdots \le \lambda_1,</math>

where entries are repeated with multiplicity, as in the matrix case. (To emphasize that the sequence is decreasing, we may write <math>\lambda_k = \lambda_k^\downarrow</math>.) When H is infinite-dimensional, the above sequence of eigenvalues may be infinite. We now apply the same reasoning as in the matrix case. Letting S_k ⊂ H be a k-dimensional subspace, we obtain the following theorem.

Theorem (Min-Max). Let A be a compact, self-adjoint operator on a Hilbert space H, whose positive eigenvalues are listed in decreasing order <math>\cdots \le \lambda_k \le \cdots \le \lambda_1</math>. Then:

<math>\begin{align}
\max_{S_k} \min_{x \in S_k, \|x\| = 1} (Ax,x) &= \lambda_k ^{\downarrow}, \\
\min_{S_{k-1}} \max_{x \in S_{k-1}^{\perp}, \|x\|=1} (Ax, x) &= \lambda_k^{\downarrow}.
\end{align}</math>

A similar pair of equalities holds for negative eigenvalues.
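As an illustration (the operator is chosen for this example and is not from the source), let A act on <math>\ell^2(\mathbb{N})</math> by <math>(Ax)_j = x_j / j</math>. It is compact and self-adjoint, its eigenvalues are <math>\lambda_k = 1/k</math> with the standard basis vectors <math>e_k</math> as eigenvectors, and

<math>(Ax, x) = \sum_{j \ge 1} \frac{|x_j|^2}{j}.</math>

Taking <math>S_k = \operatorname{span}\{e_1, \ldots, e_k\}</math> gives <math>\min_{x \in S_k, \|x\| = 1} (Ax, x) = 1/k</math>, and taking <math>S_{k-1} = \operatorname{span}\{e_1, \ldots, e_{k-1}\}</math> gives <math>\max_{x \in S_{k-1}^{\perp}, \|x\| = 1} (Ax, x) = 1/k</math>, so both extremal values in the theorem are attained on spans of eigenvectors.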

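Proof sketch (second equality). Let <math>u_1, u_2, \ldots</math> be orthonormal eigenvectors for <math>\lambda_1, \lambda_2, \ldots</math>. For any (k − 1)-dimensional subspace S_k−1, the k-dimensional space span{u₁, ..., u_k} intersects the orthogonal complement of S_k−1 nontrivially, so that complement contains a unit vector x with (Ax, x) ≥ λ_k. Since A is compact, the maximum below is attained, and

<math>\max_{x \in S_{k-1}^{\perp}, \|x\|=1} (Ax, x) \ge \lambda_k.</math>

Pick S_k−1 = span{u₁, ..., u_k−1}, for which equality holds, and we deduce

<math>\min_{S_{k-1}} \max_{x \in S_{k-1}^{\perp}, \|x\|=1} (Ax, x) = \lambda_k.</math>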

Self-adjoint operators

The min-max theorem also applies to (possibly unbounded) self-adjoint operators.<ref name="teschl">G. Teschl, Mathematical Methods in Quantum Mechanics (GSM 99) https://www.mat.univie.ac.at/~gerald/ftp/book-schroe/schroe.pdf</ref><ref name="lieb-loss">Template:Cite book</ref> Recall that the essential spectrum is the spectrum with the isolated eigenvalues of finite multiplicity removed. When there are eigenvalues below the essential spectrum, the theorem can be used to approximate these eigenvalues and their eigenfunctions.

Theorem (Min-Max). Let A be self-adjoint, and let <math>E_1\le E_2\le E_3\le\cdots</math> be the eigenvalues of A below the essential spectrum. Then

<math>E_n=\min_{\psi_1,\ldots,\psi_{n}}\max\{\langle\psi,A\psi\rangle:\psi\in\operatorname{span}(\psi_1,\ldots,\psi_{n}), \, \| \psi \| = 1\}</math>.

If we only have N eigenvalues and hence run out of eigenvalues, then we let <math>E_n:=\inf\sigma_{ess}(A)</math> (the bottom of the essential spectrum) for n>N, and the above statement holds after replacing min-max with inf-sup.

Theorem (Max-Min). Let A be self-adjoint, and let <math>E_1\le E_2\le E_3\le\cdots</math> be the eigenvalues of A below the essential spectrum. Then

<math>E_n=\max_{\psi_1,\ldots,\psi_{n-1}}\min\{\langle\psi,A\psi\rangle:\psi\perp\psi_1,\ldots,\psi_{n-1}, \, \| \psi \| = 1\}</math>.

If we only have N eigenvalues and hence run out of eigenvalues, then we let <math>E_n:=\inf\sigma_{ess}(A)</math> (the bottom of the essential spectrum) for n > N, and the above statement holds after replacing max-min with sup-inf.
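One consequence of the min-max statement is worth spelling out. It is sketched here under the theorem's hypotheses, assuming the trial vectors <math>\psi_1, \ldots, \psi_n</math> are orthonormal and lie in the domain of A; the matrix H and the numbers <math>\mu_j</math> are notation introduced only for this illustration.

<math>H_{ij} = \langle \psi_i, A \psi_j \rangle, \qquad \mu_1 \le \cdots \le \mu_n \ \text{the eigenvalues of } H, \qquad E_j \le \mu_j \quad (1 \le j \le n).</math>

Indeed, by the finite-dimensional theorem <math>\mu_j</math> is the minimum, over j-dimensional subspaces of <math>\operatorname{span}(\psi_1, \ldots, \psi_n)</math>, of the maximum of <math>\langle \psi, A \psi \rangle</math> over unit vectors, whereas <math>E_j</math> is the corresponding minimum (or infimum) over all j-dimensional subspaces. For n = 1 this is the bound <math>E_1 \le \langle \psi, A \psi \rangle</math> for every unit vector <math>\psi</math> in the domain.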

The proofs<ref name="teschl"/><ref name="lieb-loss"/> use the following results about self-adjoint operators:

Theorem. Let A be self-adjoint. Then <math>(A-E)\ge0</math> for <math>E\in\mathbb{R}</math> if and only if <math>\sigma(A)\subseteq[E,\infty)</math>.<ref name="teschl"/>

Theorem. If A is self-adjoint, then

<math>\inf\sigma(A)=\inf_{\psi\in\mathfrak{D}(A),\|\psi\|=1}\langle\psi,A\psi\rangle</math>

and

<math>\sup\sigma(A)=\sup_{\psi\in\mathfrak{D}(A),\|\psi\|=1}\langle\psi,A\psi\rangle</math>.<ref name="teschl"/>

References

Template:Reflist
