{{short description|Variational characterization of eigenvalues of compact Hermitian operators on Hilbert spaces}}
{{distinguish|Minimax theorem}}
{{redirect-distinguish|Variational theorem|variational principle}}
{{More citations needed|date=November 2011}}

In [[linear algebra]] and [[functional analysis]], the '''min-max theorem''', or '''variational theorem''', or '''Courant&ndash;Fischer&ndash;Weyl min-max principle''', is a result that gives a variational characterization of [[Eigenvalues and eigenvectors|eigenvalues]] of [[Compact operator on Hilbert space|compact]] Hermitian operators on [[Hilbert spaces]]. It can be viewed as the starting point of many results of similar nature.

This article first discusses the finite-dimensional case and its applications before considering compact operators on infinite-dimensional Hilbert spaces. 
We will see that for compact operators, the proof of the main theorem uses essentially the same idea from the finite-dimensional argument.

In the case that the operator is non-Hermitian, the theorem provides an equivalent characterization of the associated [[singular values]]. 
The min-max theorem can be extended to [[self-adjoint operator]]s that are bounded below.

== Matrices ==

Let {{mvar|A}} be an {{math|''n'' × ''n''}} [[Hermitian matrix]]. As with many other variational results on eigenvalues, one considers the [[Rayleigh quotient|Rayleigh&ndash;Ritz quotient]] {{math|''R<sub>A</sub>'' : '''C'''<sup>''n''</sup> \ {0} → '''R'''}} defined by

:<math>R_A(x) = \frac{(Ax, x)}{(x,x)}</math>

where {{math|(⋅, ⋅)}} denotes the [[dot product|Euclidean inner product]] on {{math|'''C'''<sup>''n''</sup>}}. 
Equivalently, the Rayleigh&ndash;Ritz quotient can be replaced by

:<math>f(x) = (Ax, x), \; \|x\| = 1.</math>

The Rayleigh quotient of an eigenvector <math>v</math> is its associated eigenvalue <math>\lambda</math> because <math>R_A(v) = (\lambda v, v)/(v, v) = \lambda</math>. 
For a Hermitian matrix ''A'', the range of the continuous functions ''R<sub>A</sub>''(''x'') and ''f''(''x'') is a compact interval [''a'', ''b''] of the real line. The maximum ''b'' and the minimum ''a'' are the largest and smallest eigenvalues of ''A'', respectively. The min-max theorem is a refinement of this fact.
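
As an illustration (the matrix below is chosen here as an example and is not taken from a cited source), take

:<math>A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}.</math>

For a unit vector <math>x = (x_1, x_2)</math>,

:<math>f(x) = (Ax, x) = 2|x_1|^2 + 2|x_2|^2 + 2\operatorname{Re}(x_1 \overline{x_2}) = 2 + 2\operatorname{Re}(x_1 \overline{x_2}).</math>

Since <math>|2\operatorname{Re}(x_1 \overline{x_2})| \le 2|x_1||x_2| \le |x_1|^2 + |x_2|^2 = 1</math>, the range of ''f'' is contained in [1, 3]. The endpoints are attained at <math>x = (1, -1)/\sqrt{2}</math> and <math>x = (1, 1)/\sqrt{2}</math>, which are eigenvectors of ''A'' for the eigenvalues 1 and 3.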

=== Min-max theorem ===

Let <math display="inline">A</math> be Hermitian on an inner product space <math display="inline">V</math> with dimension <math display="inline">n</math>, with spectrum ordered in descending order <math display="inline">\lambda_1 \geq ... \geq \lambda_n</math>.

Let <math display="inline">v_1, ..., v_n</math> be the corresponding unit-length orthogonal eigenvectors.

Reverse the spectrum ordering, so that <math display="inline">\xi_1 = \lambda_n, ..., \xi_n = \lambda_1</math>.

{{Math theorem
| name = (Poincaré’s inequality)
| note = 
| math_statement = Let <math display="inline">M</math> be a subspace of <math display="inline">V</math> with dimension <math display="inline">k</math>. Then there exist unit vectors <math display="inline">x, y\in M</math> such that

<math display="inline">\langle x, Ax\rangle\leq \lambda_k</math>, and <math display="inline">\langle y, Ay\rangle \geq \xi_k</math>.
}}

{{Math proof|title=Proof|proof=

The second inequality follows from the first applied to <math display="inline">-A</math>, so it suffices to prove the first.

Let <math display="inline">N := \operatorname{span}(v_k, \ldots, v_n)</math>, which has dimension <math display="inline">n-k+1</math>. Since <math display="inline">M</math> has dimension <math display="inline">k</math> and <math display="inline">k + (n-k+1) > n</math>, the subspaces <math display="inline">M</math> and <math display="inline">N</math> intersect in at least a line.

Take a unit vector <math display="inline">x \in M\cap N</math>; this is the vector we need.

: <math display="inline">x = \sum_{i=k}^n a_i v_i</math>, since <math display="inline">x\in N</math>.

: Since <math display="inline">\sum_{i=k}^n |a_i|^2 = 1</math>, we find <math display="inline">\langle x,Ax \rangle = \sum_{i=k}^n |a_i|^2\lambda_i \leq \lambda_k</math>.
}}

{{Math theorem
| name = min-max theorem
| note = 
| math_statement = <math display="block">\begin{aligned}
\lambda_k &=\max _{\begin{array}{c} \mathcal{M} \subset V \\ \operatorname{dim}(\mathcal{M})=k \end{array}} \min _{\begin{array}{c} x \in \mathcal{M} \\ \|x\|=1 \end{array}}\langle x, A x\rangle\\
&=\min _{\begin{array}{c} \mathcal{M} \subset V \\ \operatorname{dim}(\mathcal{M})=n-k+1 \end{array}} \max _{\begin{array}{c} x \in \mathcal{M} \\ \|x\|=1 \end{array}}\langle x, A x\rangle \text{. }
\end{aligned}</math>
}}

{{Math proof|title=Proof|proof=

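The second equality follows from the first applied to <math display="inline">-A</math>.

By Poincaré’s inequality, every <math display="inline">k</math>-dimensional subspace <math display="inline">\mathcal M</math> contains a unit vector <math display="inline">x</math> with <math display="inline">\langle x, Ax\rangle \leq \lambda_k</math>, so <math display="inline">\lambda_k</math> is an upper bound for the right side.

Setting <math display="inline">\mathcal M = \operatorname{span}(v_1, \ldots, v_k)</math> achieves the upper bound, since every unit vector in this subspace satisfies <math display="inline">\langle x, Ax\rangle \geq \lambda_k</math>.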
}}
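
A small worked instance may help; the diagonal matrix here is an assumption chosen for illustration. Let <math>A = \operatorname{diag}(3, 2, 1)</math> with standard basis <math>e_1, e_2, e_3</math>, so that <math>\lambda_1 = 3, \lambda_2 = 2, \lambda_3 = 1</math>, and take <math>k = 2</math>. For <math>\mathcal M = \operatorname{span}(e_1, e_2)</math> and a unit vector <math>x = a e_1 + b e_2</math>,

:<math>\langle x, Ax\rangle = 3|a|^2 + 2|b|^2 \geq 2,</math>

with equality at <math>x = e_2</math>, so the inner minimum equals <math>\lambda_2 = 2</math>. Any other two-dimensional <math>\mathcal M</math> contains a unit vector <math>x = b e_2 + c e_3</math> in <math>\operatorname{span}(e_2, e_3)</math>, for which

:<math>\langle x, Ax\rangle = 2|b|^2 + |c|^2 \leq 2,</math>

so no choice of <math>\mathcal M</math> yields an inner minimum above 2.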

Define the [[partial trace]] <math display="inline">tr_V(A)</math> to be the trace of the orthogonal projection (compression) of <math display="inline">A</math> to a subspace <math display="inline">V</math>. It is equal to <math display="inline">\sum_i v_i^*Av_i</math> for any orthonormal basis <math display="inline">(v_i)</math> of <math display="inline">V</math>.

{{Math theorem|name=Wielandt minimax formula|note=<ref name=":0">{{Cite book |last=Tao |first=Terence |title=Topics in random matrix theory |date=2012 |publisher=American Mathematical Society |isbn=978-0-8218-7430-1 |series=Graduate studies in mathematics |location=Providence, R.I}}</ref>{{Pg|page=44}}|math_statement=

Let <math display="inline">1 \leq i_1<\cdots<i_k \leq n</math> be integers. Define a partial flag to be a nested collection <math display="inline">V_1 \subset \cdots \subset V_k</math> of subspaces of <math display="inline">\mathbb{C}^n</math> such that <math display="inline">\operatorname{dim}\left(V_j\right)=i_j</math> for all <math display="inline">1 \leq j \leq k</math>.

Define the associated Schubert variety <math display="inline">X\left(V_1, \ldots, V_k\right)</math> to be the collection of all <math display="inline">k</math> dimensional subspaces <math display="inline">W</math> such that <math display="inline">\operatorname{dim}\left(W \cap V_j\right) \geq j</math>.

<math display="block">
      \lambda_{i_1}(A)+\cdots+\lambda_{i_k}(A)=\sup _{V_1, \ldots, V_k} \inf_{W \in X\left(V_1, \ldots, V_k\right)} tr_W(A)
      </math>
}}

{{hidden begin|style=width:100%|ta1=center|border=1px #aaa solid|title=Proof}}

{{Math proof|title=Proof|proof= 

The <math display="inline">\leq</math> case.

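Let <math display="inline">e_1, \dots, e_n</math> be orthonormal eigenvectors for <math display="inline">\lambda_1(A) \geq \cdots \geq \lambda_n(A)</math>, let <math display="inline">V_{j} = span(e_1, \dots, e_{i_j})</math>, and take any <math display="inline">W \in X\left(V_1, \ldots, V_k\right)</math>. It remains to show that <math display="block">
              \lambda_{i_1}(A)+\cdots+\lambda_{i_k}(A) \leq  tr_W(A)
              </math>

To show this, we construct an orthonormal set of vectors <math display="inline">v_1, \dots, v_k</math> such that <math display="inline">v_j \in V_j \cap W</math>. These form an orthonormal basis of <math display="inline">W</math>, so <math display="inline">tr_W(A) = \sum_j \langle v_j, Av_j\rangle \geq \sum_j \lambda_{i_j}(A)</math>, because every unit vector in <math display="inline">V_j</math> has Rayleigh quotient at least <math display="inline">\lambda_{i_j}(A)</math>.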

Since <math display="inline">dim(V_1 \cap W) \geq 1</math>, we pick any unit <math display="inline">v_1 \in V_1 \cap W</math>. Next, since <math display="inline">dim(V_2 \cap W) \geq 2</math>, we pick any unit <math display="inline">v_2 \in (V_2 \cap W)</math> that is perpendicular to <math display="inline">v_1</math>, and so on.

The <math display="inline">\geq</math> case.

For any such sequence of subspaces <math display="inline">V_i</math>, we must find some <math display="inline">W \in X\left(V_1, \ldots, V_k\right)</math> such that <math display="block">\lambda_{i_1}(A)+\cdots+\lambda_{i_k}(A) \geq tr_W(A)
              </math>

Now we prove this by induction.

The <math display="inline">n=1</math> case is trivial. Assume now <math display="inline">n \geq 2</math>.

If <math display="inline">i_1 \geq 2</math>, then we can apply induction. Let <math display="inline">E = span(e_{i_1}, \dots, e_n)</math>. We construct a partial flag within <math display="inline">E</math> from the intersection of <math display="inline">E</math> with <math display="inline">V_1, \dots, V_k</math>.

We begin by picking an <math display="inline">(i_k-(i_1-1))</math>-dimensional subspace <math display="inline">W_k' \subset E \cap V_k</math>, which exists by counting dimensions, since <math display="inline">E</math> has codimension <math display="inline">(i_1-1)</math>.

Then we go down by one space, to pick an <math display="inline">(i_{k-1} - (i_1 - 1))</math>-dimensional subspace <math display="inline">W_{k-1}' \subset W_k' \cap V_{k-1}</math>, which again exists by counting dimensions, and so on down to <math display="inline">W_1'</math>. Since <math display="inline">dim(E) \leq n-1</math>, the induction hypothesis applies: there exists some <math display="inline">W \in X(W_1', \dots, W_k')</math> such that <math display="block">\lambda_{i_1 - (i_1-1)}(A|E)+\cdots+\lambda_{i_k- (i_1-1)}(A|E) \geq tr_W(A)
              </math> Here <math display="inline">\lambda_{i_j - (i_1-1)}(A|E)</math> is the <math display="inline">(i_j-(i_1-1))</math>-th eigenvalue of <math display="inline">A</math> orthogonally projected down to <math display="inline">E</math>. Because <math display="inline">E</math> is spanned by the eigenvectors <math display="inline">e_{i_1}, \dots, e_n</math>, this equals <math display="inline">\lambda_{i_j}(A)</math>. Since <math display="inline">W_j' \subset V_j</math> for each <math display="inline">j</math>, we have <math display="inline">X(W_1', \dots, W_k')\subset X(V_1, \dots, V_k)</math>, and we’re done.

If <math display="inline">i_1 = 1</math>, then we perform a similar construction. Let <math display="inline">E = span(e_{2}, \dots, e_n)</math>. If <math display="inline">V_k \subset E</math>, then we can induct. Otherwise, we construct a partial flag sequence <math display="inline">W_2, \dots, W_k</math>. By induction, there exists some <math display="inline">W' \in X(W_2, \dots, W_k)\subset X(V_2, \dots, V_k)</math>, such that <math display="block">\lambda_{i_2-1}(A|E)+\cdots+\lambda_{i_k-1}(A|E) \geq tr_{W'}(A)</math> thus<br />
<math display="block">\lambda_{i_2}(A)+\cdots+\lambda_{i_k}(A) \geq tr_{W'}(A)</math> And it remains to find some <math display="inline">v</math> such that <math display="inline">W' \oplus v \in X(V_1, \dots, V_k)</math>.

If <math display="inline">V_1 \not\subset W'</math>, then any <math display="inline">v \in V_1 \setminus W'</math> would work. Otherwise, if <math display="inline">V_2 \not\subset W'</math>, then any <math display="inline">v \in V_2 \setminus W'</math> would work, and so on. If none of these work, then it means <math display="inline">V_k \subset E</math>, contradiction.
}}{{hidden end}}

This has some corollaries:<ref name=":0" />{{Pg|page=44}}
{{Math theorem|name=Extremal partial trace|note=|math_statement= 

<math display="block">\lambda_1(A)+\dots+\lambda_k(A)=\sup_{\operatorname{dim}(V)=k }tr_V(A)</math>

<math display="block">\xi_1(A)+\dots+\xi_k(A)=\inf_{\operatorname{dim}(V)=k }tr_V(A)</math>
}}

{{Math theorem|name=Corollary|note=|math_statement= 

The sum <math display="inline">\lambda_1(A)+\dots+\lambda_k(A)</math> is a convex function, and <math display="inline">\xi_1(A)+\dots+\xi_k(A)</math> is concave.

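(Schur-Horn inequality) <math display="block">
      \xi_1(A)+\dots+\xi_k(A) \leq a_{i_1,i_1} + \dots + a_{i_k,i_k} \leq \lambda_1(A)+\dots+\lambda_k(A)
      </math> for any indices <math display="inline">1 \leq i_1 < \cdots < i_k \leq n</math>, where <math display="inline">a_{i,i}</math> are the diagonal entries of <math display="inline">A</math>.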

Equivalently, this states that the diagonal vector of <math display="inline">A</math> is majorized by its eigenspectrum.
}}
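
To illustrate these corollaries on a small case (the matrix is chosen here for illustration only), let

:<math>A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix},</math>

with eigenvalues <math>\lambda_1 = 3, \lambda_2 = 1</math> (so <math>\xi_1 = 1, \xi_2 = 3</math>) and diagonal entries <math>a_{1,1} = a_{2,2} = 2</math>. For <math>k = 1</math>, the partial trace on the line spanned by a unit vector <math>v</math> is <math>\langle v, Av\rangle</math>, whose supremum is <math>3 = \lambda_1</math> and whose infimum is <math>1 = \xi_1</math>; the Schur-Horn inequality reads

:<math>1 \leq 2 \leq 3.</math>

For <math>k = 2</math> all three sums equal the trace:

:<math>\xi_1 + \xi_2 = a_{1,1} + a_{2,2} = \lambda_1 + \lambda_2 = 4.</math>

Thus the diagonal vector <math>(2, 2)</math> is majorized by the eigenvalue vector <math>(3, 1)</math>.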

{{Math theorem|name=Schatten-norm Hölder inequality|note=|math_statement=

Given Hermitian <math display="inline">A, B</math> and Hölder pair <math display="inline">1/p + 1/q = 1</math>, <math display="block">|\operatorname{tr}(A B)| \leq\|A\|_{S^p}\|B\|_{S^q}</math>
}}

{{hidden begin|style=width:100%|ta1=center|border=1px #aaa solid|title=Proof}}

{{Math proof|title=Proof|proof=

Without loss of generality, <math display="inline">B</math> is diagonal (otherwise conjugate both <math display="inline">A</math> and <math display="inline">B</math> by a unitary matrix that diagonalizes <math display="inline">B</math>). Then we need to show <math display="inline">
          |\sum_i B_{ii} A_{ii} | \leq \|A \|_{S^p} \|(B_{ii})\|_{l^q}
          </math> 

By the standard Hölder inequality, it suffices to show <math display="inline">\|(A_{ii})\|_{l^p}\leq \|A \|_{S^p}</math>

By the Schur-Horn inequality, the diagonals of <math display="inline">A</math> are majorized by the eigenspectrum of <math display="inline">A</math>, and since the map <math display="inline">f(x_1, \dots, x_n) = \|x\|_p</math> is symmetric and convex, it is Schur-convex, which gives the required bound.
}}{{hidden end}}

=== Counterexample in the non-Hermitian case ===

Let ''N'' be the nilpotent matrix

:<math>\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}.</math>

Define the Rayleigh quotient <math> R_N(x) </math> exactly as above in the Hermitian case. Then it is easy to see that the only eigenvalue of ''N'' is zero, while the maximum value of the Rayleigh quotient is {{math|{{sfrac|1|2}}}}. That is, the maximum value of the Rayleigh quotient is larger than the maximum eigenvalue.
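
The value {{math|{{sfrac|1|2}}}} can be checked numerically. The following Python sketch is an illustration only and is not taken from the cited sources; it samples real unit vectors ''x'' = (cos ''t'', sin ''t''), an assumption that suffices here because the maximum is attained at a real vector, and the helper name <code>rayleigh_quotient_N</code> is hypothetical.

```python
import math

def rayleigh_quotient_N(t):
    """Rayleigh quotient of N = [[0, 1], [0, 0]] at the real unit vector x = (cos t, sin t)."""
    x = (math.cos(t), math.sin(t))
    Nx = (x[1], 0.0)                          # N maps (a, b) to (b, 0)
    numerator = Nx[0] * x[0] + Nx[1] * x[1]   # (Nx, x)
    denominator = x[0] ** 2 + x[1] ** 2       # (x, x), equal to 1 up to rounding
    return numerator / denominator

# Sample the unit circle on a uniform grid and record the largest value seen.
samples = [rayleigh_quotient_N(2 * math.pi * i / 3600) for i in range(3600)]
max_rayleigh = max(samples)
```

On this grid <code>max_rayleigh</code> agrees with 1/2 up to rounding error (the grid contains ''t'' = π/4, where the quotient equals sin ''t'' cos ''t'' = 1/2), whereas the only eigenvalue of ''N'' is 0.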

== Applications ==

=== Min-max principle for singular values ===

The [[singular value]]s {''σ<sub>k</sub>''} of a square matrix ''M'' are the square roots of the eigenvalues of ''M''*''M'' (equivalently ''MM*''). An immediate consequence{{Citation needed|reason=claim is unreferenced and maybe suspicious|date=April 2014}} of the first equality in the min-max theorem is:

:<math>\sigma_k^{\downarrow} = \max_{S:\dim(S)=k} \min_{x \in S, \|x\| = 1} (M^* Mx, x)^{\frac{1}{2}}=\max_{S:\dim(S)=k} \min_{x \in S, \|x\| = 1} \| Mx \|.</math>

Similarly,

:<math>\sigma_k^{\downarrow} = \min_{S:\dim(S)=n-k+1} \max_{x \in S, \|x\| = 1} \| Mx \|.</math>

Here <math>\sigma_k^{\downarrow}</math> denotes the ''k''<sup>th</sup> entry in the decreasing sequence of the singular values, so that <math>\sigma_1^{\downarrow} \geq \sigma_2^{\downarrow} \geq \cdots </math>.
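
For a 2 × 2 matrix the two formulas reduce to a maximum and a minimum of ‖''Mx''‖ over the unit circle (''k'' = 1 with ''S'' a line, and ''k'' = 2 with ''S'' the whole space). The Python sketch below is an illustration under stated assumptions: the matrix ''M'' is a hypothetical example, only real unit vectors are sampled (sufficient because ''M'' is real), and the closed-form values come from the eigenvalues (3 ± √5)/2 of ''M''*''M''.

```python
import math

# Hypothetical example matrix (not from the article): M = [[1, 1], [0, 1]].
M = ((1.0, 1.0), (0.0, 1.0))

def norm_Mx(t):
    """Return ||Mx|| for the real unit vector x = (cos t, sin t)."""
    x0, x1 = math.cos(t), math.sin(t)
    y0 = M[0][0] * x0 + M[0][1] * x1
    y1 = M[1][0] * x0 + M[1][1] * x1
    return math.hypot(y0, y1)

# ||Mx|| is unchanged under x -> -x, so sampling t in [0, pi) covers every line.
num_samples = 20000
values = [norm_Mx(math.pi * i / num_samples) for i in range(num_samples)]
sigma_1_est = max(values)   # k = 1: maximum over one-dimensional subspaces
sigma_2_est = min(values)   # k = 2: minimum over the whole unit circle

# Closed form: M*M = [[1, 1], [1, 2]] has eigenvalues (3 +/- sqrt(5)) / 2.
sigma_1_exact = math.sqrt((3 + math.sqrt(5)) / 2)
sigma_2_exact = math.sqrt((3 - math.sqrt(5)) / 2)
```

The sampled extremes approximate the closed-form singular values from inside the interval, so the estimates can only underestimate the largest singular value and overestimate the smallest.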

=== Cauchy interlacing theorem ===

{{Main|Poincaré separation theorem}}
Let {{mvar|A}} be a Hermitian ''n'' × ''n'' matrix. The ''m'' × ''m'' matrix ''B'', where ''m'' ≤ ''n'', is called a '''[[compression (functional analysis)|compression]]''' of {{mvar|A}} if there exists an [[Projection (linear algebra)#Orthogonal projections|orthogonal projection]] ''P'' onto a subspace of dimension ''m'' such that ''PAP*'' = ''B'', where ''P'' is written as an ''m'' × ''n'' matrix with ''PP*'' = ''I<sub>m</sub>''. The Cauchy interlacing theorem states:

:'''Theorem.''' If the eigenvalues of {{mvar|A}} are {{math|''α''<sub>1</sub> ≤ ... ≤ ''α<sub>n</sub>''}}, and those of ''B'' are {{math|''β''<sub>1</sub> ≤ ... ≤ ''β<sub>j</sub>'' ≤ ... ≤ ''β<sub>m</sub>''}}, then for all {{math|''j'' ≤ ''m''}},
::<math>\alpha_j \leq \beta_j \leq \alpha_{n-m+j}.</math>

This can be proven using the min-max principle. Let ''β<sub>i</sub>'' have corresponding eigenvector ''b<sub>i</sub>'' and ''S<sub>j</sub>'' be the ''j''-dimensional subspace {{math|''S<sub>j</sub>'' {{=}} span{''b''<sub>1</sub>, ..., ''b<sub>j</sub>''},}} then

:<math>\beta_j = \max_{x \in S_j, \|x\| = 1} (Bx, x) = \max_{x \in S_j, \|x\| = 1} (PAP^*x, x) = \max_{x \in S_j, \|x\| = 1} (A(P^*x), P^*x) \geq \min_{\dim S = j} \max_{y \in S, \|y\| = 1} (Ay, y) = \alpha_j.</math>

The inequality holds because ''P*'' preserves norms and therefore maps ''S<sub>j</sub>'' onto a ''j''-dimensional subspace, and the last equality is the min-max theorem for eigenvalues listed in increasing order. Hence {{math|''α<sub>j</sub>'' ≤ ''β<sub>j</sub>''.}} On the other hand, if we define {{math|''S''<sub>''m''−''j''+1</sub> {{=}} span{''b<sub>j</sub>'', ..., ''b<sub>m</sub>''},}} then

:<math>\beta_j = \min_{x \in S_{m-j+1}, \|x\| = 1} (Bx, x) = \min_{x \in S_{m-j+1}, \|x\| = 1} (PAP^*x, x)= \min_{x \in S_{m-j+1}, \|x\| = 1} (A(P^*x), P^*x) \leq \max_{\dim S = m-j+1} \min_{y \in S, \|y\| = 1} (Ay, y) = \alpha_{n-m+j},</math>

where the last equality is again given by the min-max theorem.

When {{math|''n'' − ''m'' {{=}} 1}}, we have {{math|''α<sub>j</sub>'' ≤ ''β<sub>j</sub>'' ≤ ''α''<sub>''j''+1</sub>}}, hence the name ''interlacing'' theorem.
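
As a worked check of the case {{math|''n'' − ''m'' {{=}} 1}} (the matrices are chosen here for illustration and are not from a cited source), let

:<math>A = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{bmatrix}, \qquad B = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix},</math>

where ''B'' is the compression of ''A'' to the first two coordinates. The characteristic polynomial of ''A'' is

:<math>\det(A - tI) = (2-t)\left((2-t)^2 - 2\right),</math>

so <math>\alpha_1 = 2 - \sqrt{2},\ \alpha_2 = 2,\ \alpha_3 = 2 + \sqrt{2}</math>, while <math>\beta_1 = 1,\ \beta_2 = 3</math>. The eigenvalues interlace:

:<math>2 - \sqrt{2} \approx 0.586 \;\leq\; 1 \;\leq\; 2 \;\leq\; 3 \;\leq\; 2 + \sqrt{2} \approx 3.414.</math>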

=== Lidskii's inequality ===
{{Main|Trace class#Lidskii's theorem}}
{{Math theorem
| name = Lidskii inequality
| note = 
| math_statement = If <math display="inline">1 \leq i_1<\cdots<i_k \leq n</math> then <math display="block">\begin{aligned}
      & \lambda_{i_1}(A+B)+\cdots+\lambda_{i_k}(A+B) \\
      & \quad \leq \lambda_{i_1}(A)+\cdots+\lambda_{i_k}(A)+\lambda_1(B)+\cdots+\lambda_k(B)
      \end{aligned}</math>

<math display="block">\begin{aligned}
      & \lambda_{i_1}(A+B)+\cdots+\lambda_{i_k}(A+B) \\
      & \quad \geq \lambda_{i_1}(A)+\cdots+\lambda_{i_k}(A)+\xi_1(B)+\cdots+\xi_k(B)
      \end{aligned}</math>
}}
  
{{hidden begin|style=width:100%|ta1=center|border=1px #aaa solid|title=Proof}}

{{Math proof|title=Proof|proof= 

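The second inequality follows from the first by replacing <math display="inline">A</math> with <math display="inline">A+B</math> and <math display="inline">B</math> with <math display="inline">-B</math>. The first follows from the Wielandt minimax formula.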

<math display="block">\begin{aligned}
          & \lambda_{i_1}(A+B)+\cdots+\lambda_{i_k}(A+B) \\
          =& \sup_{V_1, \dots, V_k} \inf_{W\in X(V_1, \dots, V_k)}(tr_W(A) + tr_W(B)) \\
          =& \sup_{V_1, \dots, V_k} ( \inf_{W\in X(V_1, \dots, V_k)} tr_W(A) + tr_W(B)) \\ 
          \leq& \sup_{V_1, \dots, V_k} ( \inf_{W\in X(V_1, \dots, V_k)} tr_W(A) + (\lambda_1(B)+\cdots+\lambda_k(B))) \\ 
          =& \lambda_{i_1}(A)+\cdots+\lambda_{i_k}(A)+\lambda_1(B)+\cdots+\lambda_k(B)
          \end{aligned}</math>
}}{{hidden end}}
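
A small example, with matrices chosen here purely for illustration, shows both bounds in the case <math>k = 1</math>. Let

:<math>A = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \qquad A + B = \begin{bmatrix} 2 & 1 \\ 1 & 0 \end{bmatrix}.</math>

Then <math>\lambda(A) = (2, 0)</math>, <math>\lambda(B) = (1, -1)</math>, and <math>A + B</math> has characteristic polynomial <math>t^2 - 2t - 1</math>, so <math>\lambda(A+B) = (1 + \sqrt{2},\, 1 - \sqrt{2})</math>. Taking <math>i_1 = 1</math> and then <math>i_1 = 2</math> gives

:<math>2 + (-1) \;\leq\; 1 + \sqrt{2} \;\leq\; 2 + 1, \qquad 0 + (-1) \;\leq\; 1 - \sqrt{2} \;\leq\; 0 + 1.</math>

For <math>k = 2</math> both inequalities become the trace identity <math>2 = 2 + 0</math>.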

Note that <math>\sum_i \lambda_i(A+B) = tr(A+B) = \sum_i \left(\lambda_i(A) + \lambda_i(B)\right)</math>. Together with Lidskii's inequality, this says that <math>\lambda(A+B) - \lambda(A) \preceq \lambda(B)</math>, where <math>\preceq</math> means [[majorization]]. By the Schur convexity theorem, we then have

{{Math theorem
| name = p-Wielandt-Hoffman inequality
| note = 
| math_statement = <math display="inline">\|\lambda(A+B) - \lambda(A)\|_{\ell^p} \leq \|B\|_{S^p}</math> where <math display="inline">\|\cdot\|_{S^p}</math> stands for the p-Schatten norm.
}}

== Compact operators ==

Let {{mvar|A}} be a [[Compact operator on Hilbert space|compact]], [[Hermitian]] operator on a Hilbert space ''H''. Recall that the [[spectrum (functional analysis)|spectrum]] of such an operator is a set of real numbers whose only possible [[cluster point]] is zero, and that every nonzero point of the spectrum is an eigenvalue of finite multiplicity. 
It is thus convenient to list the positive eigenvalues of {{mvar|A}} as

:<math>\cdots \le \lambda_k \le \cdots \le \lambda_1,</math>

where entries are repeated with [[Multiplicity (mathematics)|multiplicity]], as in the matrix case. (To emphasize that the sequence is decreasing, we may write <math>\lambda_k = \lambda_k^\downarrow</math>.) 
When ''H'' is infinite-dimensional, the above sequence of eigenvalues may be finite or infinite; if it is infinite, it converges to zero. 
We now apply the same reasoning as in the matrix case. Letting ''S<sub>k</sub>'' ⊂ ''H'' be a ''k'' dimensional subspace, we can obtain the following theorem.

:'''Theorem (Min-Max).''' Let {{mvar|A}} be a compact, self-adjoint operator on a Hilbert space {{mvar|H}}, whose positive eigenvalues are listed in decreasing order {{math|... ≤ ''λ<sub>k</sub>'' ≤ ... ≤ ''λ''<sub>1</sub>}}. Then:
::<math>\begin{align}
\max_{S_k} \min_{x \in S_k, \|x\| = 1} (Ax,x) &= \lambda_k ^{\downarrow}, \\
\min_{S_{k-1}} \max_{x \in S_{k-1}^{\perp}, \|x\|=1} (Ax, x) &= \lambda_k^{\downarrow}.
\end{align}</math>

A similar pair of equalities holds for negative eigenvalues.

{{Math proof|drop=hidden|proof=
Let <math>u_1, u_2, \ldots</math> be orthonormal eigenvectors corresponding to <math>\lambda_1, \lambda_2, \ldots</math>, and let ''S' '' be the orthogonal complement of <math>\operatorname{span}\{u_1,\ldots,u_{k-1}\}</math>.
The subspace ''S' '' has codimension ''k'' − 1. By the same dimension count argument as in the matrix case, ''S' '' ∩ ''S<sub>k</sub>'' has positive dimension. So there exists ''x'' ∈ ''S'&nbsp;'' ∩ ''S<sub>k</sub>'' with <math>\|x\|=1</math>. Since it is an element of ''S' '', such an ''x'' necessarily satisfies

:<math>(Ax, x) \le \lambda_k.</math>

Therefore, for all ''S<sub>k</sub>''

:<math>\inf_{x \in S_k, \|x\| = 1}(Ax,x) \le \lambda_k</math>

But {{mvar|A}} is compact, therefore the function ''f''(''x'') = (''Ax'', ''x'') is weakly continuous. Furthermore, any bounded set in ''H'' is weakly compact. This lets us replace the infimum by minimum:

:<math>\min_{x \in S_k, \|x\| = 1}(Ax,x) \le \lambda_k.</math>

So

:<math>\sup_{S_k} \min_{x \in S_k, \|x\| = 1}(Ax,x) \le \lambda_k.</math>
 
Because equality is achieved when <math>S_k=\operatorname{span}\{u_1,\ldots,u_k\}</math>,

:<math>\max_{S_k} \min_{x \in S_k, \|x\| = 1}(Ax,x) = \lambda_k.</math>

This is the first part of the min-max theorem for compact self-adjoint operators.

Analogously, consider now a {{math|(''k'' − 1)}}-dimensional subspace ''S''<sub>''k''−1</sub>, whose orthogonal complement is denoted by ''S''<sub>''k''−1</sub><sup>&perp;</sup>. If ''S' '' =&nbsp;span{''u''<sub>1</sub>, ..., ''u<sub>k</sub>''},

:<math>S' \cap S_{k-1}^{\perp} \ne \{0\}.</math>

So

:<math>\exists x \in S_{k-1}^{\perp} \, \|x\| = 1, (Ax, x) \ge \lambda_k.</math>

This implies

:<math>\max_{x \in S_{k-1}^{\perp}, \|x\| = 1} (Ax, x) \ge \lambda_k</math>

where the compactness of ''A'' was applied. Taking the infimum over all {{math|(''k'' − 1)}}-dimensional subspaces ''S''<sub>''k''−1</sub> gives

:<math>\inf_{S_{k-1}} \max_{x \in S_{k-1}^{\perp}, \|x\|=1} (Ax, x) \ge \lambda_k.</math>

Pick ''S''<sub>''k''−1</sub> = span{''u''<sub>1</sub>, ..., ''u''<sub>''k''−1</sub>} and we deduce

:<math>\min_{S_{k-1}} \max_{x \in S_{k-1}^{\perp}, \|x\|=1} (Ax, x) = \lambda_k.</math>
}}
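
For a concrete instance (an example chosen here for illustration), let <math>H = \ell^2(\mathbb{N})</math> with standard basis <math>e_1, e_2, \ldots</math> and define <math>A e_j = e_j / j</math>. This operator is compact and self-adjoint, with positive eigenvalues <math>\lambda_k = 1/k</math> and eigenvectors <math>u_k = e_k</math>. For <math>S_k = \operatorname{span}\{e_1, \ldots, e_k\}</math> and a unit vector <math>x = \sum_{j \le k} a_j e_j</math>,

:<math>(Ax, x) = \sum_{j=1}^{k} \frac{|a_j|^2}{j} \geq \frac{1}{k},</math>

with equality at <math>x = e_k</math>. For <math>S_{k-1} = \operatorname{span}\{e_1, \ldots, e_{k-1}\}</math> and a unit vector <math>x = \sum_{j \ge k} a_j e_j \in S_{k-1}^{\perp}</math>,

:<math>(Ax, x) = \sum_{j=k}^{\infty} \frac{|a_j|^2}{j} \leq \frac{1}{k},</math>

again with equality at <math>x = e_k</math>. Both extremal values equal <math>\lambda_k = 1/k</math>, in agreement with the theorem.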

== Self-adjoint operators ==

The min-max theorem also applies to (possibly unbounded) self-adjoint operators.<ref name="teschl">G. Teschl, Mathematical Methods in Quantum Mechanics (GSM 99) https://www.mat.univie.ac.at/~gerald/ftp/book-schroe/schroe.pdf</ref><ref name="lieb-loss">{{cite book |last1=Lieb |last2=Loss |title=Analysis |edition=2nd |series=GSM |volume=14 |location=Providence |publisher=American Mathematical Society |year=2001 |isbn=0-8218-2783-9 }}</ref> Recall the [[essential spectrum]] is the spectrum without isolated eigenvalues of finite multiplicity. 
Sometimes we have some eigenvalues below the essential spectrum, and we would like to approximate the eigenvalues and eigenfunctions.

:'''Theorem (Min-Max).''' Let ''A'' be self-adjoint, and let <math>E_1\le E_2\le E_3\le\cdots</math> be the eigenvalues of ''A'' below the essential spectrum. Then

<math>E_n=\min_{\psi_1,\ldots,\psi_{n}}\max\{\langle\psi,A\psi\rangle:\psi\in\operatorname{span}(\psi_1,\ldots,\psi_{n}), \, \| \psi \| = 1\}</math>.

If we only have ''N'' eigenvalues and hence run out of eigenvalues, then we let <math>E_n:=\inf\sigma_{ess}(A)</math> (the bottom of the essential spectrum) for ''n>N'', and the above statement holds after replacing min-max with inf-sup.

:'''Theorem (Max-Min).''' Let ''A'' be self-adjoint, and let <math>E_1\le E_2\le E_3\le\cdots</math> be the eigenvalues of ''A'' below the essential spectrum. Then

<math>E_n=\max_{\psi_1,\ldots,\psi_{n-1}}\min\{\langle\psi,A\psi\rangle:\psi\perp\psi_1,\ldots,\psi_{n-1}, \, \| \psi \| = 1\}</math>.

If we only have ''N'' eigenvalues and hence run out of eigenvalues, then we let <math>E_n:=\inf\sigma_{ess}(A)</math> (the bottom of the essential spectrum) for ''n > N'', and the above statement holds after replacing max-min with sup-inf.
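
For <math>n = 1</math> the min-max statement says that every unit vector in the domain gives an upper bound on <math>E_1</math>. As an illustration under stated assumptions (this example is not from the cited sources), take <math>A = -\tfrac{d^2}{dx^2} + x^2</math> on <math>L^2(\mathbb{R})</math>, whose lowest eigenvalue is <math>E_1 = 1</math>, and the normalized Gaussian trial functions <math>\psi_a(x) = (a/\pi)^{1/4} e^{-a x^2/2}</math> with <math>a > 0</math>. Using <math>\langle \psi_a, x^2 \psi_a\rangle = \tfrac{1}{2a}</math> and <math>\|\psi_a'\|^2 = a^2 \langle \psi_a, x^2 \psi_a\rangle = \tfrac{a}{2}</math>,

:<math>\langle \psi_a, A \psi_a \rangle = \frac{a}{2} + \frac{1}{2a} \geq 1 = E_1,</math>

with equality exactly at <math>a = 1</math>, where <math>\psi_a</math> is the ground state.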

The proofs<ref name="teschl"/><ref name="lieb-loss"/> use the following results about self-adjoint operators:

:'''Theorem.''' Let ''A'' be self-adjoint. Then <math>(A-E)\ge0</math> for <math>E\in\mathbb{R}</math> if and only if <math>\sigma(A)\subseteq[E,\infty)</math>.<ref name="teschl"/>{{rp|77}}

:'''Theorem.''' If ''A'' is self-adjoint, then

<math>\inf\sigma(A)=\inf_{\psi\in\mathfrak{D}(A),\|\psi\|=1}\langle\psi,A\psi\rangle</math>

and

<math>\sup\sigma(A)=\sup_{\psi\in\mathfrak{D}(A),\|\psi\|=1}\langle\psi,A\psi\rangle</math>.<ref name="teschl"/>{{rp|77}}

== See also ==

* [[Courant minimax principle]]
* [[Max–min inequality]]

==References==

{{Reflist}}

==External links and citations to related work==

* {{cite arXiv| last1=Fisk|first1=Steve|title=A very short proof of Cauchy's interlace theorem for eigenvalues of Hermitian matrices |date=2005|eprint=math/0502408 }}
* {{cite journal|last1=Hwang|first1=Suk-Geun|title=Cauchy's Interlace Theorem for Eigenvalues of Hermitian Matrices |journal=The American Mathematical Monthly | volume=111| date=2004|issue=2 | pages=157–159|doi=10.2307/4145217|jstor=4145217 | url=https://www.jstor.org/stable/4145217}}
* {{cite journal|last1=Kline|first1=Jeffery|title=Bordered Hermitian matrices and sums of the Möbius function|journal=Linear Algebra and Its Applications |volume=588| date=2020| pages=224–237|doi=10.1016/j.laa.2019.12.004|doi-access=free}}
* {{cite book|last1=Reed |first1=Michael |last2=Simon |first2=Barry | title=Methods of Modern Mathematical Physics IV: Analysis of Operators |year=1978 |isbn= 978-0-08-057045-7 | publisher=Academic Press| url=https://www.elsevier.com/books/iv-analysis-of-operators/reed/978-0-08-057045-7}}

{{Functional analysis}}
{{Analysis in topological vector spaces}}
{{Spectral theory}}

[[Category:Articles containing proofs]]
[[Category:Operator theory]]
[[Category:Spectral theory]]
[[Category:Theorems in functional analysis]]