== Matrix Chernoff bound ==
{{main|Matrix Chernoff bound}}

[[Rudolf Ahlswede]] and [[Andreas Winter]] introduced a Chernoff bound for matrix-valued random variables.<ref>{{cite journal |last1=Ahlswede |first1=R. |last2=Winter |first2=A. |year=2003 |title=Strong Converse for Identification via Quantum Channels |volume=48 |issue=3 |pages=569–579 |journal=[[IEEE Transactions on Information Theory]] |arxiv=quant-ph/0012127 |doi=10.1109/18.985947 |s2cid=523176 }}</ref> The following version of the inequality can be found in the work of Tropp.<ref>{{cite journal |last1=Tropp |first1=J. |year=2010 |title=User-friendly tail bounds for sums of random matrices |arxiv=1004.4389 |doi=10.1007/s10208-011-9099-z |volume=12 |issue=4 |journal=Foundations of Computational Mathematics |pages=389–434 |s2cid=17735965 }}</ref>

Let {{math|''M''<sub>1</sub>, ..., ''M<sub>t</sub>''}} be independent matrix-valued random variables such that <math>M_i \in \mathbb{C}^{d_1 \times d_2}</math> and <math>\mathbb{E}[M_i] = 0</math>, and denote by <math>\lVert M \rVert</math> the operator norm of the matrix <math>M</math>. If <math>\lVert M_i \rVert \leq \gamma</math> holds almost surely for all <math>i \in \{1, \ldots, t\}</math>, then for every {{math|''ε'' > 0}}

:<math>\Pr\left( \left\| \frac{1}{t} \sum_{i=1}^t M_i \right\| > \varepsilon \right) \leq (d_1+d_2) \exp \left( -\frac{3\varepsilon^2 t}{8\gamma^2} \right).</math>

Notice that in order to conclude that the deviation from 0 is bounded by {{math|''ε''}} with high probability, we need to choose a number of samples <math>t</math> proportional to the logarithm of <math>d_1 + d_2</math>: the right-hand side is at most <math>\delta</math> exactly when

:<math>t \geq \frac{8\gamma^2}{3\varepsilon^2} \ln\left(\frac{d_1 + d_2}{\delta}\right).</math>

In general, a dependence on <math>\log(\min(d_1, d_2))</math> is unfortunately inevitable: take for example a diagonal random sign matrix of dimension <math>d \times d</math>. The operator norm of the sum of ''t'' independent samples is precisely the maximum deviation among ''d'' independent random walks of length ''t'', and achieving a fixed bound on this maximum deviation with constant probability forces ''t'' to grow logarithmically with ''d''.<ref>{{cite arXiv |last1=Magen |first1=A. |author1-link=Avner Magen |last2=Zouzias |first2=A. |year=2011 |title=Low Rank Matrix-Valued Chernoff Bounds and Approximate Matrix Multiplication |class=cs.DM |eprint=1005.2724 }}</ref> The following theorem avoids this dependency on the dimensions by assuming that ''M'' has low rank.

=== Theorem without the dependency on the dimensions ===
Let {{math|0 < ''ε'' < 1}} and let ''M'' be a random symmetric real matrix with <math>\| \operatorname E[M] \| \leq 1</math> and <math>\| M \| \leq \gamma</math> almost surely. Assume that every matrix in the support of ''M'' has rank at most ''r''. Set

:<math>t = \Omega \left( \frac{\gamma\log (\gamma/\varepsilon^2)}{\varepsilon^2} \right).</math>

If <math>r \leq t</math> holds almost surely, then

:<math>\Pr\left(\left\| \frac{1}{t} \sum_{i=1}^t M_i - \operatorname E[M] \right\| > \varepsilon \right) \leq \frac{1}{\mathbf{poly}(t)},</math>

where {{math|''M''<sub>1</sub>, ..., ''M<sub>t</sub>''}} are i.i.d. copies of ''M''.
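The diagonal random sign example above lends itself to a quick numerical sanity check. The following Python sketch is illustrative only and is not taken from the cited sources; names such as <code>chernoff_tail_bound</code> are hypothetical. It draws i.i.d. diagonal <math>\pm 1</math> matrices, for which <math>\mathbb{E}[M_i] = 0</math> and <math>\gamma = 1</math>, and compares the empirical frequency of <math>\lVert \tfrac{1}{t}\sum_i M_i \rVert > \varepsilon</math> with the bound stated above.

<syntaxhighlight lang="python">
import numpy as np

def chernoff_tail_bound(eps, t, gamma, d1, d2):
    """Right-hand side of the matrix Chernoff bound stated above."""
    return (d1 + d2) * np.exp(-3 * eps**2 * t / (8 * gamma**2))

rng = np.random.default_rng(0)
d, t, eps, trials = 50, 2000, 0.1, 200

exceed = 0
for _ in range(trials):
    # Diagonals of M_1, ..., M_t: each M_i is a diagonal +/-1 matrix,
    # so E[M_i] = 0 and ||M_i|| = 1 (i.e. gamma = 1).
    signs = rng.choice([-1.0, 1.0], size=(t, d))
    mean_diag = signs.mean(axis=0)  # diagonal of (1/t) * sum_i M_i
    # The operator norm of a diagonal matrix is its largest |entry|.
    if np.abs(mean_diag).max() > eps:
        exceed += 1

print("empirical Pr(||sample mean|| > eps):", exceed / trials)
print("Chernoff bound on that probability:",
      chernoff_tail_bound(eps, t, gamma=1.0, d1=d, d2=d))
</syntaxhighlight>

Because each <math>M_i</math> is diagonal, the operator norm of the sample mean is simply the largest coordinate-wise average, so the simulation reduces to tracking ''d'' scalar random walks, in line with the lower-bound discussion above.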