== Randomized communication complexity ==
In the above definition, we are concerned with the number of bits that must be ''deterministically'' transmitted between two parties. If both parties are given access to a random number generator, can they determine the value of <math>f</math> with much less information exchanged? Yao, in his seminal paper,<ref name=yao1979/> answers this question by defining randomized communication complexity.

A randomized protocol <math>R</math> for a function <math>f</math> has two-sided error:
:<math> \Pr[R(x,y) = 0] > \frac{2}{3}, \textrm{if }\, f(x,y) = 0 </math>
:<math> \Pr[R(x,y) = 1] > \frac{2}{3}, \textrm{if }\, f(x,y) = 1 </math>

A randomized protocol is a deterministic protocol that uses an extra random string in addition to its normal input. There are two models for this: a ''public string'' is a random string that is known by both parties beforehand, while a ''private string'' is generated by one party and must be communicated to the other party. A theorem presented below shows that any public string protocol can be simulated by a private string protocol that uses ''O(log n)'' additional bits compared to the original.

In the probability inequalities above, the outcome of the protocol is understood to depend ''only'' on the random string; both strings ''x'' and ''y'' remain fixed. In other words, if ''R''(''x'',''y'') yields ''g''(''x'',''y'',''r'') when using random string ''r'', then ''g''(''x'',''y'',''r'') = ''f''(''x'',''y'') for at least 2/3 of all choices for the string ''r''.

The randomized complexity is simply defined as the number of bits exchanged in such a protocol. It is also possible to define a randomized protocol with one-sided error, and the complexity is defined similarly.

=== Example: EQ ===
Returning to the previous example of ''EQ'', if certainty is not required, Alice and Bob can check for equality using only {{tmath|O(\log n)}} messages. Consider the following protocol: Assume that Alice and Bob both have access to the same random string <math>z \in \{0,1\}^n</math>. Alice computes <math>z \cdot x</math> and sends this bit (call it ''b'') to Bob. (The <math>(\cdot)</math> is the [[dot product]] in [[finite field#Some small finite fields|GF(2)]].) Then Bob compares ''b'' to <math>z \cdot y</math>. If they are the same, then Bob accepts, saying ''x'' equals ''y''. Otherwise, he rejects.

Clearly, if <math>x = y</math>, then <math>z \cdot x = z \cdot y</math>, so <math>\Pr_z[\mathrm{Accept}] = 1</math>. If ''x'' does not equal ''y'', it is still possible that <math>z \cdot x = z \cdot y</math>, which would give Bob the wrong answer. How does this happen? If ''x'' and ''y'' are not equal, they must differ in some locations:
:<math>\begin{cases} x = c_1 c_2 \ldots p \ldots p' \ldots x_n \\ y = c_1 c_2 \ldots q \ldots q' \ldots y_n \\ z = z_1 z_2 \ldots z_i \ldots z_j \ldots z_n \end{cases}</math>

Where {{mvar|x}} and {{mvar|y}} agree, <math>z_i \cdot x_i = z_i \cdot c_i = z_i \cdot y_i</math>, so those terms affect the dot products equally. We can safely ignore those terms and look only at where {{mvar|x}} and {{mvar|y}} differ. Furthermore, we can swap the bits <math>x_i</math> and <math>y_i</math> without changing whether or not the dot products are equal. This means we can swap bits so that {{mvar|x}} contains only zeros and {{mvar|y}} contains only ones:
:<math>\begin{cases} x' = 0 0 \ldots 0 \\ y' = 1 1 \ldots 1 \\ z' = z_1 z_2 \ldots z_{n'} \end{cases}</math>

Note that <math>z' \cdot x' = 0</math> and <math>z' \cdot y' = \Sigma_i z'_i</math>. Now, the question becomes: for some random string <math>z'</math>, what is the probability that <math>\Sigma_i z'_i = 0</math>? Since each <math>z'_i</math> is equally likely to be {{val|0}} or {{val|1}}, this probability is just <math>1/2</math>. Thus, when {{mvar|x}} does not equal {{mvar|y}}, <math>\Pr_z[\mathrm{Accept}] = 1/2</math>. The algorithm can be repeated many times to increase its accuracy; this fits the requirements for a randomized communication algorithm.

This shows that ''if Alice and Bob share a random string of length n'', they can send one bit to each other to compute <math>EQ(x,y)</math>. In the section on public versus private coins below, it is shown that Alice and Bob can exchange only {{tmath|O(\log n)}} bits that are as good as sharing a random string of length ''n''. Once that is shown, it follows that ''EQ'' can be computed in {{tmath|O(\log n)}} messages.
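A minimal Python sketch of this protocol (the function names are illustrative, not from any standard library); each repetition uses a fresh shared random string and one transmitted bit:

<syntaxhighlight lang="python">
import random

def gf2_dot(z, w):
    """Dot product over GF(2): the parity of the positions where both strings have a 1."""
    return sum(zi & wi for zi, wi in zip(z, w)) % 2

def eq_one_round(x, y, z):
    """One round of the shared-randomness EQ protocol.

    Alice sends the single bit gf2_dot(z, x); Bob accepts iff it equals gf2_dot(z, y).
    If x == y the round always accepts; if x != y it accepts with probability 1/2."""
    return gf2_dot(z, x) == gf2_dot(z, y)

def eq_protocol(x, y, rounds=20):
    """Repeat the one-bit test with fresh shared random strings to drive down the error."""
    n = len(x)
    for _ in range(rounds):
        z = [random.randint(0, 1) for _ in range(n)]  # the shared random string
        if not eq_one_round(x, y, z):
            return False  # any mismatch proves x != y
    return True  # if x != y, this is wrong with probability at most 2**(-rounds)

x = [1, 0, 1, 1, 0, 0, 1, 0]
print(eq_protocol(x, x))        # True
print(eq_protocol(x, x[::-1]))  # False with overwhelming probability
</syntaxhighlight>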
=== Example: GH ===
For yet another example of randomized communication complexity, we turn to an example known as the ''[[gap-Hamming problem]]'' (abbreviated ''GH''). Here, Alice and Bob each hold a string, <math>x, y \in \{-1, +1\}^n</math>, and they would like to determine whether their strings are very similar or very dissimilar. In particular, they would like to find a communication protocol requiring the transmission of as few bits as possible to compute the following partial Boolean function,
:<math> \text{GH}_n(x, y) := \begin{cases} -1 & \langle x, y \rangle \leq -\sqrt{n} \\ +1 & \langle x, y \rangle \geq \sqrt{n}. \end{cases} </math>

Clearly, a deterministic protocol must communicate essentially all of the bits: if Alice and Bob only ever relay a strict subset of the positions to one another, then disagreements in the positions that are never relayed can move <math>\langle x, y \rangle</math> from one side of the gap to the other without changing the communication, so the protocol would return an incorrect answer on some pair of inputs. A natural question is then: if we are permitted to err <math>1/3</math> of the time (over instances <math>x, y</math> drawn uniformly at random from <math>\{-1, +1\}^n</math>), can we get away with a protocol that communicates fewer bits? Somewhat surprisingly, the answer is no, due to a result of Chakrabarti and Regev from 2012: they show that for random instances, any procedure which is correct at least <math>2/3</math> of the time must communicate <math>\Omega(n)</math> bits, which is to say essentially all of them.
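For concreteness, the following short Python sketch (the function name is chosen for illustration) evaluates the partial function <math>\text{GH}_n</math>; input pairs whose inner product falls strictly inside the gap violate the promise, and the function is undefined on them:

<syntaxhighlight lang="python">
import math

def gap_hamming(x, y):
    """Evaluate GH_n on two strings over {-1, +1}; None means the promise is violated."""
    n = len(x)
    inner = sum(a * b for a, b in zip(x, y))  # <x, y> = n - 2 * (number of disagreements)
    if inner <= -math.sqrt(n):
        return -1
    if inner >= math.sqrt(n):
        return +1
    return None  # inside the gap: GH_n is undefined here

print(gap_hamming([1] * 16, [1] * 16))   # +1: identical strings
print(gap_hamming([1] * 16, [-1] * 16))  # -1: strings disagree everywhere
</syntaxhighlight>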
=== Public coins versus private coins ===
Creating random protocols becomes easier when both parties have access to the same random string, known as a shared string protocol. However, even in cases where the two parties do not share a random string, it is still possible to use private string protocols with only a small communication cost. Any shared string random protocol using any number of random bits can be simulated by a private string protocol that uses an extra ''O(log n)'' bits.

Intuitively, we can find some set of strings that has enough randomness in it to run the random protocol with only a small increase in error. This set can be shared beforehand, and instead of drawing a random string, Alice and Bob need only agree on which string to choose from the shared set. This set is small enough that the choice can be communicated efficiently. A formal proof follows.

Consider some random protocol ''P'' with a maximum error rate of 0.1. Let <math>R</math> be a set of <math>100n</math> strings of length ''n'', numbered <math>r_1, r_2, \dots, r_{100n}</math>. Given such an <math>R</math>, define a new protocol <math>P'_R</math> which randomly picks some <math>r_i</math> and then runs ''P'' using <math>r_i</math> as the shared random string. It takes ''O''(log 100''n'') = ''O''(log ''n'') bits to communicate the choice of <math>r_i</math>.

Let us define <math>p(x,y)</math> and <math>p'_R(x,y)</math> to be the probabilities that <math>P</math> and <math>P'_R</math> compute the correct value for the input <math>(x,y)</math>. For a fixed <math>(x,y)</math>, we can use [[Hoeffding's inequality]] to get the following equation:
:<math>\Pr_R[|p'_R(x,y) - p(x,y)| \geq 0.1] \leq 2 \exp(-2(0.1)^2 \cdot 100n) < 2^{-2n}</math>

Thus, when <math>(x,y)</math> is not fixed, a union bound gives:
:<math>\Pr_R[\exists (x,y):\ |p'_R(x,y) - p(x,y)| \geq 0.1] \leq \sum_{(x,y)} \Pr_R[|p'_R(x,y) - p(x,y)| \geq 0.1] < \sum_{(x,y)} 2^{-2n} = 1</math>

The last equality above holds because there are <math>2^{2n}</math> different pairs <math>(x,y)</math>. Since the probability does not equal 1, there is some <math>R_0</math> so that for all <math>(x,y)</math>:
:<math>|p'_{R_0}(x,y) - p(x,y)| < 0.1</math>

Since <math>P</math> has error probability at most 0.1, <math>P'_{R_0}</math> has error probability at most 0.2 on every input <math>(x,y)</math>.
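A minimal Python sketch of this simulation (the protocol passed in and the helper names are illustrative): the parties fix a table of <math>100n</math> strings in advance, Alice privately picks an index, announces it with about <math>\log_2(100n)</math> bits, and both parties then run the original protocol with the chosen string as if it were public randomness:

<syntaxhighlight lang="python">
import random

def build_shared_table(n, seed=0):
    """The pre-agreed set R: 100*n strings of length n, fixed before the protocol runs."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(n)] for _ in range(100 * n)]

def simulate_with_private_coins(public_coin_protocol, x, y, table):
    """Run a public-coin protocol using only private coins plus O(log n) extra bits.

    Alice privately chooses an index i, sends it to Bob (costing about log2(|table|)
    bits), and both parties then use table[i] in place of the shared random string."""
    i = random.randrange(len(table))            # Alice's private random choice
    extra_bits = (len(table) - 1).bit_length()  # bits needed to announce i
    return public_coin_protocol(x, y, table[i]), extra_bits

# As the protocol P, reuse the one-round EQ test from the earlier sketch.
def eq_one_round(x, y, z):
    return sum(zi & xi for zi, xi in zip(z, x)) % 2 == sum(zi & yi for zi, yi in zip(z, y)) % 2

n = 32
table = build_shared_table(n)
x = [random.randint(0, 1) for _ in range(n)]
answer, extra = simulate_with_private_coins(eq_one_round, x, list(x), table)
print(answer, extra)  # True, plus roughly log2(3200) = 12 extra bits of communication
</syntaxhighlight>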
=== Collapse of Randomized Communication Complexity ===
Suppose that we additionally allow Alice and Bob to share some resource, for example a pair of entangled particles. Using that resource, Alice and Bob can correlate their information and thus try to 'collapse' (or 'trivialize') communication complexity in the following sense.

'''Definition.''' ''A resource <math>R</math> is said to be "collapsing" if, using that resource <math>R</math>, only one bit of classical communication is enough for Alice to know the evaluation <math>f(x,y)</math> in the worst case, for any [[Boolean function]] <math>f</math>.''

The surprising aspect of a collapse of communication complexity is that the function <math>f</math> can have arbitrarily large input size, yet a single bit of communication still suffices. Some resources are known to be non-collapsing, such as quantum correlations<ref>{{cite book | chapter-url=https://doi.org/10.1007/3-540-49208-9_4 | doi=10.1007/3-540-49208-9_4 | chapter=Quantum Entanglement and the Communication Complexity of the Inner Product Function | title=Quantum Computing and Quantum Communications | series=Lecture Notes in Computer Science | date=1999 | last1=Cleve | first1=Richard | last2=Van Dam | first2=Wim | last3=Nielsen | first3=Michael | last4=Tapp | first4=Alain | volume=1509 | pages=61–74 | isbn=978-3-540-65514-5 | url=https://digital.library.unt.edu/ark:/67531/metadc706249/ }}</ref> or, more generally, almost-quantum correlations,<ref>{{cite journal | url=https://doi.org/10.1038/ncomms7288 | doi=10.1038/ncomms7288 | title=Almost quantum correlations | date=2015 | last1=Navascués | first1=Miguel | last2=Guryanova | first2=Yelena | last3=Hoban | first3=Matty J. | last4=Acín | first4=Antonio | journal=Nature Communications | volume=6 | page=6288 | pmid=25697645 | arxiv=1403.4621 | bibcode=2015NatCo...6.6288N }}</ref> whereas other resources are known to collapse randomized communication complexity, such as the PR-box<ref>W. van Dam, Nonlocality & Communication Complexity, Ph.D. thesis, University of Oxford (1999).</ref> or some noisy PR-boxes satisfying certain conditions.<ref>{{cite journal |last1=Brassard |first1=Gilles |last2=Buhrman |first2=Harry |last3=Linden |first3=Noah |last4=Méthot |first4=André Allan |last5=Tapp |first5=Alain |last6=Unger |first6=Falk |title=Limit on Nonlocality in Any World in Which Communication Complexity Is Not Trivial |journal=Physical Review Letters |date=27 June 2006 |volume=96 |issue=25 |doi=10.1103/PhysRevLett.96.250401|arxiv=quant-ph/0508042 }}</ref><ref>{{cite journal |last1=Brunner |first1=Nicolas |last2=Skrzypczyk |first2=Paul |title=Nonlocality Distillation and Postquantum Theories with Trivial Communication Complexity |journal=Physical Review Letters |date=24 April 2009 |volume=102 |issue=16 |doi=10.1103/PhysRevLett.102.160403|arxiv=0901.4070 }}</ref><ref>{{cite journal |last1=Botteron |first1=Pierre |last2=Broadbent |first2=Anne |last3=Proulx |first3=Marc-Olivier |title=Extending the Known Region of Nonlocal Boxes that Collapse Communication Complexity |journal=Physical Review Letters |date=14 February 2024 |volume=132 |issue=7 |doi=10.1103/PhysRevLett.132.070201|arxiv=2302.00488 }}</ref>

=== Distributional Complexity ===
One approach to studying randomized communication complexity is through distributional complexity. Given a joint distribution <math>\mu</math> on the inputs of both players, the corresponding distributional complexity of a function <math>f</math> is the minimum cost of a ''deterministic'' protocol <math>R</math> such that <math>\Pr[f(x,y) = R(x,y)] \ge 2/3</math>, where the inputs are sampled according to <math>\mu</math>.

Yao's minimax principle<ref>{{cite conference |url=https://ieeexplore.ieee.org/document/4567946 |title=Probabilistic computations: Toward a unified measure of complexity |last=Yao |first=Andrew Chi-Chih |author-link=Andrew Yao |date=1977 |publisher=IEEE |book-title=18th Annual Symposium on Foundations of Computer Science (sfcs 1977) |issn=0272-5428 |doi=10.1109/SFCS.1977.24}}</ref> (a special case of [[John von Neumann|von Neumann]]'s [[minimax theorem]]) states that the randomized communication complexity of a function equals its maximum distributional complexity, where the maximum is taken over all joint distributions of the inputs (not necessarily product distributions!).

Yao's principle can be used to prove lower bounds on the randomized communication complexity of a function: design an appropriate joint distribution, and prove a lower bound on the distributional complexity. Since distributional complexity concerns deterministic protocols, this can be easier than proving a lower bound on randomized protocols directly.

As an example, consider the ''disjointness'' function DISJ: each of the inputs is interpreted as a subset of <math>\{1,\dots,n\}</math>, and DISJ({{mvar|x}},{{mvar|y}}) = 1 if the two sets are disjoint. Razborov<ref>{{cite journal |last=Razborov |first=Alexander |author-link=Alexander Razborov |date=1992 |title=On the distributional complexity of disjointness |journal=Theoretical Computer Science |volume=106 |issue=2 |pages=385–390 |doi=10.1016/0304-3975(92)90260-M |doi-access=free }}</ref> proved an <math>\Omega(n)</math> lower bound on the randomized communication complexity by considering the following distribution: with probability 3/4, sample two random disjoint sets of size <math>n/4</math>, and with probability 1/4, sample two random sets of size <math>n/4</math> with a unique intersection.
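A small Python sketch of sampling from this distribution (the function name is illustrative, and <math>n</math> is assumed divisible by 4, matching the set size <math>n/4</math> used above):

<syntaxhighlight lang="python">
import random

def sample_razborov(n, rng=random):
    """Sample (x, y) from the hard distribution for DISJ described above.

    With probability 3/4 the two size-n/4 sets are disjoint; with probability 1/4
    they intersect in exactly one element."""
    k = n // 4
    universe = range(1, n + 1)
    if rng.random() < 3 / 4:
        chosen = rng.sample(universe, 2 * k)      # 2k distinct elements, split in half
        x, y = set(chosen[:k]), set(chosen[k:])
    else:
        chosen = rng.sample(universe, 2 * k - 1)  # one shared element plus disjoint remainders
        shared = chosen[0]
        x = {shared} | set(chosen[1:k])
        y = {shared} | set(chosen[k:])
    return x, y

x, y = sample_razborov(16)
print(sorted(x), sorted(y), "disjoint" if not (x & y) else "unique intersection")
</syntaxhighlight>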
=== Information Complexity ===
A powerful approach to the study of distributional complexity is information complexity. Initiated by Bar-Yossef, Jayram, Kumar and Sivakumar,<ref>{{cite journal |last1=Bar-Yossef |first1=Ziv |last2=Jayram |first2=T. S. |last3=Kumar |first3=Ravi |last4=Sivakumar |first4=D. |date=2004 |title=An information statistics approach to data stream and communication complexity |url=https://people.seas.harvard.edu/~madhusudan/courses/Spring2016/papers/BJKS.pdf |journal=Journal of Computer and System Sciences |volume=68 |issue=4 |pages=702–732 |doi=10.1016/j.jcss.2003.11.006 |access-date=1 December 2023}}</ref> the approach was codified in work of Barak, Braverman, Chen and Rao<ref>{{cite journal |last1=Barak |first1=Boaz |author-link1=Boaz Barak |last2=Braverman |first2=Mark |author-link2=Mark Braverman |last3=Chen |first3=Xi |last4=Rao |first4=Anup |date=2013 |title=How to Compress Interactive Communication |url=https://www.boazbarak.org/Papers/directsum.pdf |journal=SIAM Journal on Computing |volume=42 |issue=3 |pages=1327–1363 |doi=10.1137/100811969}}</ref> and by Braverman and Rao.<ref>{{cite journal |last1=Braverman |first1=Mark |author-link1=Mark Braverman |last2=Rao |first2=Anup |title=Information equals amortized communication |year=2014 |journal=IEEE Transactions on Information Theory |volume=60 |issue=10 |pages=6058–6069 |doi=10.1109/TIT.2014.2347282|arxiv=1106.3595 }}</ref>

The (internal) information complexity of a (possibly randomized) protocol {{mvar|R}} with respect to a distribution {{mvar|μ}} is defined as follows. Let <math>(X,Y) \sim \mu</math> be random inputs sampled according to {{mvar|μ}}, and let {{mvar|Π}} be the transcript of {{mvar|R}} when run on the inputs <math>X,Y</math>. The information complexity of the protocol is
:<math> \operatorname{IC}_\mu(R) = I(\Pi;Y|X) + I(\Pi;X|Y), </math>
where {{mvar|I}} denotes [[Information_theory#Mutual_information_(transinformation)|conditional mutual information]]. The first summand measures the amount of information that Alice learns about Bob's input from the transcript, and the second measures the amount of information that Bob learns about Alice's input. The {{mvar|ε}}-error information complexity of a function {{mvar|f}} with respect to a distribution {{mvar|μ}} is the infimal information complexity of a protocol for {{mvar|f}} whose error (with respect to {{mvar|μ}}) is at most {{mvar|ε}}.
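To illustrate the definition, the following Python sketch (a toy example, not a standard implementation) computes <math>\operatorname{IC}_\mu(R)</math> exactly for the trivial protocol in which Alice sends her one-bit input, with <math>\mu</math> uniform over independent bits; here <math>I(\Pi;X\mid Y) = 1</math> bit and <math>I(\Pi;Y\mid X) = 0</math>:

<syntaxhighlight lang="python">
from itertools import product
from math import log2

def cond_mutual_info(joint, a, b, c):
    """I(A; B | C) in bits, from a joint distribution {(outcome tuple): probability}."""

    def marginal(idxs):
        m = {}
        for outcome, p in joint.items():
            key = tuple(outcome[i] for i in idxs)
            m[key] = m.get(key, 0.0) + p
        return m

    p_abc, p_ac, p_bc, p_c = marginal([a, b, c]), marginal([a, c]), marginal([b, c]), marginal([c])
    return sum(p * log2(p * p_c[(vc,)] / (p_ac[(va, vc)] * p_bc[(vb, vc)]))
               for (va, vb, vc), p in p_abc.items() if p > 0)

# Toy protocol: X and Y are independent uniform bits and the transcript is simply Pi = X.
# Outcomes are tuples (x, y, pi), so index 0 is X, index 1 is Y, index 2 is Pi.
joint = {(x, y, x): 0.25 for x, y in product([0, 1], repeat=2)}

ic = cond_mutual_info(joint, 2, 1, 0) + cond_mutual_info(joint, 2, 0, 1)
#     I(Pi; Y | X) = 0                  I(Pi; X | Y) = 1
print(ic)  # 1.0 bit: Bob learns Alice's bit, Alice learns nothing about Bob's
</syntaxhighlight>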
Braverman and Rao proved that information equals amortized communication: the cost of solving {{mvar|n}} independent copies of {{mvar|f}} is roughly {{mvar|n}} times the information complexity of {{mvar|f}}. This is analogous to the well-known interpretation of [[Entropy_(information_theory)|Shannon entropy]] as the amortized bit-length required to transmit data from a given information source. Their proof uses a technique known as "protocol compression", in which an information-efficient protocol is "compressed" into a communication-efficient protocol.

The techniques of information complexity make it possible to compute the communication complexity of set disjointness exactly up to first order: it is <math>1.4923\ldots n</math>.<ref>{{cite conference |url= |title= |last1=Braverman |first1=Mark |author-link1=Mark Braverman |last2=Garg |first2=Ankit |last3=Pankratov |first3=Denis |last4=Weinstein |first4=Omri |date=June 2013 |publisher=ACM |book-title=STOC '13: Proceedings of the forty-fifth annual ACM symposium on Theory of Computing |pages=151–160 |location=Palo Alto, CA |doi=10.1145/2488608.2488628|isbn= 978-1-4503-2029-0}}</ref> Information complexity techniques have also been used to analyze extended formulations, proving an essentially optimal lower bound on the complexity of algorithms based on [[linear programming]] which approximately solve the [[clique problem|maximum clique problem]].<ref>{{cite conference |url=https://eccc.weizmann.ac.il/report/2012/131/ |title=An information complexity approach to extended formulations |last1=Braverman |first1=Mark |author-link1=Mark Braverman (mathematician) |last2=Moitra |first2=Ankur |date=1 June 2013 |publisher=ACM |book-title=STOC '13: Proceedings of the forty-fifth annual ACM symposium on Theory of Computing |pages=161–170 |location=Palo Alto, CA |doi=10.1145/2488608.2488629}}</ref> A 2015 survey by Omri Weinstein<ref>{{cite journal |last=Weinstein |first=Omri |date=June 2015 |title=Information Complexity and the Quest for Interactive Compression |url=https://eccc.weizmann.ac.il/report/2015/060/ |journal=ACM SIGACT News |volume=46 |issue=2 |pages=41–64 |doi=10.1145/2789149.2789161 |access-date=1 December 2023}}</ref> covers the subject.