Adapted from “The story of set disjointness” by Arkadev Chattopadhyay and Toni Pitassi.
An lower bound on the randomized communication complexity of set disjointness.
Theorem
Let be the set disjointness function: that is, given any sets , returns true if and are disjoint: . Then the randomized communication complexity of is .
More precisely, there is some constant probability of failure such that any randomized communication protocol that succeeds with probability on each input must use bits of communication.
Proof
Plan
By Algorithmic minimax principle, it is enough to show a distribution over inputs on which any deterministic protocol needs communication in order to succed with probability on average. Here, we will consider the (product) distribution where both and are chosen uniformly at random in .
We’ll show that the corresponding communication matrix has no large “nearly -monochromatic” rectangles. That is, for any subsets , either
- or only represents a fraction of ;
- or a fraction of the pairs intersect.
Intuitively, we want to show that if and are large, then they must both be pretty equally spread over , and if that’s the case, they must intersect a lot. However, there doesn’t seem to be a clean notion of “equally spread” that you can prove independently for and , so instead we’ll:
- assume that and is large, and and don’t intersect much;
- find a small number of sets in that cover much of and yet don’t intersect too much;
- show that this limits the size of .
Execution
Suppose that , and that only an fraction of the pairs intersect. Some sets might be unusual in that they intersect with much more than an fraction of the sets in , but at least half of them intersect with at most a fraction of . Let be the family of such good sets; we have .
Then, find sets such that . This is always possible given how big is: Just choose those sets greedily to increase the size of the union as quickly as possible. If at some point and you can’t find some that extends the size of the union by at least , that would show that there are at most
elements left in .
Now (using the same trick as earlier), the average set in intersects only a fraction of the sets , so at least half of them intersect at most of them. Let be the family of such good sets; we have .
Each set can only intersect the union at points, and the rest of must lie entirely in the remaining elements of . Therefore,
We can upper bound by , and by choosing small enough, we can upper bound by for any we want, so we get
as desired.