Switching lemma
The proof is adapted from Hamed Hatami’s lecture notes for “COMP 531: Advanced Complexity Theory” at McGill. Hamed Hatami doesn’t condone my crappy metaphors.
This note presents Håstad’s original proof of his switching lemma.
Hermits and terrorists
You’re at a board game night, playing a very disturbing version of one-night werewolf. You don’t know how many players there are, but you know that tonight:
- some of the villagers are hermits, who will find their vocation and leave the village to go live in isolation;
- those who are not hermits have probability $q$ of being a terrorist, who will blow themselves up and kill everyone;
- the rest are normal villagers, who are fast asleep.
What’s the chance that at least $s$ villagers are still peacefully asleep in the village tomorrow morning?
First of all, we can just disregard all the hermits: they leave, but they harm no one. We don’t know how many non-hermits there are, but if at least $s$ villagers are to stay asleep, then there must be at least $s$ non-hermits, and none of them can be a terrorist (a single one would kill everybody). Each non-hermit independently avoids being a terrorist with probability $1-q$, so the answer is at most $(1-q)^s$.

Here, crucially, we assumed they all independently had a probability $q$ of being a terrorist.
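This bound is easy to check numerically. Here is a small sketch (all parameters are made up for illustration) that computes the exact probability and compares it to $(1-q)^s$:

```python
from math import comb

def pr_at_least_s_asleep(n, h, q, s):
    """Pr[at least s villagers are still asleep tomorrow]:
    each of the n villagers is a hermit w.p. h; otherwise a terrorist w.p. q.
    One terrorist kills everyone, so >= s sleepers requires
    >= s non-hermits AND zero terrorists among them."""
    total = 0.0
    for k in range(s, n + 1):  # k = number of non-hermits
        p_k = comb(n, k) * (1 - h) ** k * h ** (n - k)
        total += p_k * (1 - q) ** k  # none of the k is a terrorist
    return total

# Hypothetical parameters (not from the text):
n, h, q, s = 20, 0.6, 0.3, 5
assert pr_at_least_s_asleep(n, h, q, s) <= (1 - q) ** s
```

The assertion holds because $(1-q)^k \le (1-q)^s$ for every $k \ge s$, and the weights sum to at most $1$.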
ANDs and ORs
The politically incorrect situation above is analogous to what happens to DNFs under random restrictions. Take some term $C$ of a DNF $F$, and hit $F$ with a random restriction $\rho$ that independently sets each variable to $*$ (i.e. leaves it unfixed) with probability $p$, and to $0$ or $1$ with probability $\frac{1-p}{2}$ each.

- If any of the literals in $C$ gets falsified, then $C$ evaluates to $0$ and disappears from the DNF without affecting the other terms (i.e. $C$ is a hermit).1
- On the other hand, assuming $C$ doesn’t get set to $0$ (all its literals were set to $*$ or satisfied), there is a chance that all of the variables in $C$ get set to values satisfying their literals, in which case $C$ evaluates to $1$ and the whole DNF collapses to the constant $1$ (i.e. $C$ is a terrorist).
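To make the correspondence concrete, here is a small sketch (the representation of terms and restrictions is my own, not from the notes) that classifies a term under a restriction:

```python
import random

def restrict_term(term, rho):
    """term: dict var -> required bit (an AND of literals).
    rho: dict var -> 0, 1, or '*'.
    Returns '0' (hermit), '1' (terrorist), or the surviving sub-term."""
    surviving = {}
    for v, bit in term.items():
        val = rho[v]
        if val == '*':
            surviving[v] = bit  # literal left unfixed
        elif val != bit:
            return '0'          # a literal is falsified: the term vanishes
    if not surviving:
        return '1'              # every literal satisfied: the DNF collapses to 1
    return surviving

def random_restriction(variables, p, rng=random):
    """Each variable independently gets '*' w.p. p, else a uniform bit."""
    return {v: '*' if rng.random() < p else rng.randrange(2)
            for v in variables}
```

For instance, `restrict_term({'x': 1, 'y': 0}, {'x': 0, 'y': '*'})` returns `'0'`: the term is a hermit.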
As long as the terms don’t share variables, the same reasoning as with the villagers applies: each non-hermit term independently has some probability $q$ of being a terrorist, so the probability that at least $s$ terms survive without the DNF collapsing to $1$ is at most $(1-q)^s$. But the terms of a DNF tend to share a lot of variables,2 so the events that two terms $C_i$ and $C_j$ are terrorists are far from independent.
To rephrase, what goes wrong is: the terms shared some stars, which allowed them both to avoid being terrorists without it being a huge surprise.
It turns out this is the only issue! So if, after looking at each non-hermit term, we also fix the values of its starred variables before moving on, then the remaining terms behave independently again: each of them gets a fresh chance of being a hermit or a terrorist.

This insight then allows us to show that the probability that $F|_\rho$ requires decision-tree depth $\ge s$ decreases exponentially with $s$.
What we lost: a union bound
However, we did lose something: we now have to union-bound over all the possible assignments of those starred variables. And this union bound over the $2^{|Y|}$ assignments of a set $Y$ of stars costs us a factor $2^{|Y|}$, turning the factor $(2p)^{|Y|}$ coming from the stars into $(4p)^{|Y|}$, where each starred variable contributes one factor of $2$.

Nevertheless, this loss is not actually problematic for, say, $p \le \frac{1}{10w}$: the factor $4p$ per star is still small enough for the final bound to decrease exponentially.
Statement and proof
Let $F = C_1 \vee \cdots \vee C_m$ be a DNF whose terms each contain at most $w$ variables, and let $\rho$ be a random restriction that independently sets each variable to $*$ with probability $p$, and to $0$ or $1$ with probability $\frac{1-p}{2}$ each. Then

$$\Pr_\rho\big[\mathrm{DT}(F|_\rho) \ge s\big] \le (Cpw)^s$$

(for some constant $C$; Håstad’s computation gives $C = 5$), where $\mathrm{DT}$ denotes decision-tree depth.
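As a sanity check, one can estimate this probability by brute force on a toy example. The sketch below (the DNF, the parameters, and the representation are all invented for illustration) computes the exact decision-tree depth of the restricted function:

```python
import random
from itertools import product

def dt_depth(table):
    """Exact decision-tree depth of a boolean function given as a
    truth table (tuple of 0/1 values; input index read MSB-first)."""
    if len(set(table)) == 1:
        return 0  # constant function
    n = len(table).bit_length() - 1
    best = n
    for i in range(n):  # try querying variable i first
        lo, hi = [], []
        for idx, val in enumerate(table):
            (hi if (idx >> (n - 1 - i)) & 1 else lo).append(val)
        best = min(best, 1 + max(dt_depth(tuple(lo)), dt_depth(tuple(hi))))
    return best

def eval_dnf(terms, x):
    """terms: list of {variable: required bit}; x: total assignment."""
    return int(any(all(x[v] == b for v, b in t.items()) for t in terms))

def restricted_table(terms, rho, stars):
    """Truth table of F|rho as a function of the starred variables."""
    rows = []
    for bits in product((0, 1), repeat=len(stars)):
        x = {v: val for v, val in rho.items() if val != '*'}
        x.update(dict(zip(stars, bits)))
        rows.append(eval_dnf(terms, x))
    return tuple(rows)

# A made-up width-2 DNF on 6 variables: (x0 x1) v (x2 !x3) v (!x4 x5) v (!x1 x4)
terms = [{0: 1, 1: 1}, {2: 1, 3: 0}, {4: 0, 5: 1}, {1: 0, 4: 1}]
p, w, s, trials = 0.05, 2, 2, 2000
hits = 0
for _ in range(trials):
    rho = {v: '*' if random.random() < p else random.randrange(2) for v in range(6)}
    stars = [v for v in range(6) if rho[v] == '*']
    if dt_depth(restricted_table(terms, rho, stars)) >= s:
        hits += 1
print(f"empirical Pr[DT >= {s}]: {hits / trials:.4f}, bound (5pw)^s: {(5 * p * w) ** s:.4f}")
```

The empirical frequency should come out well below the bound on such a small example; the exhaustive `dt_depth` is only feasible because the number of stars stays tiny.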
We want to show this by induction: hermit terms can be discarded without affecting anything, so we can focus on the first non-hermit term, which either collapses the DNF to $1$ or contains stars whose values we can fix before recursing on the rest of the DNF.
What we’ll actually prove is that for any auxiliary function $G$,

$$\Pr_\rho\big[\mathrm{DT}(F|_\rho) \ge s \mid G|_\rho \equiv 0\big] \le (Cpw)^s.$$

This condition “$G|_\rho \equiv 0$” lets us carry along, through the induction, everything we have learned about $\rho$ so far.
We’ll induct on the number of terms $m$: if $F$ has no terms, then $F|_\rho \equiv 0$ and its depth is $0$. Otherwise, we can split the event “$\mathrm{DT}(F|_\rho) \ge s$” depending on whether the first term $C_1$ is a hermit (i.e. $C_1|_\rho \equiv 0$) or not.
Case 1: $C_1$ is a hermit
Nothing interesting happens here: we’re just showing that “conditioning on the first few terms being hermits” can be achieved by “conditioning on some auxiliary function collapsing to $0$”.
We want to bound $\Pr[\mathrm{DT}(F|_\rho) \ge s \mid C_1|_\rho \equiv 0,\ G|_\rho \equiv 0]$. Let $F' := C_2 \vee \cdots \vee C_m$ be the rest of the DNF.

- Once we’ve conditioned on $C_1$ collapsing to $0$, then $F|_\rho$ is equivalent to $F'|_\rho$, and so their depths are the same.
- Conditioning on $G$ and $C_1$ both collapsing to $0$ is the same as conditioning on their OR collapsing to $0$.

So we can rewrite

$$\Pr[\mathrm{DT}(F|_\rho) \ge s \mid C_1|_\rho \equiv 0,\ G|_\rho \equiv 0] = \Pr[\mathrm{DT}(F'|_\rho) \ge s \mid (G \vee C_1)|_\rho \equiv 0],$$

and then the induction hypothesis for $F'$ (with auxiliary function $G \vee C_1$) bounds this by $(Cpw)^s$.
Case 2: $C_1$ is not a hermit
This is the interesting case.
Without loss of generality (up to a flipping of the variables), we can assume that all the literals of $C_1$ are positive, i.e. $C_1 = x_1 \wedge \cdots \wedge x_k$ for some $k \le w$. Since $C_1$ is not a hermit, each of its variables got set to $1$ or $*$. We’ll show two things:

- the probability to get $\ell$ stars in $C_1$ decreases fast as $\ell$ grows;
- the probability that “the rest of $F$ after fixing those stars” has depth $\ge s - \ell$ is small.
First, split into cases based on the subset $Y \subseteq \{x_1, \ldots, x_k\}$ of variables of $C_1$ that get a star:

$$\Pr[\mathrm{DT}(F|_\rho) \ge s \mid C_1|_\rho \not\equiv 0,\ G|_\rho \equiv 0] \le \sum_{\emptyset \ne Y} \Pr[Y \leftarrow * \mid C_1|_\rho \not\equiv 0,\ G|_\rho \equiv 0] \cdot \Pr[\mathrm{DT}(F|_\rho) \ge s \mid Y \leftarrow *,\ G|_\rho \equiv 0]$$

(note that we’re using shortcuts like “$Y \leftarrow *$” to denote the event that the variables in $Y$ get a star while the other variables of $C_1$ get set to $1$; the case $Y = \emptyset$ makes $C_1$ a terrorist, so $F|_\rho \equiv 1$ has depth $0$ and can be dropped).
Part (a): $C_1$ is unlikely to have many stars
First, we deal with the factor $\Pr[Y \leftarrow * \mid C_1|_\rho \not\equiv 0,\ G|_\rho \equiv 0]$.

We want to somehow argue that conditioning on “$G|_\rho \equiv 0$” doesn’t make stars more likely.

We’re comparing two situations. In both of them, we’re forcing the variables of $C_1$ outside of $Y$ to be $1$; the difference is whether the variables in $Y$ get a $*$ or a $1$. Intuitively, the conditioning “$G|_\rho \equiv 0$” can only favor the all-$1$ situation, since fixing more variables only makes it easier for $G$ to collapse to $0$.5

Therefore, we obtain6

$$\Pr[Y \leftarrow * \mid C_1|_\rho \not\equiv 0,\ G|_\rho \equiv 0] \le \left(\frac{2p}{1+p}\right)^{|Y|} \le (2p)^{|Y|}.$$
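For a single variable $x_i$ of $C_1$, the conditional probability of a star can be computed directly from the distribution of $\rho$ (a short derivation, using only the definition from the statement):

```latex
% x_i is not set to 0 (C_1 is not a hermit), so it is either * or 1:
\Pr[\rho(x_i) = * \mid \rho(x_i) \neq 0]
  = \frac{\Pr[\rho(x_i) = *]}{\Pr[\rho(x_i) = *] + \Pr[\rho(x_i) = 1]}
  = \frac{p}{p + \frac{1-p}{2}}
  = \frac{2p}{1+p}
  \;\le\; 2p.
```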
Part (b): the rest is unlikely to be deep
To bound the probability $\Pr[\mathrm{DT}(F|_\rho) \ge s \mid Y \leftarrow *,\ G|_\rho \equiv 0]$, observe that a decision tree for $F|_\rho$ can start by querying the variables of $Y$, so $\mathrm{DT}(F|_\rho) \le |Y| + \max_\sigma \mathrm{DT}(F|_{\rho, Y \leftarrow \sigma})$. Thus if $\mathrm{DT}(F|_\rho) \ge s$, then $\mathrm{DT}(F|_{\rho, Y \leftarrow \sigma}) \ge s - |Y|$ for some assignment $\sigma \in \{0,1\}^Y$, and we can union-bound over the $2^{|Y|}$ choices of $\sigma$.

- If $\sigma$ sets all of $Y$ to $1$, then $C_1$ gets set to $1$, so $F|_{\rho, Y \leftarrow \sigma} \equiv 1$, so $\mathrm{DT}(F|_{\rho, Y \leftarrow \sigma}) = 0$ and we can just drop that case.
- Otherwise, $C_1$ gets set to $0$, so $F|_{\rho, Y \leftarrow \sigma} = F'|_{\rho, Y \leftarrow \sigma}$ where $F' := C_2 \vee \cdots \vee C_m$. This will allow us to induct on $F'$, which has fewer terms than $F$.
But in order to apply the inductive hypothesis, we also need to somehow get rid of the conditioning on “$Y \leftarrow *$”. Two observations:

- conditioned on “$Y \leftarrow *$”, the event “$G|_\rho \equiv 0$” holds iff it holds under every possible fixing of the variables in $Y$: that is, $G$ can be replaced by $\bigvee_{\tau \in \{0,1\}^Y} G_{Y \leftarrow \tau}$;7
- to make sure that the variables of $C_1$ outside of $Y$ get set to $1$, it suffices to add the terms $\overline{x_i}$ for $x_i \in \mathrm{vars}(C_1) \setminus Y$ to $G$.

All in all, the conditioning “$Y \leftarrow *$ and $G|_\rho \equiv 0$” can be replaced by a conditioning of the form “$G'|_\rho \equiv 0$”, where $G'$ doesn’t depend on the variables in $Y$.
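The first of these observations can be written out as follows (a sketch, with $G_{Y \leftarrow \tau}$ denoting $G$ with the variables of $Y$ fixed to $\tau$):

```latex
% Conditioned on "Y <- *", the starred variables are left free by rho,
% so G|_rho is identically 0 iff it stays 0 under every fixing of Y:
G|_\rho \equiv 0
  \iff \forall \tau \in \{0,1\}^Y :\; (G_{Y \leftarrow \tau})|_\rho \equiv 0
  \iff \Big(\bigvee_{\tau \in \{0,1\}^Y} G_{Y \leftarrow \tau}\Big)\Big|_\rho \equiv 0,
```

and the right-hand side no longer mentions the variables of $Y$.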
This gives us

$$\Pr[\mathrm{DT}(F|_\rho) \ge s \mid Y \leftarrow *,\ G|_\rho \equiv 0] \le \sum_{\sigma \ne 1^Y} \Pr\big[\mathrm{DT}\big((F'_{Y \leftarrow \sigma})|_\rho\big) \ge s - |Y| \,\big|\, Y \leftarrow *,\ G'|_\rho \equiv 0\big] = \sum_{\sigma \ne 1^Y} \Pr\big[\mathrm{DT}\big((F'_{Y \leftarrow \sigma})|_\rho\big) \ge s - |Y| \,\big|\, G'|_\rho \equiv 0\big] \le 2^{|Y|} (Cpw)^{s - |Y|},$$

where the equality holds because neither $F'_{Y \leftarrow \sigma}$ nor $G'$ depends on the variables in $Y$, so the conditioning “$Y \leftarrow *$” can be dropped, and the last inequality is the induction hypothesis applied to $F'_{Y \leftarrow \sigma}$ (which has fewer terms than $F$).
Putting (a) and (b) together
Coming back to our sum, we have

$$\Pr[\mathrm{DT}(F|_\rho) \ge s \mid C_1|_\rho \not\equiv 0,\ G|_\rho \equiv 0] \le \sum_{\ell = 1}^{w} \binom{w}{\ell} (2p)^\ell \cdot 2^{\ell} (Cpw)^{s - \ell} = (Cpw)^s \sum_{\ell = 1}^{w} \binom{w}{\ell} \left(\frac{4}{Cw}\right)^{\ell}.$$

It then suffices to set $C$ large enough that the last sum is at most $1$: since $\binom{w}{\ell} \le \frac{w^\ell}{\ell!}$, the sum is at most $e^{4/C} - 1$, which is less than $1$ for, e.g., $C = 7$ (Håstad’s more careful computation yields $C = 5$).
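The last step is easy to double-check numerically (the constant $7$ below matches the crude bound above, not Håstad’s finer computation):

```python
from math import comb, exp

def final_sum(w, C):
    """sum_{l=1}^{w} binom(w, l) * (4 / (C*w))**l,
    the series that must stay below 1 for the induction to close."""
    return sum(comb(w, l) * (4 / (C * w)) ** l for l in range(1, w + 1))

# The crude bound: the sum is at most e^{4/C} - 1, which is < 1 once C >= 7.
for w in (1, 2, 5, 20, 100):
    assert final_sum(w, 7) <= exp(4 / 7) - 1 < 1
```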
- When $w$ is at least somewhat large, being a hermit is by far the most common outcome: the chance that no variable in $C$ gets set to $0$ is only $\left(\frac{1+p}{2}\right)^{w}$. But Håstad’s switching lemma doesn’t use this fact at all! ↩
- If they don’t share any variables (i.e. if $F$ is read-once), then the above reasoning is completely valid, though! ↩
- To make this complete, we also have to argue that $|Y|$ won’t be too big, but this is the case. In fact, the probability that $C_1$ has $\ell$ stars decreases as $(2pw)^\ell$, because we’re counterbalancing the union bound over the $\binom{w}{\ell} \le w^\ell$ choices of the starred variables with a probability of $(2p)^\ell$ per choice. ↩ ↩2
- Johan Håstad, “Computational limitations of small-depth circuits” (PhD thesis). ↩
- This is not quite formal, but can be made formal by first drawing the coordinates of $\rho$ outside of $Y$, then drawing the coordinates in $Y$ (which is okay to do because all coordinates of $\rho$ are independent). ↩
- Note that we’re not using the factor $\left(\frac{1-p}{1+p}\right)^{|\mathrm{vars}(C_1) \setminus Y|}$ that we could have gotten from the binomial distribution of the stars. In this case, it would only help us improve the constant slightly. ↩
- Instead of applying this ORing trick, Håstad instead makes the induction hypothesis more general by allowing $F$ and $G$ to be defined on a subset $V$ of the variables, whereas $\rho$ remains a restriction of all variables ($\rho$ is understood to leave the variables outside of $V$ unfixed). In that case, we can replace $G$ by its projection on $V \setminus Y$, which doesn’t change anything because under the assumption “$Y \leftarrow *$”, the variables of $Y$ remain free in $G|_\rho$. And once that’s done, the conditioning disappears because the projection doesn’t depend on the variables in $Y$. ↩