Adapted from lectures 10 and 11 of Ryan O’Donnell’s Fall 2017 graduate complexity theory class at CMU.

In Buying randomness with quantifiers, we saw that in some cases, we can trade randomness for ∃ or ∀ quantifiers, proving that BPP is in P/poly and Σ₂P. In this note, we extract the essence of those two proofs and apply the tricks more generally, with a focus on Arthur-Merlin complexity classes.

An interlude on quantifier notation

Since both tricks we’ve seen allowed us to “play with the quantifiers” in the definitions of complexity classes in specific ways, it would be useful to make this more systematic by encapsulating the definitions of many complexity classes using quantifiers only.

For example, Σ₂P corresponds to the languages L for which there is a verifier V such that¹

  • if x ∈ L, then ∃y, ∀z: V(x,y,z) = 1;
  • if x ∉ L, then ∀y, ∃z: V(x,y,z) = 0.

Because the quantifiers are ∃∀ in one case and ∀∃ in the other, we will simply call this class “∃∀/∀∃”. In general, by “Q₁⋯Qₖ/Q′₁⋯Q′ₖ” (where Qᵢ, Q′ᵢ are quantifiers in ∃, ∀, Я), we mean the class of all languages L for which there is a verifier V such that

  • if x ∈ L, then Q₁y₁, …, Qₖyₖ: V(x, y₁, …, yₖ) = 1;
  • if x ∉ L, then Q′₁y₁, …, Q′ₖyₖ: V(x, y₁, …, yₖ) = 0.

As such, the reader is encouraged to verify that P = /, NP = ∃/∀, coNP = ∀/∃, Π₂P = ∀∃/∃∀, BPP = Я/Я, RP = Я/∀, coRP = ∀/Я, MA = ∃Я/∀Я, and AM = Я∃/Я∀. In general, to take the “co” of a class, you simply swap both sides of the slash.
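To make Я concrete: “Яy: φ(y)” should be read as “φ(y) holds for most y”, where “most” means (say) at least a 2/3 fraction (the exact threshold is just a convention for the sketches in this note, since any constant above 1/2 can be boosted). For instance, unpacking RP = Я/∀ in probability notation recovers the usual definition:

  • if x ∈ L, then Pr_r[V(x,r) = 1] ≥ 2/3;
  • if x ∉ L, then ∀r: V(x,r) = 0.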

The tricks in our hands

Let’s recall the tools at our disposal: three easy ones, and the two we just proved.

Weaken: ∀ ⇒ Я ⇒ ∃

This one is purely logical: we can always turn a ∀ into a Я, and a Я into an ∃, because this only weakens the guarantee. In words: “if all, then most, and if most, then some”. As an example, this shows that RP = Я/∀ ⊆ ∃/∀ = NP.

Specialize: ∃∀ ⇒ ∀∃

Another purely logical one: we can swap ∃∀ into ∀∃, because all this does is allow the ∃ to specialize its choice to each value of the ∀ variable. In words: “if a song is liked by everyone, then everyone has a song that they like”.

In fact, this principle also extends well to Я,² if you think of Я as sort of “halfway in between ∀ and ∃”:

  • you can swap ∃Я ⇒ Я∃, because “if a song is liked by most people, then most people have a song that they like” (this swap is spelled out in probability form below);
  • you can swap Я∀ ⇒ ∀Я, because “if most people like all songs, then all songs are liked by most people”.
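Here is that first swap in probability form, as a quick sketch using the 2/3 convention for Я from above: if some fixed song y₀ works for most people z, then y₀ itself witnesses the inner ∃ for most z:

∃y₀ with Pr_z[φ(y₀, z)] ≥ 2/3  ⟹  Pr_z[∃y: φ(y, z)] ≥ Pr_z[φ(y₀, z)] ≥ 2/3.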

Combine: ∃∃ = ∃

If we have the same quantifier twice in a row, then we can reduce it to a single quantifier. For example, suppose we were to define a dumb class ∃∃/∀∀, i.e.

  • if x ∈ L, then ∃y, ∃z: V(x,y,z) = 1;
  • if x ∉ L, then ∀y, ∀z: V(x,y,z) = 0.

Then we could just combine strings y and z together:

  • if x ∈ L, then ∃(y,z): V(x,y,z) = 1;
  • if x ∉ L, then ∀(y,z): V(x,y,z) = 0.

And this is now functionally equivalent to ∃/∀ = NP.
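Concretely, the combined verifier is just the old one with its inputs re-bracketed; a one-line sketch, where V′ is a name introduced here for illustration:

V′(x, w) := V(x, y, z) where w = (y, z),

so that ∃y, ∃z: V(x,y,z) = 1 holds exactly when ∃w: V′(x,w) = 1, and similarly with ∀ for the second bullet.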

Union bound: ∀Я = Я∀

The key to the proof of BPP ⊆ P/poly was to boost the success probability so much that we can go from “for all inputs x, most random strings r give the correct answer” to “most random strings r give the correct answer for all inputs x” by a union bound. In general, we can always³ flip ∀Я into Я∀ (even though this is the “wrong” direction in terms of direct implication).

In words: “if there are more songs than people, and everyone likes all but a handful of songs, then most songs are liked by literally everyone”.
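For the record, here is the quantitative step behind this, a sketch with assumed parameters following the BPP ⊆ P/poly proof: boost the error so that each input x ∈ {0,1}ⁿ is answered incorrectly by at most a 2⁻²ⁿ fraction of random strings r. Then, by a union bound over all 2ⁿ inputs,

Pr_r[r is wrong on some x ∈ {0,1}ⁿ] ≤ 2ⁿ · 2⁻²ⁿ = 2⁻ⁿ,

so all but a 2⁻ⁿ fraction of random strings are correct on every input simultaneously.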

Squint: Я/Я ⇒ Я∀/∀Я

In the proof that BPP ⊆ Σ₂P, we saw that if we have a set S of random strings covering almost all of {0,1}ˢ, then the union of random offset copies (S⊕d⁽¹⁾) ∪ ⋯ ∪ (S⊕d⁽ˢ⁾) covers all of {0,1}ˢ with high probability, while if S contains almost none of {0,1}ˢ, then no matter what offsets you choose, the offset copies will still cover only a tiny fraction of {0,1}ˢ. That is, after boosting the success probability,

  • Яr: V(x,r) = 1 implies Яd⁽¹⁾, …, d⁽ˢ⁾, ∀r: V(x, r⊕d⁽¹⁾) ∨ ⋯ ∨ V(x, r⊕d⁽ˢ⁾) = 1;
  • Яr: V(x,r) = 0 implies ∀d⁽¹⁾, …, d⁽ˢ⁾, Яr: V(x, r⊕d⁽¹⁾) ∨ ⋯ ∨ V(x, r⊕d⁽ˢ⁾) = 0.

So⁴ we can transform Я/Я into Я∀/∀Я.

In words: “an almost full jar will typically look full if you squint, but an almost empty jar will always look mostly empty no matter how you squint”.

By taking the “co” of both sides, we also get Я/Я ⇒ ∀Я/Я∀. This corresponds to taking the intersection of the offset copies (S⊕d⁽¹⁾) ∩ ⋯ ∩ (S⊕d⁽ˢ⁾) instead of the union (and therefore the AND instead of the OR). This ensures that when S covers almost all of {0,1}ˢ, the intersection is still big, while if S contains almost none of {0,1}ˢ, the intersection will be completely empty with high probability.
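For the record, here is the counting behind squinting, as a sketch with assumed parameters: boost the error below 2⁻ⁿ, let s = |r| be the length of the boosted random strings, and write S = {r : V(x,r) = 1}.

  • If x ∈ L, then for each fixed r and a uniform offset d, Pr_d[r ∉ S⊕d] ≤ 2⁻ⁿ (since r⊕d is uniform), so over independent uniform offsets, Pr[r ∉ (S⊕d⁽¹⁾) ∪ ⋯ ∪ (S⊕d⁽ˢ⁾)] ≤ (2⁻ⁿ)ˢ; a union bound over all 2ˢ strings r shows that random offsets fail to cover everything with probability at most 2ˢ · (2⁻ⁿ)ˢ = (2¹⁻ⁿ)ˢ ≪ 1.
  • If x ∉ L, then |S| ≤ 2⁻ⁿ · 2ˢ, so no matter the offsets, the union (S⊕d⁽¹⁾) ∪ ⋯ ∪ (S⊕d⁽ˢ⁾) contains at most s · 2⁻ⁿ · 2ˢ strings: a vanishing fraction of {0,1}ˢ.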

A slurry of corollaries

We’ve done all the hard thinking work, now we get to play around with quantifiers!

An alternative characterization of BPP

The proof that BPP ⊆ Σ₂P that we saw before really went through an intermediate class Я∀/∀Я:

BPP = Я/Я ⊆[squint] Я∀/∀Я ⊆[weaken] ∃∀/∀∃ = Σ₂P.

As it turns out, Я∀/∀Я is equal to BPP, since

Я∀/∀Я ⊆[weaken] ЯЯ/ЯЯ =[combine] Я/Я = BPP.
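The combine step ЯЯ = Я deserves a word, since Я carries a probability threshold; here is a sketch with illustrative numbers. If, after boosting, at least a 0.9 fraction of y are such that at least a 0.9 fraction of z satisfy the guarantee, then

Pr_{(y,z)}[the guarantee holds] ≥ 0.9 · 0.9 = 0.81 ≥ 2/3,

so a single Я over the pair (y, z) suffices. The converse direction (splitting a Я over pairs into ЯЯ at slightly worse thresholds) is a similar averaging argument.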

One-sided error for MA and AM

Both MA and AM correspond to two-round proof protocols between Arthur, a weak but honest mortal with the power of BPP, and Merlin, a powerful but untrustworthy wizard with unbounded computational power (Merlin kind of plays the role of NP here).

The two classes vary only in who goes first:

  • In MA := ∃Я/∀Я, Merlin first sends a purported “proof” that x ∈ L, then Arthur uses randomness to “verify” the proof.
  • In AM := Я∃/Я∀, Arthur first uses randomness to create an NP-type “challenge” for Merlin (e.g. a SAT instance), then Merlin tries to find a “solution” to the challenge.

The same squinting trick that gives “absolute guarantees” for BPP at the cost of an extra quantifier can also give absolute guarantees for MA and AM, but this time for free: Merlin’s ∃/∀ quantifier can absorb one of the two quantifiers that squinting introduces.

Formally,

MA = ∃Я/∀Я ⊆[squint] ∃(Я∀)/∀(∀Я) ⊆[weaken] ∃∃∀/∀∀Я =[combine] ∃∀/∀Я.

In effect, we asked Merlin to attach to his proof the offsets d⁽¹⁾, …, d⁽ˢ⁾ that make sure that Arthur always accepts when x ∈ L. This means that there are no “false negatives” anymore. Also, since ∃∀/∀Я ⊆ ∃∀/∀∃ by weakening, this shows that MA ⊆ Σ₂P.

Similarly, for AM,

AM = Я∃/Я∀ ⊆[squint] (∀Я)∃/(Я∀)∀ ⊆[weaken] ∀∃∃/Я∀∀ =[combine] ∀∃/Я∀.

This time, we used the “co” version of squinting, and Arthur is only drawing the offsets, while Merlin tries to find a random string such that the verifier accepts on all the offsets. When x ∈ L, Merlin is always able to succeed, so there are no “false negatives”. Also, since ∀∃/Я∀ ⊆ ∀∃/∃∀ by weakening, this implies AM ⊆ Π₂P.

MA ⊆ AM

On a surface level, MA and AM seem incomparable: they’re the same class with the order of quantifiers flipped, just like Σ₂P and Π₂P. But it turns out that MA ⊆ AM: it’s always better for Arthur to go first.

  • For x ∈ L, this is true because ∃Я logically implies Я∃: the only thing that changes between MA and AM is that Merlin now gets to see Arthur’s coin flips before sending his message, so that can only help him make Arthur accept.
  • For x ∉ L, on the other hand (and for the same reasons), ∀Я does not directly imply Я∀: even if no message of Merlin can fool Arthur on average when Merlin goes first, once the order is flipped, Merlin might be able to do better by adaptively choosing which message to send based on Arthur’s message. But if the probability Arthur is fooled is small compared to the number of possible messages for Merlin, then most of the time Arthur cannot be fooled by any message.

In math, this gives

MA = ∃Я/∀Я ⊆[specialize] Я∃/∀Я =[union bound] Я∃/Я∀ = AM.
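The union bound step can be made quantitative, as a sketch with assumed parameters: let m be the length of Merlin’s message y, and boost Arthur’s error so that, when x ∉ L, each fixed message fools him with probability at most 2⁻ᵐ/3. Then

Pr_r[∃y: V(x, y, r) = 1] ≤ 2ᵐ · 2⁻ᵐ/3 = 1/3,

so for most r, no message of Merlin works, which is exactly the Я∀ guarantee on the right of AM’s slash.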

Overall, we can summarize what we know about these classes in this diagram (classes that are generally believed to be equal are shaded together):

Constant-round protocols are all in AM

One might think that by adding more rounds of interaction between Arthur and Merlin, we can keep extending the class of languages. But this is not the case! The reason is the intuition that we got from the previous section: it’s always better for Arthur to go before Merlin, so we can transform the protocol so that Arthur sends all of his messages first, then Merlin sends all of his messages.

For example, consider the class AMA, where Arthur goes first and last. Informally, we know that MA ⊆ AM, so it’s tempting to write AMA ⊆ AAM = AM. This is not quite how math works, but it’s fundamentally the right idea. More formally, we can write AMA as Я∃Я/Я∀Я,⁵ so we can pull the same trick where we flip ∃Я/∀Я into Я∃/Я∀, then combine quantifiers.

AMA = Я∃Я/Я∀Я ⊆[specialize] ЯЯ∃/Я∀Я =[union bound] ЯЯ∃/ЯЯ∀ =[combine] Я∃/Я∀ = AM.

What if NP ⊆ BPP?

In complexity we often think of P as the definition of “efficient computation”. But what would happen if we used BPP instead?

One important result is to show that if NP were “easy” (i.e. P = NP), then dramatic things would happen (i.e. the polynomial hierarchy would collapse). What would happen if NP ⊆ BPP?

If we had an NP oracle within BPP, then in particular, we could use it to solve Σ2P within MA: use Merlin for the first quantifier, then let Arthur use the NP oracle to handle the second quantifier:

Σ₂P = ∃∀/∀∃ ⊆[NP ⊆ BPP] ∃Я/∀Я = MA.
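To spell out the middle step, a sketch: for fixed x and y, “∀z: V(x,y,z) = 1” is a coNP-type predicate, and if NP ⊆ BPP, then also coNP ⊆ BPP (Arthur can flip the answer of a BPP machine, which he cannot do for an NP one). Letting Arthur run the assumed BPP machine for this predicate in place of the inner quantifier gives

  • if x ∈ L, then ∃y, Яr: the machine accepts (x, y) on coins r;
  • if x ∉ L, then ∀y, Яr: the machine rejects (x, y) on coins r;

which is exactly the ∃Я/∀Я pattern of MA.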

But we know MA ⊆ AM ⊆ Π₂P. So we would have Σ₂P ⊆ Π₂P and the polynomial hierarchy would collapse to the second level.

Appendix: I lied a bit

This appendix is boring, please don’t read it.

I claimed earlier that we can always replace ∀Я by Я∀, and Я/Я by Я∀/∀Я. Well, it’s a bit more complicated. The thing is, these tricks required some change to the computation that follows:

  • For the “union bound” in ∀Я ⇒ Я∀ to work, we need to boost the probability of success. This involves repeating the computation several times in parallel and taking the majority of the answers.
  • For the “squinting” trick in Я/Я ⇒ Я∀/∀Я, we need to first boost the probability of success, then run this boosted computation at several offsets d⁽¹⁾, …, d⁽ˢ⁾ and take the OR.

This means that we should be able to take majorities or ORs in the underlying computation model. For BPP, this was no problem: once x and r are chosen, the remaining computation V(x,r) is deterministic polynomial time, so repeating poly(n) times and computing the majority is no big deal. On the other hand, imagine we’re looking at AM = Я∃/Я∀. Then the computation that follows the “for most” quantifier is an NP-type computation. Can we compute the majority of some NP computations within NP itself? Given that we cannot even easily negate the result of an NP computation (unless NP = coNP), this seems worrying.

However, it does turn out to be okay, and ultimately this comes down to the fact that you can push monotone operations like ∧ and ∨ inside quantifiers ∃ and ∀:⁶ for example, (∃y₁: P(y₁)) ∧ (∃y₂: P(y₂)) ⟺ ∃y₁, y₂: P(y₁) ∧ P(y₂). Concretely, for boosting probabilities, we get

  • if x ∈ L, then Яr₁, …, rₖ: Majᵢ(1[∃y: V(x, rᵢ, y) = 1]) = 1;
  • if x ∉ L, then Яr₁, …, rₖ: Majᵢ(1[∀y: V(x, rᵢ, y) = 0]) = 1.

which is equivalent to

  • if x ∈ L, then Яr₁, …, rₖ, ∃y₁, …, yₖ: Majᵢ(V(x, rᵢ, yᵢ)) = 1;
  • if x ∉ L, then Яr₁, …, rₖ, ∀y₁, …, yₖ: Majᵢ(V(x, rᵢ, yᵢ)) = 0.
  1. All strings are understood to have length polynomial in n := |x|. 

  2. In fact, as already mentioned in a footnote in Buying randomness with quantifiers, you can always move ∃ to the inside (because you can just choose not to specialize) and ∀ to the outside (because if it always works, it cannot meaningfully specialize): whatever (monotone) quantifier Q you might come up with, ∃Q ⇒ Q∃ and Q∀ ⇒ ∀Q. But we’re getting carried away! 

  3. Actually, this is only the case if the underlying computation model allows us to boost the probability of success; see more details in the appendix. 

  4. Actually, this is only the case if the underlying computation model allows us to boost the probability of success and compute ORs of polynomially many runs; see more details in the appendix. 

  5. Actually, even the claim that AMA = Я∃Я/Я∀Я is not obvious. The protocol-based definition only promises an overall probability of success, not that Arthur “succeeds with high probability at every round”. You can (very carefully) prove it by parallel repetition, though. 

  6. In this note, we won’t need to boost the probability for a computation that itself includes Я. Ironically, for Я, it’s less easy than for ∃ and ∀, but you can indeed do it… as long as the probability of that Я is high enough. So as far as I can tell, if you want to boost the probability of some Я, you need to first boost the Яs to the right of it.