The proof is due to Alex Samorodnitsky.¹

Let $f$ be a boolean function with $\Pr[f=1] = \alpha$. Then

$$\mathbf{I}[f] \ge 2\alpha \log(1/\alpha).$$

This is significantly better than Poincaré’s inequality, which is only linear in $\alpha$ when $\alpha$ is small. In fact, it is perfectly tight: if $f$ is an AND of $i$ variables, then $\Pr[f=1] = 2^{-i}$, and its total influence is $2i \cdot 2^{-i} = 2\alpha \log(1/\alpha)$.
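
As a sanity check on both the bound and its tightness, here is a small brute-force sketch (my own; the helper names and the choice $n=3$ are arbitrary) that verifies $\mathbf{I}[f] \ge 2\alpha \log(1/\alpha)$ over all 256 boolean functions on 3 bits and reports the smallest slack, which is attained by the subcube (AND-type) functions.

```python
# Brute-force check of I[f] >= 2*alpha*log(1/alpha) over all boolean functions on 3 bits.
from itertools import product
from math import log2

n = 3
cube = list(product([0, 1], repeat=n))

def total_influence(f):
    """I[f] = sum_i Pr_x[f(x) != f(x with bit i flipped)]."""
    return sum(
        sum(f[x] != f[x[:i] + (1 - x[i],) + x[i + 1:]] for x in cube) / len(cube)
        for i in range(n)
    )

worst_slack = float("inf")
for truth_table in product([0, 1], repeat=len(cube)):
    f = dict(zip(cube, truth_table))
    alpha = sum(truth_table) / len(cube)
    if alpha in (0, 1):
        continue  # constant functions: both sides of the inequality are 0
    slack = total_influence(f) - 2 * alpha * log2(1 / alpha)
    assert slack >= -1e-9
    worst_slack = min(worst_slack, slack)

print("smallest slack over non-constant f on 3 bits:", worst_slack)  # 0.0, from subcubes
```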

Method 1: entropy

Here, we will only show $\mathbf{I}[f] \ge H(\alpha)$, where $H(\alpha)$ is the binary entropy function. Since $H(\alpha) \ge \alpha \log(1/\alpha)$, this is only off by a factor of 2.

Intuition

Suppose you know $f(x)$, and you want to encode $x$ cheaply using this knowledge. It turns out that one optimal way to do this (which gets you the $H(\alpha)$ savings you deserve) is: for $i = 1, \ldots, n$, encode $x_i$ according to the relative probability that $x_i = 0$ and $x_i = 1$, conditioned on the fact that you already know $f(x)$ and $x_1, \ldots, x_{i-1}$.

Now, if for some $i$ you get big savings on average over all $x \in \{0,1\}^n$, this means that the proportion of $x_i = 0$ and $x_i = 1$ conditioned on $f(x)$ and $x_1, \ldots, x_{i-1}$ is usually pretty skewed to one side or the other. This can only happen if $\mathbf{Inf}_i[f]$ is fairly high.² So this justifies intuitively that you’ll be able to lower bound $\sum_i \mathbf{Inf}_i[f]$ by something on the order of $H(\alpha)$ (the total savings from knowing $f(x)$ at the start).
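
To make the bookkeeping concrete, here is a throwaway numerical sketch (the example function, the AND of the first two of three bits, and all names are my own choices): the per-coordinate savings $1 - H\bigl[X_i \bigm| f(X), X_{[i-1]}\bigr]$ sum to exactly $H(\alpha)$, and each one is dominated by the corresponding influence.

```python
# Illustrating the savings bookkeeping on a toy example, f(x) = x1 AND x2 on 3 bits
# (an arbitrary choice). The per-coordinate saving is 1 - H[X_i | f(X), X_1..X_{i-1}];
# the savings sum to H(alpha), and each saving is at most the corresponding influence.
from itertools import product
from math import log2
from collections import Counter

n = 3
cube = list(product([0, 1], repeat=n))
f = {x: x[0] & x[1] for x in cube}
alpha = sum(f.values()) / len(cube)
H_alpha = -alpha * log2(alpha) - (1 - alpha) * log2(1 - alpha)

def cond_entropy(pairs):
    """H[A | B] for the uniform distribution over the listed (a, b) pairs."""
    joint, marginal = Counter(pairs), Counter(b for _, b in pairs)
    return -sum(c / len(pairs) * log2(c / marginal[b]) for (_, b), c in joint.items())

savings, influences = [], []
for i in range(n):
    # saving on coordinate i: condition on f(x) and on the prefix x_1 .. x_{i-1}
    savings.append(1 - cond_entropy([(x[i], (f[x], x[:i])) for x in cube]))
    influences.append(
        sum(f[x] != f[x[:i] + (1 - x[i],) + x[i + 1:]] for x in cube) / len(cube)
    )

print("savings:    ", [round(s, 3) for s in savings])
print("influences: ", [round(v, 3) for v in influences])   # dominate the savings
print("total saving:", round(sum(savings), 3), "= H(alpha) =", round(H_alpha, 3))
```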

Proof

Let $X$ be a uniformly random input. Let’s consider the quantity $H[X, f(X)]$. On the one hand, $H[X, f(X)] = H[X] = n$. On the other hand,

$$\begin{aligned}
H[X, f(X)] &= H[f(X)] + H[X \mid f(X)] \\
&= H(\alpha) + \sum_{i=1}^n H\bigl[X_i \bigm| X_{[i-1]}, f(X)\bigr] \\
&\ge H(\alpha) + \sum_{i=1}^n H\bigl[X_i \bigm| X_{[n] \setminus \{i\}}, f(X)\bigr] \\
&= H(\alpha) + \sum_{i=1}^n \bigl(1 - \mathbf{Inf}_i[f]\bigr).
\end{aligned}$$

For the last equality, note that conditioned on the other $n-1$ coordinates, $f(X)$ determines $X_i$ exactly when $X$ is sensitive in the $i$th direction, and gives no information about $X_i$ otherwise, so $H\bigl[X_i \bigm| X_{[n] \setminus \{i\}}, f(X)\bigr] = 1 - \mathbf{Inf}_i[f]$. Comparing the two expressions for $H[X, f(X)]$ and rearranging gives $\mathbf{I}[f] \ge H(\alpha)$.
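
Here is a quick numerical check of that key identity $H\bigl[X_i \bigm| X_{[n] \setminus \{i\}}, f(X)\bigr] = 1 - \mathbf{Inf}_i[f]$ on an arbitrary random function (purely an illustrative sketch; the test function and helper names are my own):

```python
# Checking the last equality of the chain, H[X_i | X_{-i}, f(X)] = 1 - Inf_i[f],
# on a random function of 4 bits (an arbitrary test case).
import random
from itertools import product
from math import log2
from collections import Counter

random.seed(0)
n = 4
cube = list(product([0, 1], repeat=n))
f = {x: random.randint(0, 1) for x in cube}

def cond_entropy(pairs):
    """H[A | B] for the uniform distribution over the listed (a, b) pairs."""
    joint, marginal = Counter(pairs), Counter(b for _, b in pairs)
    return -sum(c / len(pairs) * log2(c / marginal[b]) for (_, b), c in joint.items())

for i in range(n):
    inf_i = sum(f[x] != f[x[:i] + (1 - x[i],) + x[i + 1:]] for x in cube) / len(cube)
    # condition X_i on the other n-1 coordinates together with f(X)
    h = cond_entropy([(x[i], (x[:i] + x[i + 1:], f[x])) for x in cube])
    assert abs(h - (1 - inf_i)) < 1e-9
    print(f"i = {i}:  H[X_i | X_-i, f(X)] = {h:.3f} = 1 - Inf_i[f]")
```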

Method 2: entropy deficit

The intuition of this method is basically the same, except that instead of conditioning on $f(X)$, we focus entirely on the values for which $f(x) = 1$. We will show that the influences must “account for” the entropy deficit of $\log \frac{1}{\alpha}$.

Let $Y$ be uniform over $f^{-1}(1)$. Then $D[Y] = \log \frac{1}{\alpha}$, and we can rewrite $\mathbf{Inf}_i[f]$ as

$$\mathbf{Inf}_i[f] = 2\alpha \Pr\bigl[f(Y^{\oplus i}) \ne 1\bigr] = 2\alpha\, D\bigl[Y_i \bigm| Y_{[n] \setminus \{i\}}\bigr].$$

Indeed,

  • when $x \in f^{-1}(1)$ is sensitive in the $i$th direction, then $Y_{[n] \setminus \{i\}} = x_{[n] \setminus \{i\}}$ implies $Y_i = x_i$, which means the entropy deficit “is locally 1”;
  • when $x$ is not sensitive in the $i$th direction, then conditioned on $Y_{[n] \setminus \{i\}} = x_{[n] \setminus \{i\}}$, $Y_i$ is still uniform, which means the entropy deficit “is locally 0”.
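
Both cases, and hence the identity, can be checked numerically; the sketch below (my own, with the majority of three bits as an arbitrary test function) also confirms $D[Y] = \log(1/\alpha)$.

```python
# Checking D[Y] = log(1/alpha) and Inf_i[f] = 2*alpha*D[Y_i | Y_{-i}] on a small
# test case (majority of three bits), with Y uniform over f^{-1}(1).
from itertools import product
from math import log2
from collections import Counter

n = 3
cube = list(product([0, 1], repeat=n))
f = {x: int(sum(x) >= 2) for x in cube}     # Maj_3, purely illustrative
ones = [y for y in cube if f[y] == 1]       # the support of Y
alpha = len(ones) / len(cube)

def cond_deficit(pairs):
    """D[A | B] = 1 - H[A | B] for one bit A, uniform over the listed (a, b) pairs."""
    joint, marginal = Counter(pairs), Counter(b for _, b in pairs)
    h = -sum(c / len(pairs) * log2(c / marginal[b]) for (_, b), c in joint.items())
    return 1 - h

# D[Y] = n - H[Y] = n - log2(|f^{-1}(1)|) = log(1/alpha)
assert abs((n - log2(len(ones))) - log2(1 / alpha)) < 1e-9

for i in range(n):
    inf_i = sum(f[x] != f[x[:i] + (1 - x[i],) + x[i + 1:]] for x in cube) / len(cube)
    d_i = cond_deficit([(y[i], y[:i] + y[i + 1:]) for y in ones])
    assert abs(inf_i - 2 * alpha * d_i) < 1e-9
    print(f"i = {i}:  Inf_i = {inf_i:.3f},  2*alpha*D[Y_i | Y_-i] = {2 * alpha * d_i:.3f}")
```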

This means the edge-isoperimetric inequality is equivalent to

$$\sum_{i=1}^n D\bigl[Y_i \bigm| Y_{[n] \setminus \{i\}}\bigr] \ge D[Y].$$

This is true because conditioning can only increase the entropy deficit:

$$\sum_{i=1}^n D\bigl[Y_i \bigm| Y_{[n] \setminus \{i\}}\bigr] \ge \sum_{i=1}^n D\bigl[Y_i \bigm| Y_{[i-1]}\bigr] = D[Y] \qquad \text{(chain rule)}.$$
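
As a final sanity check, here is a small sketch (my own; the OR of three bits is an arbitrary test function) verifying the deficit chain numerically.

```python
# Numerically checking the deficit chain for Y uniform over f^{-1}(1):
#   sum_i D[Y_i | Y_{-i}]  >=  sum_i D[Y_i | Y_{<i}]  =  D[Y].
# The test function f = x1 OR x2 OR x3 is an arbitrary choice.
from itertools import product
from math import log2
from collections import Counter

n = 3
cube = list(product([0, 1], repeat=n))
f = {x: int(any(x)) for x in cube}
ones = [y for y in cube if f[y] == 1]

def cond_deficit(pairs):
    """D[A | B] = 1 - H[A | B] for one bit A, uniform over the listed (a, b) pairs."""
    joint, marginal = Counter(pairs), Counter(b for _, b in pairs)
    h = -sum(c / len(pairs) * log2(c / marginal[b]) for (_, b), c in joint.items())
    return 1 - h

D_Y = n - log2(len(ones))                                                  # = log(1/alpha)
all_but_i = sum(cond_deficit([(y[i], y[:i] + y[i + 1:]) for y in ones]) for i in range(n))
prefixes = sum(cond_deficit([(y[i], y[:i]) for y in ones]) for i in range(n))

assert all_but_i >= prefixes - 1e-9 and abs(prefixes - D_Y) < 1e-9
print(f"{all_but_i:.3f} >= {prefixes:.3f} = D[Y] = {D_Y:.3f}")
```
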
  1. It first appeared in “Lower bounds for linear locally decodable codes and private information retrieval” by Oded Goldreich, Howard Karloff, Leonard J. Schulman and Luca Trevisan. 

  2. Note that this is basically the same argument as in the “janky proof” of Poincaré’s inequality.