Statement

Say you have some boolean-valued function $f : \{\pm 1\}^n \to \{\pm 1\}$ that is not too biased, i.e. $\mathbf{Var}[f] \geq \Omega(1)$. What can you say about its influences?

By Poincaré’s inequality, we know that the total influence $\mathbf{I}[f]$ is lower bounded by the variance, so at least one of the influences must be

$$\Omega\left(\frac{\mathbf{Var}[f]}{n}\right) \geq \Omega\left(\frac{1}{n}\right).$$
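To make this concrete, here is a brute-force check on a small example of my own choosing (majority over $n = 3$ bits): the total influence dominates the variance, so some influence is at least $\mathbf{Var}[f]/n$.

```python
from itertools import product

# Example function (my choice, for illustration): majority on n = 3 bits.
n = 3
f = lambda x: 1 if sum(x) > 0 else -1

points = list(product([-1, 1], repeat=n))
vals = [f(x) for x in points]
mean = sum(vals) / len(vals)
var = sum(v * v for v in vals) / len(vals) - mean ** 2

def influence(i):
    # Inf_i[f] = Pr_x[f(x) != f(x with coordinate i flipped)]
    flips = 0
    for x in points:
        y = list(x); y[i] = -y[i]
        if f(x) != f(tuple(y)):
            flips += 1
    return flips / len(points)

infs = [influence(i) for i in range(n)]
total = sum(infs)
assert total >= var           # Poincaré: I[f] >= Var[f]
assert max(infs) >= var / n   # so some influence is >= Var[f]/n
print(total, var, max(infs))
```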

The Kahn–Kalai–Linial theorem improves on this $\Omega(1/n)$. It shows that when the total influence $\mathbf{I}[f]$ is small, then paradoxically one of the individual influences must be large:

$$\max_i \mathbf{Inf}_i[f] \geq 2^{-O(\mathbf{I}[f]/\mathbf{Var}[f])} \geq 2^{-O(\mathbf{I}[f])}.$$

We can then use this bound when $\mathbf{I}[f] \leq o(\log n)$ and use the trivial bound $\max_i \mathbf{Inf}_i[f] \geq \Omega\left(\frac{\mathbf{I}[f]}{n}\right)$ when $\mathbf{I}[f] \geq \Omega(\log n)$ to get

$$\max_i \mathbf{Inf}_i[f] \geq \Omega\left(\frac{\log n}{n}\right).$$

Low variance

When the variance is very small, we can use the edge-isoperimetric inequality to get a better bound on $\max_i \mathbf{Inf}_i[f]$. Say that $\Pr[f=1] = \alpha < 1/2$; then $\mathbf{Var}[f] = \Theta(\alpha)$ and

$$\mathbf{I}[f] \geq 2\alpha \log(1/\alpha),$$

so

$$\max_i \mathbf{Inf}_i[f] \geq \frac{\mathbf{I}[f]}{n} = \mathbf{Var}[f] \cdot \Omega\left(\frac{\log(1/\alpha)}{n}\right) = \mathbf{Var}[f] \cdot \Omega\left(\frac{\log\frac{1}{\mathbf{Var}[f]}}{n}\right).$$
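As a quick sanity check (taking logs base 2 here, which is an assumption on my part), the AND of $n$ bits is a natural example: it is a subcube indicator, for which the edge-isoperimetric inequality holds with equality.

```python
from itertools import product
from math import log2

# Example (my choice): AND of n = 4 bits, a subcube indicator,
# which is extremal for edge-isoperimetry.
n = 4
f = lambda x: 1 if all(b == 1 for b in x) else -1
points = list(product([-1, 1], repeat=n))

alpha = sum(1 for x in points if f(x) == 1) / len(points)
infs = []
for i in range(n):
    flips = sum(1 for x in points
                if f(x) != f(x[:i] + (-x[i],) + x[i + 1:]))
    infs.append(flips / len(points))
I = sum(infs)

# For subcubes, I[f] = 2 * alpha * log2(1/alpha) exactly.
assert abs(I - 2 * alpha * log2(1 / alpha)) < 1e-9
print(alpha, I)
```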

This beats KKL when $\mathbf{Var}[f] = n^{-\omega(1)}$. Overall, the “scaled influence” (which corresponds to the average sensitivity over the smaller of the two halves $f^{-1}(1)$ and $f^{-1}(-1)$) is on the order of

$$\frac{\max_i \mathbf{Inf}_i[f]}{\mathbf{Var}[f]} = \Omega\left(\frac{\max\left(\log n, \log\frac{1}{\mathbf{Var}[f]}\right)}{n}\right).$$

This is tight for the tribes function.
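For a toy illustration, here is a brute-force computation on a tiny tribes instance (2 tribes of width 2; these parameters are my own, and real tribes takes width roughly $\log n$, so this only illustrates the symmetry, not the asymptotics).

```python
from itertools import product

# Tiny tribes instance (toy parameters of my choosing): 2 tribes of width 2,
# n = 4, in the {-1, +1} convention with +1 playing the role of "true".
n, w = 4, 2
def tribes(x):
    return 1 if any(all(b == 1 for b in x[i:i + w]) for i in range(0, n, w)) else -1

points = list(product([-1, 1], repeat=n))

def influence(i):
    flips = sum(1 for x in points
                if tribes(x) != tribes(x[:i] + (-x[i],) + x[i + 1:]))
    return flips / len(points)

infs = [influence(i) for i in range(n)]
# By symmetry every variable has the same influence, so max_i Inf_i[f]
# equals the average I[f]/n: the KKL bound is tight up to constants.
print(infs, sum(infs))
```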

Proof

We give two proofs, which only differ in the “midpoint” they pick:

  • the first looks at the stable total influence $\mathbf{I}^{(\rho)}[f]$, and is the most intuitive;
  • the second goes deeper down to the norms of the derivatives $D_i f$, and makes clear exactly where we use the fact that $f$ outputs boolean values.

By stable influence

In summary:

  • if the total influence is small, then it must be stable;
  • but small influences are unstable under noise;
  • so there must be some large influences.

Recall that the total influence is the “expected degree” (weighted by Fourier weight)

$$\mathbf{I}[f] = \sum_k k \cdot \mathbf{W}^k[f],$$

whereas the variance represents the amount of weight at positive degree

$$\mathbf{Var}[f] = \sum_{k=1}^n \mathbf{W}^k[f].$$

Intuitively, the fact that $\mathbf{I}[f]$ is not too large but $\mathbf{Var}[f]$ is significant tells us that there must be a lot of weight at low but positive degree, which should make the stable influence not too small.

Concretely, by Markov’s inequality, there can’t be more than $\frac{\mathbf{Var}[f]}{2}$ weight above degree $\frac{2\mathbf{I}[f]}{\mathbf{Var}[f]}$ (otherwise the total influence would be larger). This means that there is at least $\frac{\mathbf{Var}[f]}{2}$ weight between degrees $1$ and $\frac{2\mathbf{I}[f]}{\mathbf{Var}[f]}$, and therefore we can lower bound (say) the $1/3$-stable total influence by

$$\mathbf{I}^{(1/3)}[f] = \sum_k 3^{-k+1} k \, \mathbf{W}^k[f] \geq \left(3^{-\frac{2\mathbf{I}[f]}{\mathbf{Var}[f]}+1} \cdot \frac{2\mathbf{I}[f]}{\mathbf{Var}[f]}\right) \cdot \frac{\mathbf{Var}[f]}{2} \geq 9^{-\frac{\mathbf{I}[f]}{\mathbf{Var}[f]}} \cdot \mathbf{I}[f].$$

In particular, as long as $\mathbf{Var}[f] \geq \Omega(1)$ and $\mathbf{I}[f] \leq O(1)$, we have $\mathbf{I}^{(1/3)}[f] \geq \Omega(1)$.
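This chain of inequalities can be verified numerically; the sketch below (using majority over 5 bits as an arbitrary example of mine) computes the Fourier weights by brute force and checks the bound.

```python
from itertools import product, combinations

# Example function (my choice): majority on n = 5 bits.
n = 5
f = lambda x: 1 if sum(x) > 0 else -1
points = list(product([-1, 1], repeat=n))

def chi(x, S):
    p = 1
    for i in S:
        p *= x[i]
    return p

def fourier(S):
    # \hat{f}(S) = E[f(x) * chi_S(x)]
    return sum(f(x) * chi(x, S) for x in points) / len(points)

# W[k] = total Fourier weight at degree k
W = [sum(fourier(S) ** 2 for S in combinations(range(n), k))
     for k in range(n + 1)]

I = sum(k * W[k] for k in range(n + 1))            # total influence
var = sum(W[k] for k in range(1, n + 1))           # variance
I_13 = sum(k * (1 / 3) ** (k - 1) * W[k] for k in range(1, n + 1))

assert I_13 >= 9 ** (-I / var) * I                 # the bound above
print(I, var, I_13)
```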

On the other hand, by small-set expansion, we know that small influences are unstable:

$$\mathbf{Inf}_i^{(1/3)}[f] \leq \mathbf{Inf}_i[f]^{3/2},$$

so if every influence is small, the total stable influence must be small compared to the total influence:

$$\mathbf{I}^{(1/3)}[f] = \sum_i \mathbf{Inf}_i^{(1/3)}[f] \leq \sum_i \mathbf{Inf}_i[f]^{3/2} \leq \mathbf{I}[f] \cdot \max_i \mathbf{Inf}_i[f]^{1/2}.$$

Combining both bounds, we get

$$\max_i \mathbf{Inf}_i[f] \geq 81^{-\mathbf{I}[f]/\mathbf{Var}[f]}.$$
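Here is a small brute-force check of this final bound on a couple of example functions of my own choosing (majority and AND).

```python
from itertools import product

# Check max_i Inf_i[f] >= 81^(-I[f]/Var[f]) on small example functions
# (the functions are my own choices, not part of the proof).
def check(f, n):
    points = list(product([-1, 1], repeat=n))
    vals = {x: f(x) for x in points}
    mean = sum(vals.values()) / len(points)
    var = sum(v * v for v in vals.values()) / len(points) - mean ** 2
    infs = []
    for i in range(n):
        flips = sum(1 for x in points
                    if vals[x] != vals[x[:i] + (-x[i],) + x[i + 1:]])
        infs.append(flips / len(points))
    I = sum(infs)
    bound = 81 ** (-I / var)
    assert max(infs) >= bound
    return max(infs), bound

check(lambda x: 1 if sum(x) > 0 else -1, 5)              # majority on 5 bits
check(lambda x: 1 if all(b == 1 for b in x) else -1, 4)  # AND on 4 bits
print("ok")
```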

By derivative norms

To be completely honest, this turned out less insightful than I’d hoped.

For simplicity, here let’s assume that $\mathbf{Var}[f] \geq \Omega(1)$ and $\mathbf{I}[f] \leq O(1)$, which we already showed implies that the stable influence is large: $\mathbf{I}^{(1/3)}[f] \geq \Omega(1)$. We want to show that $\max_i \mathbf{Inf}_i[f] \geq \Omega(1)$.

The stable influences are defined as the stability of the corresponding derivative, which by hypercontractivity we can upper bound by a lower-order¹ norm:

$$\mathbf{Inf}_i^{(1/3)}[f] = \mathbf{Stab}_{1/3}[D_i f] \leq \|D_i f\|_{4/3}^2 = \mathbf{E}\left[|D_i f(x)|^{4/3}\right]^{3/2},$$

so, summing over $i$, we overall get

$$\sum_i \mathbf{E}\left[|D_i f(x)|^{4/3}\right]^{3/2} \geq \mathbf{I}^{(1/3)}[f] \geq \Omega(1).$$

On the other hand, the second-order norm of Dif gives the influence:

$$\mathbf{Inf}_i[f] = \|D_i f\|_2^2 = \mathbf{E}[|D_i f|^2],$$

so, again summing over $i$, we have

$$\sum_i \mathbf{E}\left[|D_i f(x)|^2\right]^{3/2} \leq \mathbf{I}[f] \cdot \max_i \mathbf{Inf}_i[f]^{1/2} \leq O\left(\max_i \mathbf{Inf}_i[f]^{1/2}\right).$$

So far, everything we’ve said applies equally to real-valued functions. For those, there is nothing contradictory with having $\mathbf{E}[|D_i f(x)|^2]^{3/2}$ being much smaller than $\mathbf{E}[|D_i f(x)|^{4/3}]^{3/2}$. Indeed, this would be the case if $D_i f$ often takes small (but nonzero) values, which get crushed by squaring but not crushed as much by raising to the power $4/3$. And indeed, the last section gives a function for which this is the case.

The key thing is that when $f$ has boolean outputs, then $D_i f(x)$ is always in $\{-1, 0, 1\}$, so it doesn’t matter which power you take:

$$\mathbf{E}\left[|D_i f(x)|^{4/3}\right] = \mathbf{E}\left[|D_i f(x)|^2\right] = \Pr[D_i f(x) \neq 0].$$

Therefore, you can combine the two inequalities and get

$$\max_i \mathbf{Inf}_i[f]^{1/2} \geq \Omega(1).$$
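The key identity about derivatives of boolean functions is easy to confirm by brute force; the sketch below does so for majority over 3 bits (my own choice of example).

```python
from itertools import product

# For boolean-valued f, the discrete derivative D_i f takes values in
# {-1, 0, 1}, so all its moments E|D_i f|^p agree with Pr[D_i f != 0].
# Example function (my choice): majority on n = 3 bits, coordinate i = 0.
n, i = 3, 0
f = lambda x: 1 if sum(x) > 0 else -1

def D(x):
    # D_i f(x) = (f(x with x_i = 1) - f(x with x_i = -1)) / 2
    hi = f(x[:i] + (1,) + x[i + 1:])
    lo = f(x[:i] + (-1,) + x[i + 1:])
    return (hi - lo) / 2

points = list(product([-1, 1], repeat=n))
m43 = sum(abs(D(x)) ** (4 / 3) for x in points) / len(points)
m2  = sum(abs(D(x)) ** 2 for x in points) / len(points)
pnz = sum(1 for x in points if D(x) != 0) / len(points)

assert abs(m43 - m2) < 1e-9 and abs(m2 - pnz) < 1e-9
print(m43, m2, pnz)
```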

This is false for real-valued functions

Consider $f(x) := \prod_{i=1}^n \left(1 + \frac{x_i}{\sqrt{n}}\right)$. This is the relative density of $X$ where each $X_i$ is an independent Bernoulli of mean $\frac{1 + 1/\sqrt{n}}{2}$.

It’s clear from its expression that $\hat{f}(S) = \left(\frac{1}{\sqrt{n}}\right)^{|S|}$, so the influences are

$$\mathbf{Inf}_i[f] = \sum_{S \ni i} \hat{f}(S)^2 = \sum_{k=1}^n \binom{n-1}{k-1} n^{-k} \leq \sum_{k=1}^n \frac{1}{n} \left(\frac{e}{k-1}\right)^{k-1} \leq O(1/n),$$

while the variance is

$$\mathbf{Var}[f] = \mathbf{E}[f^2] - \mathbf{E}[f]^2 = \sum_S \hat{f}(S)^2 - \hat{f}(\emptyset)^2 = \left(1 + \frac{1}{n}\right)^n - 1 \xrightarrow{n \to \infty} e - 1 \geq \Omega(1),$$

so $f$ doesn’t satisfy the KKL conclusion.
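A brute-force check of this counterexample (at the arbitrary size $n = 8$) confirms that every influence is on the order of $e/n$ while the variance stays bounded away from zero.

```python
from itertools import product
from math import sqrt

# The real-valued counterexample f(x) = prod_i (1 + x_i / sqrt(n)):
# every influence is O(1/n) yet the variance is Omega(1).
n = 8
def f(x):
    p = 1.0
    for b in x:
        p *= 1 + b / sqrt(n)
    return p

points = list(product([-1, 1], repeat=n))
vals = [f(x) for x in points]
mean = sum(vals) / len(vals)
var = sum(v * v for v in vals) / len(vals) - mean ** 2

# For real-valued f, Inf_i[f] = E[(D_i f)^2].
def influence(i):
    tot = 0.0
    for x in points:
        hi = f(x[:i] + (1,) + x[i + 1:])
        lo = f(x[:i] + (-1,) + x[i + 1:])
        tot += ((hi - lo) / 2) ** 2
    return tot / len(points)

infs = [influence(i) for i in range(n)]
assert abs(var - ((1 + 1 / n) ** n - 1)) < 1e-9  # variance ~ e - 1
print(max(infs), var)  # influences are (1/n)(1 + 1/n)^(n-1) ~ e/n
```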

Similarly, by independence, the entropy deficit $\mathbf{D}[X] = \sum_i \mathbf{D}[X_i]$ is $\Theta(1)$, while the influence-like quantities $\mathbf{D}[X_i \mid X_{[n] \setminus \{i\}}] = \mathbf{D}[X_i]$ are $\Theta(1/n)$.

  1. The stability multiplies the function with (a noisy version of) itself, so we can see it as a kind of squared $2$-norm, whereas $\mathbf{E}[(D_i f)^{4/3}]^{3/2}$ is the squared $4/3$-norm.