Ergodic chains

  • irreducible: can go from any x to any y
  • aperiodic: length of cycles from x to x have gcd = 1
    • often you can aperiodicity via lazification: adding self-loops to every node)
      • i.e. (1λ)PλI
      • this isguaranteed to not change the stationary distribution!

Fundamental theorem

Every ergodic chain PRn×n has a unique stationary distribution μ, and for any starting distribution ν,

limtνPt=μ.

Proof

What we need is a quantitative version of Data processing inequalities: P needs to be a strong contraction. For example, if

dTV(νP,νP).99 dTV(ν,ν),

then ν,νP,νP2, is Cauchy:

dTV(νPt,νPt).99min(t,t),

so it converges (and also the limit can’t differ between two starting points).

But there might not initially be a strong contraction (see example in slides).

Weak contraction

There is an optimal coupling π such that

dTV(ν,ν)=Pr(X,X)π[XX].

How the coupling evolves:

  • if Xi=Xi, use same transition;
  • else, evolve arbitrarily.

So we get Pr[Xi+1Xi+1]Pr[XiXi].

Strong contraction

If we know that P(x,y)>0 for all x,y, then in our coupling we’d have

Pr[Xi+1Xi+1](1ϵ)Pr[XiXi].

For general ergodic P, no strong contraction, but there is some t such that Pt(x,y)>0 for all x,y, so just do the same thing for Pt instead of P (and just lose a factor t).

Why?

Because if the gcd of the loops is 1, then you get loops of every long enough length.