Adapted from a talk by Prasanna Ramakrishnan at Stanford’s teach-us-anything seminar.

You probably know the drill about s-t connectivity:

  • DFS/BFS gets you TISP(m, n);¹
  • Savitch’s gets you TISP(n^{O(log n)}, log^2 n).

Can you get the best of both worlds? Some believe TISP(poly(n), polylog(n)) is possible. But for now, the best we can get in polynomial time is n/2^{Ω(√(log n))} space.²

Idea

Pras claims that if you’re given one week and the hint “find something in the middle of BFS and Savitch’s”, you could come up with it. I doubt I could.

The idea is to split the problem into two parts:

  • one BFS-like part that “splits the path from s to t into n/λ segments of length λ”;
  • one Savitch-like part that can check whether two nodes are at distance λ (as long as λ is small).

The BFS-like part makes a lot of concessions in time in order to cut space. The Savitch-like part makes a lot of concessions in space in order to cut time.

BFS-like part: cut it up

BFS requires a lot of space to store all the “equidistance shells”: for each d, it stores the set V_d of all vertices at distance d from node s. One natural way to save on this is to only store one out of every λ of those shells: there must be some remainder r such that the shells V_r, V_{λ+r}, V_{2λ+r}, … contain only O(n/λ) nodes in total.
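To see why some remainder must work: the shells partition the reachable vertices into λ remainder classes, so they can’t all exceed n/λ nodes. A small Python sketch of this counting argument (it runs a full BFS, which the real algorithm cannot afford — this is only to illustrate the averaging step; the names are mine):

```python
from collections import deque

def best_remainder(adj, s, lam):
    """Return the remainder r whose shells V_r, V_{lam+r}, V_{2*lam+r}, ...
    contain the fewest reachable nodes in total.  Since the shells split
    the reachable nodes into lam classes, the winner has at most n/lam."""
    # Full BFS from s to get every distance (for illustration only).
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    # Count how many nodes each remainder class would have to memorize.
    totals = [0] * lam
    for d in dist.values():
        totals[d % lam] += 1
    return min(range(lam), key=totals.__getitem__)
```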

How do you compute V_{(i+1)λ+r} from the previous ones, though? The idea is that v ∈ V_{(i+1)λ+r} if and only if:

  • it’s at distance exactly λ from V_{iλ+r};
  • it’s not in any of the earlier shells V_0, …, V_{iλ+r−1}.

The first part is easy to check using two small-distance queries from each w ∈ V_{iλ+r} (is dist(w, v) ≤ λ? is it ≤ λ−1?). And you can check the second part simply by checking that v is not at distance < λ from the previous shells we memorized, V_r, …, V_{(i−1)λ+r}. That’s a lot of queries, but we’re fine with polynomial blow-ups in time!
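The two checks can be sketched as follows, assuming a hypothetical oracle dist_le(w, v, d) answering “is dist(w, v) ≤ d?” — the role that the Savitch-like part will play:

```python
def next_shell(vertices, shells, lam, dist_le):
    """Compute the next memorized shell from the stored ones.

    shells         : [V_r, V_{lam+r}, ..., V_{i*lam+r}] as sets of vertices
    dist_le(w,v,d) : hypothetical oracle for "dist(w, v) <= d?"
    """
    last = shells[-1]
    new = set()
    for v in vertices:
        # (1) distance from the last shell is *exactly* lam:
        #     reachable within lam steps, but not within lam - 1.
        if not any(dist_le(w, v, lam) for w in last):
            continue
        if any(dist_le(w, v, lam - 1) for w in last):
            continue
        # (2) v is not within distance < lam of an earlier memorized shell,
        #     which would mean it sits in a shell we already passed.
        if any(dist_le(w, v, lam - 1) for S in shells[:-1] for w in S):
            continue
        new.add(v)
    return new
```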

Overall, we only make a polynomial number of queries, and we need O((n log n)/λ) bits of memory to store all the shells V_r, V_{λ+r}, …. So if there were some TISP(t(n,λ), s(n,λ)) way to check whether two points are at distance at most λ, we could solve STCON in

TISP(poly(n) · t(n,λ), (n log n)/λ + s(n,λ)).

This means that to get poly-time sublinear-space connectivity, we need

  • λ ∈ ω(log n);
  • t(n,λ) ≤ poly(n);
  • s(n,λ) ∈ o(n).

Savitch-like part

Let’s figure out what time-space trade-off we can get from Savitch’s assuming s and t are at distance λ. Out of the box, it gets you n^{O(log λ)} time and O(log n · log λ) space. Since λ is superconstant, this is definitely not going to be polynomial-time, but this is already something nontrivial: for example, setting λ = (log n)^100 gets you TISP(n^{O(log log n)}, n/(log n)^99), which is incomparable with BFS and Savitch’s.
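For reference, a minimal sketch of the out-of-the-box Savitch recursion: to decide reachability within distance d, guess a midpoint and recurse on two half-distance subproblems (the names here are mine, for illustration):

```python
def reachable(adj, u, v, d):
    """Savitch's recursion: is there a path from u to v of length at most d?
    adj maps every vertex to its set of out-neighbors."""
    if u == v:
        return True
    if d <= 1:
        return d == 1 and v in adj[u]
    # Guess a midpoint; each half is a half-distance subproblem, so the
    # recursion depth is log d, and each level stores O(log n) bits.
    return any(
        reachable(adj, u, mid, (d + 1) // 2) and reachable(adj, mid, v, d // 2)
        for mid in adj
    )
```

The time blow-up comes from the `any` over all n candidate midpoints at each of the log d recursion levels, which is exactly the branching factor the next section attacks.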

But really, out-of-the-box Savitch’s cares way too much about optimizing space. We only want sublinear space (we have this annoying (n log n)/λ term anyway), so we should be able to do much better!

The trick is to reduce Savitch’s branching factor at the expense of space. Instead of choosing a specific middle vertex among all n possible vertices, we’ll choose a collection of possible middle vertices among k collections of n/k vertices each (assigned arbitrarily, say by their remainder modulo k).

The specification of one recursion of Savitch’s then becomes:

Savitch(d, i ∈ [k], j ∈ [k], x ∈ {0,1}^{n/k}): returns a vector y ∈ {0,1}^{n/k} representing which nodes from collection j are reachable within distance d from at least one of the nodes in collection i that are indicated by a 1 in x.

To solve this recursive problem, first initialize y ← 0^{n/k}, then for each possible “middle collection” l ∈ [k]:

  • make a recursive call to find out the vector z ∈ {0,1}^{n/k} of each possible node in collection l that is reachable within distance d/2 from at least one of the nodes in collection i indicated by x;
  • make a second recursive call to find out the vector w ∈ {0,1}^{n/k} of each possible node in collection j that is reachable within distance d/2 from at least one of the nodes in collection l indicated by z;
  • set y ← y ∨ w (component-wise).
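This recursion can be sketched as follows, with a hypothetical helper adj_block(a, b) returning the (n/k) × (n/k) boolean adjacency matrix between two collections (in the low-space model this would be recomputed from the input on demand rather than stored):

```python
def savitch_block(adj_block, k, m, d, i, j, x):
    """Blocked Savitch: boolean vector of the m = n/k nodes in collection j
    reachable within distance d from the nodes of collection i marked in x."""
    if d == 0:
        return list(x) if i == j else [False] * m
    if d == 1:
        block = adj_block(i, j)
        y = [any(x[a] and block[a][b] for a in range(m)) for b in range(m)]
        if i == j:  # staying put counts as "within distance 1"
            y = [y[b] or x[b] for b in range(m)]
        return y
    y = [False] * m
    for l in range(k):  # try every possible "middle collection"
        z = savitch_block(adj_block, k, m, (d + 1) // 2, i, l, x)
        w = savitch_block(adj_block, k, m, d // 2, l, j, z)
        y = [p or q for p, q in zip(y, w)]  # y <- y OR w, component-wise
    return y
```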

Overall, there are log λ recursion levels, the branching factor is k, and we need poly(n) time and O(n/k + log k) memory at each recursion level, so this gives

TISP(poly(n) · k^{O(log λ)}, (log λ) · (n/k + log k)).

Choosing parameters

Let’s ignore the +log k; it will end up being small anyway. We’re left with

TISP(poly(n) · k^{O(log λ)}, (n log n)/λ + (n log λ)/k).

The log factors in the right-hand side are probably not going to be very different, so a reasonable first step is to equalize the denominators by setting k := λ, which gives

TISP(poly(n) · λ^{O(log λ)}, (n log n)/λ).

Then, it only remains to solve for λ^{O(log λ)} ≤ poly(n), which holds as long as λ ≤ 2^{O(√(log n))}. So the best³ bound we can get for polynomial time is

TISP(poly(n), n/2^{Ω(√(log n))}).
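Spelling out that last solve, with c standing in for the constant hidden in the O(·):

```latex
\lambda^{c \log \lambda} = 2^{c \log^2 \lambda} \le \mathrm{poly}(n) = 2^{O(\log n)}
\iff \log^2 \lambda \le O(\log n)
\iff \lambda \le 2^{O(\sqrt{\log n})},
\qquad\text{so}\qquad
\frac{n \log n}{\lambda} = \frac{n \log n}{2^{\Theta(\sqrt{\log n})}} = \frac{n}{2^{\Omega(\sqrt{\log n})}}.
```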
  1. TISP(t(n), s(n)) is the class of problems solvable in time O(t(n)) and space O(s(n)) simultaneously. 

  2. Greg Barnes, Jonathan F. Buss, Walter L. Ruzzo, Baruch Schieber, “A sublinear space, polynomial time algorithm for directed s-t connectivity”. 

  3. Actually, you get to choose whatever constant you want in the Ω(·). 