$\require{mathtools}
%
%%% GENERIC MATH %%%
%
% Environments
\newcommand{\al}[1]{\begin{align}#1\end{align}} % need this for \tag{} to work
\renewcommand{\r}{\mathrm}
%
% Greek
\newcommand{\eps}{\epsilon} \newcommand{\veps}{\varepsilon}
\let\fi\phi % because it looks like an f
\let\phi\varphi % because it looks like a p
%
% Miscellaneous shortcuts
% .. over and under
\newcommand{\ss}[1]{_{\substack{#1}}}
\newcommand{\ob}{\overbrace} \newcommand{\ub}{\underbrace}
\newcommand{\ol}{\overline} \newcommand{\tld}{\widetilde} \newcommand{\HAT}{\widehat}
\newcommand{\f}{\frac} \newcommand{\s}[2]{#1 /\mathopen{}#2} \newcommand{\rt}{\sqrt}
% .. relations
\newcommand{\sr}{\stackrel} \newcommand{\sse}{\subseteq}
\newcommand{\ce}{\coloneqq} \newcommand{\ec}{\eqqcolon}
\newcommand{\ap}{\approx} \newcommand{\ls}{\lesssim} \newcommand{\gs}{\gtrsim}
% .. misc
\newcommand{\q}{\quad} \newcommand{\qq}{\qquad} \newcommand{\heart}{\heartsuit}
%
% Delimiters
% (I needed to create my own because the MathJax version of \DeclarePairedDelimiter doesn't have \mathopen{} and that messes up the spacing)
% .. one-part
\newcommand{\p}[1]{\mathopen{}\left( #1 \right)}
\newcommand{\b}[1]{\mathopen{}\left[ #1 \right]}
\newcommand{\set}[1]{\mathopen{}\left\{ #1 \right\}}
\newcommand{\abs}[1]{\mathopen{}\left\lvert #1 \right\rvert}
\newcommand{\floor}[1]{\mathopen{}\left\lfloor #1 \right\rfloor}
\newcommand{\ceil}[1]{\mathopen{}\left\lceil #1 \right\rceil}
\newcommand{\inner}[1]{\mathopen{}\left\langle #1 \right\rangle}
% .... (use phantom to force at least the standard height of double bars)
\newcommand{\norm}[1]{\mathopen{}\left\lVert #1 \vphantom{f} \right\rVert}
\newcommand{\frob}[1]{\norm{#1}_\mathrm{F}}
% .. two-part
\newcommand{\incond}[2]{#1 \mathop{}\middle|\mathop{} #2}
\newcommand{\cond}[2]{ {\left.\incond{#1}{#2}\right.}}
\newcommand{\pco}[2]{\p{\incond{#1}{#2}}}
\newcommand{\bco}[2]{\b{\incond{#1}{#2}}}
\newcommand{\setco}[2]{\set{\incond{#1}{#2}}}
\newcommand{\at}[2]{\left.#1\right|_{#2}}
% ..... (use phantom to force at least the standard height of double bar)
\newcommand{\oldpara}[2]{#1\vphantom{f} \mathop{}\middle\|\mathop{} #2}
%\newcommand{\para}[2]{#1\vphantom{f} \mathop{}\middle\|\mathop{} #2}
\newcommand{\para}[2]{\mathchoice{\begin{matrix}#1\\\hdashline#2\end{matrix}}{\begin{smallmatrix}#1\\\hdashline#2\end{smallmatrix}}{\begin{smallmatrix}#1\\\hdashline#2\end{smallmatrix}}{\begin{smallmatrix}#1\\\hdashline#2\end{smallmatrix}}}
\newcommand{\ppa}[2]{\p{\para{#1}{#2}}}
\newcommand{\bpa}[2]{\b{\para{#1}{#2}}}
%\newcommand{\bpaco}[4]{\bpa{\incond{#1}{#2}}{\incond{#3}{#4}}}
\newcommand{\bpaco}[4]{\bpa{\cond{#1}{#2}}{\cond{#3}{#4}}}
%
% Levels of closeness
\newcommand{\scirc}[1]{\sr{\circ}{#1}}
\newcommand{\sdot}[1]{\sr{.}{#1}}
\newcommand{\slog}[1]{\sr{\log}{#1}}
\newcommand{\createClosenessLevels}[7]{ \newcommand{#2}{\mathrel{(#1)}} \newcommand{#3}{\mathrel{#1}} \newcommand{#4}{\mathrel{#1\!\!#1}} \newcommand{#5}{\mathrel{#1\!\!#1\!\!#1}} \newcommand{#6}{\mathrel{(\sdot{#1})}} \newcommand{#7}{\mathrel{(\slog{#1})}} }
\let\lt\undefined \let\gt\undefined
% .. vanilla versions (is it within a constant?)
\newcommand{\ez}{\scirc=} \newcommand{\eq}{\simeq} \newcommand{\eqq}{\mathrel{\eq\!\!\eq}} \newcommand{\eqqq}{\mathrel{\eq\!\!\eq\!\!\eq}}
\newcommand{\lez}{\scirc\le} \newcommand{\lq}{\preceq} \newcommand{\lqq}{\mathrel{\lq\!\!\lq}} \newcommand{\lqqq}{\mathrel{\lq\!\!\lq\!\!\lq}}
\newcommand{\gez}{\scirc\ge} \newcommand{\gq}{\succeq} \newcommand{\gqq}{\mathrel{\gq\!\!\gq}} \newcommand{\gqqq}{\mathrel{\gq\!\!\gq\!\!\gq}}
\newcommand{\lz}{\scirc<} \newcommand{\lt}{\prec} \newcommand{\ltt}{\mathrel{\lt\!\!\lt}} \newcommand{\lttt}{\mathrel{\lt\!\!\lt\!\!\lt}}
\newcommand{\gz}{\scirc>} \newcommand{\gt}{\succ} \newcommand{\gtt}{\mathrel{\gt\!\!\gt}} \newcommand{\gttt}{\mathrel{\gt\!\!\gt\!\!\gt}}
% .. dotted versions (is it equal in the limit?)
\newcommand{\ed}{\sdot=} \newcommand{\eqd}{\sdot\eq} \newcommand{\eqqd}{\sdot\eqq} \newcommand{\eqqqd}{\sdot\eqqq}
\newcommand{\led}{\sdot\le} \newcommand{\lqd}{\sdot\lq} \newcommand{\lqqd}{\sdot\lqq} \newcommand{\lqqqd}{\sdot\lqqq}
\newcommand{\ged}{\sdot\ge} \newcommand{\gqd}{\sdot\gq} \newcommand{\gqqd}{\sdot\gqq} \newcommand{\gqqqd}{\sdot\gqqq}
\newcommand{\ld}{\sdot<} \newcommand{\ltd}{\sdot\lt} \newcommand{\lttd}{\sdot\ltt} \newcommand{\ltttd}{\sdot\lttt}
\newcommand{\gd}{\sdot>} \newcommand{\gtd}{\sdot\gt} \newcommand{\gttd}{\sdot\gtt} \newcommand{\gtttd}{\sdot\gttt}
% .. log versions (is it equal up to log?)
\newcommand{\elog}{\slog=} \newcommand{\eqlog}{\slog\eq} \newcommand{\eqqlog}{\slog\eqq} \newcommand{\eqqqlog}{\slog\eqqq}
\newcommand{\lelog}{\slog\le} \newcommand{\lqlog}{\slog\lq} \newcommand{\lqqlog}{\slog\lqq} \newcommand{\lqqqlog}{\slog\lqqq}
\newcommand{\gelog}{\slog\ge} \newcommand{\gqlog}{\slog\gq} \newcommand{\gqqlog}{\slog\gqq} \newcommand{\gqqqlog}{\slog\gqqq}
\newcommand{\llog}{\slog<} \newcommand{\ltlog}{\slog\lt} \newcommand{\lttlog}{\slog\ltt} \newcommand{\ltttlog}{\slog\lttt}
\newcommand{\glog}{\slog>} \newcommand{\gtlog}{\slog\gt} \newcommand{\gttlog}{\slog\gtt} \newcommand{\gtttlog}{\slog\gttt}
%
% Miscellaneous
\newcommand{\LHS}{\mathrm{LHS}} \newcommand{\RHS}{\mathrm{RHS}}
% .. operators
\DeclareMathOperator{\poly}{poly} \DeclareMathOperator{\polylog}{polylog} \DeclareMathOperator{\quasipoly}{quasipoly} \DeclareMathOperator{\negl}{negl}
\DeclareMathOperator*{\argmin}{arg\,min} \DeclareMathOperator*{\argmax}{arg\,max}
% .. functions
\DeclareMathOperator{\id}{id} \DeclareMathOperator{\sign}{sign} \DeclareMathOperator{\err}{err} \DeclareMathOperator{\ReLU}{ReLU}
% .. analysis
\let\d\undefined
\newcommand{\d}{\operatorname{d}\mathopen{}}
\newcommand{\df}[2]{\f{\d #1}{\d #2}} \newcommand{\ds}[2]{\s{\d #1}{\d #2}}
\newcommand{\part}{\partial}
\newcommand{\partf}[2]{\f{\part #1}{\part #2}} \newcommand{\parts}[2]{\s{\part #1}{\part #2}}
\newcommand{\grad}[1]{\mathop{\nabla\!_{#1}}}
% .. sets
\newcommand{\N}{\mathbb{N}} \newcommand{\Z}{\mathbb{Z}} \newcommand{\R}{\mathbb{R}} \newcommand{\C}{\mathbb{C}} \newcommand{\F}{\mathbb{F}}
\newcommand{\zo}{\set{0,1}} \newcommand{\pmo}{\set{\pm 1}}
%
%%% SPECIALIZED MATH %%%
%
% Logic
\renewcommand{\and}{\wedge} \newcommand{\AND}{\bigwedge}
\newcommand{\or}{\vee} \newcommand{\OR}{\bigvee}
\newcommand{\xor}{\oplus} \newcommand{\XOR}{\bigoplus}
\newcommand{\union}{\cup} \newcommand{\inter}{\cap}
\newcommand{\UNION}{\bigcup} \newcommand{\INTER}{\bigcap}
\newcommand{\comp}{\overline}
\newcommand{\true}{\r{true}} \newcommand{\false}{\r{false}} \newcommand{\tf}{\set{\true,\false}}
\DeclareMathOperator{\One}{\mathbb{1}} \DeclareMathOperator{\1}{\mathbb{1}}
%
% Linear algebra
\renewcommand{\span}{\mathrm{span}}
\DeclareMathOperator{\rank}{rank} \DeclareMathOperator{\proj}{proj} \DeclareMathOperator{\dom}{dom} \DeclareMathOperator{\Img}{Im}
\newcommand{\transp}{\mathsf{T}} \renewcommand{\t}{^\transp}
% ... named tensors
\newcommand{\namedtensorstrut}{\vphantom{fg}} % milder than \mathstrut
\newcommand{\name}[1]{\mathsf{\namedtensorstrut #1}}
\newcommand{\nbin}[2]{\mathbin{\underset{\substack{#1}}{\namedtensorstrut #2}}}
\newcommand{\ndot}[1]{\nbin{#1}{\odot}} \newcommand{\ncat}[1]{\nbin{#1}{\oplus}}
\newcommand{\nsum}[1]{\sum\limits_{\substack{#1}}}
\newcommand{\nfun}[2]{\mathop{\underset{\substack{#1}}{\namedtensorstrut\mathrm{#2}}}}
\newcommand{\ndef}[2]{\newcommand{#1}{\name{#2}}}
\newcommand{\nt}[1]{^{\transp(#1)}}
%
% Probability
\newcommand{\Normal}{\mathcal{N}}
\let\Pr\undefined
\DeclareMathOperator*{\Pr}{Pr} \DeclareMathOperator*{\G}{\mathbb{G}} \DeclareMathOperator*{\Odds}{Od} \DeclareMathOperator*{\E}{E}
\DeclareMathOperator*{\Var}{Var} \DeclareMathOperator*{\Cov}{Cov} \DeclareMathOperator*{\corr}{corr} \DeclareMathOperator*{\median}{median}
\newcommand{\dTV}{d_{\mathrm{TV}}} \newcommand{\dHel}{d_{\mathrm{Hel}}} \newcommand{\dJS}{d_{\mathrm{JS}}}
% ... information theory
\let\H\undefined
\DeclareMathOperator*{\H}{H} \DeclareMathOperator*{\I}{I} \DeclareMathOperator*{\D}{D}
%
%%% SPECIALIZED COMPUTER SCIENCE %%%
%
% Complexity classes
% .. classical
\newcommand{\Poly}{\mathsf{P}} \newcommand{\NP}{\mathsf{NP}} \newcommand{\PH}{\mathsf{PH}} \newcommand{\PSPACE}{\mathsf{PSPACE}} \renewcommand{\L}{\mathsf{L}}
% .. probabilistic
\newcommand{\formost}{\mathsf{Я}}
\newcommand{\RP}{\mathsf{RP}} \newcommand{\BPP}{\mathsf{BPP}} \newcommand{\MA}{\mathsf{MA}} \newcommand{\AM}{\mathsf{AM}} \newcommand{\IP}{\mathsf{IP}} \newcommand{\RL}{\mathsf{RL}}
% .. circuits
\newcommand{\NC}{\mathsf{NC}} \newcommand{\AC}{\mathsf{AC}} \newcommand{\ACC}{\mathsf{ACC}} \newcommand{\TC}{\mathsf{TC}}
\newcommand{\Ppoly}{\mathsf{P}/\poly} \newcommand{\Lpoly}{\mathsf{L}/\poly}
% .. resources
\newcommand{\TIME}{\mathsf{TIME}} \newcommand{\SPACE}{\mathsf{SPACE}} \newcommand{\TISP}{\mathsf{TISP}} \newcommand{\SIZE}{\mathsf{SIZE}}
% .. keywords
\newcommand{\co}{\mathsf{co}} \newcommand{\Prom}{\mathsf{Promise}}
%
% Boolean analysis
\newcommand{\harpoon}{\!\upharpoonright\!}
\newcommand{\rr}[2]{#1\harpoon_{#2}}
\newcommand{\Fou}[1]{\widehat{#1}}
\DeclareMathOperator{\Ind}{\mathrm{Ind}} \DeclareMathOperator{\Inf}{\mathrm{Inf}} \DeclareMathOperator{\Der}{\mathrm{D}} \DeclareMathOperator{\Stab}{\mathrm{Stab}} \DeclareMathOperator{\T}{T}
\DeclareMathOperator{\sens}{\mathrm{s}} \DeclareMathOperator{\bsens}{\mathrm{bs}} \DeclareMathOperator{\fbsens}{\mathrm{fbs}} \DeclareMathOperator{\Cert}{\mathrm{C}}
\DeclareMathOperator{\DT}{\mathrm{DT}}
\DeclareMathOperator{\CDT}{\mathrm{CDT}} % canonical
\DeclareMathOperator{\ECDT}{\mathrm{ECDT}}
\DeclareMathOperator{\CDTv}{\mathrm{CDT_{vars}}} \DeclareMathOperator{\ECDTv}{\mathrm{ECDT_{vars}}}
\DeclareMathOperator{\CDTt}{\mathrm{CDT_{terms}}} \DeclareMathOperator{\ECDTt}{\mathrm{ECDT_{terms}}}
\DeclareMathOperator{\CDTw}{\mathrm{CDT_{weighted}}} \DeclareMathOperator{\ECDTw}{\mathrm{ECDT_{weighted}}}
\DeclareMathOperator{\AvgDT}{\mathrm{AvgDT}}
\DeclareMathOperator{\PDT}{\mathrm{PDT}} % partial decision tree
\DeclareMathOperator{\DTsize}{\mathrm{DT_{size}}}
\DeclareMathOperator{\W}{\mathbf{W}}
% .. functions (small caps sadly doesn't work)
\DeclareMathOperator{\Par}{\mathrm{Par}} \DeclareMathOperator{\Maj}{\mathrm{Maj}} \DeclareMathOperator{\HW}{\mathrm{HW}} \DeclareMathOperator{\Th}{\mathrm{Th}}
\DeclareMathOperator{\Tribes}{\mathrm{Tribes}} \DeclareMathOperator{\RotTribes}{\mathrm{RotTribes}}
\DeclareMathOperator{\CycleRun}{\mathrm{CycleRun}}
\DeclareMathOperator{\SAT}{\mathrm{SAT}} \DeclareMathOperator{\UniqueSAT}{\mathrm{UniqueSAT}}
%
% Dynamic optimality
\newcommand{\OPT}{\mathsf{OPT}} \newcommand{\Alt}{\mathsf{Alt}} \newcommand{\Funnel}{\mathsf{Funnel}}
%
% Alignment
\DeclareMathOperator{\Amp}{\mathrm{Amp}}
%
%%% TYPESETTING %%%
%
% In text
\renewcommand{\th}{^{\mathrm{th}}}
\newcommand{\degree}{^\circ}
%
% Fonts
% .. bold
\newcommand{\BA}{\boldsymbol{A}} \newcommand{\BB}{\boldsymbol{B}} \newcommand{\BC}{\boldsymbol{C}} \newcommand{\BD}{\boldsymbol{D}} \newcommand{\BE}{\boldsymbol{E}} \newcommand{\BF}{\boldsymbol{F}} \newcommand{\BG}{\boldsymbol{G}}
\newcommand{\BH}{\boldsymbol{H}} \newcommand{\BI}{\boldsymbol{I}} \newcommand{\BJ}{\boldsymbol{J}} \newcommand{\BK}{\boldsymbol{K}} \newcommand{\BL}{\boldsymbol{L}} \newcommand{\BM}{\boldsymbol{M}} \newcommand{\BN}{\boldsymbol{N}}
\newcommand{\BO}{\boldsymbol{O}} \newcommand{\BP}{\boldsymbol{P}} \newcommand{\BQ}{\boldsymbol{Q}} \newcommand{\BR}{\boldsymbol{R}} \newcommand{\BS}{\boldsymbol{S}} \newcommand{\BT}{\boldsymbol{T}} \newcommand{\BU}{\boldsymbol{U}}
\newcommand{\BV}{\boldsymbol{V}} \newcommand{\BW}{\boldsymbol{W}} \newcommand{\BX}{\boldsymbol{X}} \newcommand{\BY}{\boldsymbol{Y}} \newcommand{\BZ}{\boldsymbol{Z}}
\newcommand{\Ba}{\boldsymbol{a}} \newcommand{\Bb}{\boldsymbol{b}} \newcommand{\Bc}{\boldsymbol{c}} \newcommand{\Bd}{\boldsymbol{d}} \newcommand{\Be}{\boldsymbol{e}} \newcommand{\Bf}{\boldsymbol{f}} \newcommand{\Bg}{\boldsymbol{g}}
\newcommand{\Bh}{\boldsymbol{h}} \newcommand{\Bi}{\boldsymbol{i}} \newcommand{\Bj}{\boldsymbol{j}} \newcommand{\Bk}{\boldsymbol{k}}
\newcommand{\Bp}{\boldsymbol{p}} \newcommand{\Bq}{\boldsymbol{q}} \newcommand{\Br}{\boldsymbol{r}} \newcommand{\Bs}{\boldsymbol{s}} \newcommand{\Bt}{\boldsymbol{t}} \newcommand{\Bu}{\boldsymbol{u}} \newcommand{\Bv}{\boldsymbol{v}}
\newcommand{\Bw}{\boldsymbol{w}} \newcommand{\Bx}{\boldsymbol{x}} \newcommand{\By}{\boldsymbol{y}} \newcommand{\Bz}{\boldsymbol{z}}
\newcommand{\Balpha}{\boldsymbol{\alpha}} \newcommand{\Bbeta}{\boldsymbol{\beta}} \newcommand{\Bgamma}{\boldsymbol{\gamma}} \newcommand{\Bdelta}{\boldsymbol{\delta}} \newcommand{\Beps}{\boldsymbol{\eps}} \newcommand{\Bveps}{\boldsymbol{\veps}}
\newcommand{\Bzeta}{\boldsymbol{\zeta}} \newcommand{\Beta}{\boldsymbol{\eta}} \newcommand{\Btheta}{\boldsymbol{\theta}} \newcommand{\Biota}{\boldsymbol{\iota}} \newcommand{\Bkappa}{\boldsymbol{\kappa}} \newcommand{\Blambda}{\boldsymbol{\lambda}}
\newcommand{\Bmu}{\boldsymbol{\mu}} \newcommand{\Bnu}{\boldsymbol{\nu}} \newcommand{\Bxi}{\boldsymbol{\xi}} \newcommand{\Bomicron}{\boldsymbol{\omicron}} \newcommand{\Bpi}{\boldsymbol{\pi}} \newcommand{\Brho}{\boldsymbol{\rho}}
\newcommand{\Bsigma}{\boldsymbol{\sigma}} \newcommand{\Btau}{\boldsymbol{\tau}} \newcommand{\Bupsilon}{\boldsymbol{\upsilon}} \newcommand{\Bphi}{\boldsymbol{\phi}} \newcommand{\Bfi}{\boldsymbol{\fi}} \newcommand{\Bchi}{\boldsymbol{\chi}}
\newcommand{\Bpsi}{\boldsymbol{\psi}} \newcommand{\Bomega}{\boldsymbol{\omega}}
% .. calligraphic
\newcommand{\CA}{\mathcal{A}} \newcommand{\CB}{\mathcal{B}} \newcommand{\CC}{\mathcal{C}} \newcommand{\CD}{\mathcal{D}} \newcommand{\CE}{\mathcal{E}} \newcommand{\CF}{\mathcal{F}} \newcommand{\CG}{\mathcal{G}}
\newcommand{\CH}{\mathcal{H}} \newcommand{\CI}{\mathcal{I}} \newcommand{\CJ}{\mathcal{J}} \newcommand{\CK}{\mathcal{K}} \newcommand{\CL}{\mathcal{L}} \newcommand{\CM}{\mathcal{M}} \newcommand{\CN}{\mathcal{N}}
\newcommand{\CO}{\mathcal{O}} \newcommand{\CP}{\mathcal{P}} \newcommand{\CQ}{\mathcal{Q}} \newcommand{\CR}{\mathcal{R}} \newcommand{\CS}{\mathcal{S}} \newcommand{\CT}{\mathcal{T}} \newcommand{\CU}{\mathcal{U}}
\newcommand{\CV}{\mathcal{V}} \newcommand{\CW}{\mathcal{W}} \newcommand{\CX}{\mathcal{X}} \newcommand{\CY}{\mathcal{Y}} \newcommand{\CZ}{\mathcal{Z}}
% .. typewriter
\newcommand{\TA}{\mathtt{A}} \newcommand{\TB}{\mathtt{B}} \newcommand{\TC}{\mathtt{C}} \newcommand{\TD}{\mathtt{D}} \newcommand{\TE}{\mathtt{E}} \newcommand{\TF}{\mathtt{F}} \newcommand{\TG}{\mathtt{G}}
\newcommand{\TH}{\mathtt{H}} \newcommand{\TI}{\mathtt{I}} \newcommand{\TJ}{\mathtt{J}} \newcommand{\TK}{\mathtt{K}} \newcommand{\TL}{\mathtt{L}} \newcommand{\TM}{\mathtt{M}} \newcommand{\TN}{\mathtt{N}}
\newcommand{\TO}{\mathtt{O}} \newcommand{\TP}{\mathtt{P}} \newcommand{\TQ}{\mathtt{Q}} \newcommand{\TR}{\mathtt{R}} \newcommand{\TS}{\mathtt{S}} \newcommand{\TT}{\mathtt{T}} \newcommand{\TU}{\mathtt{U}}
\newcommand{\TV}{\mathtt{V}} \newcommand{\TW}{\mathtt{W}} \newcommand{\TX}{\mathtt{X}} \newcommand{\TY}{\mathtt{Y}} \newcommand{\TZ}{\mathtt{Z}}$

Adapted from the 2016 PUMaC power round (a team-based math competition at Princeton), written by Eric Neyman.

This note shows how to get pseudorandom generators from one-way permutations.

The main players

What we want: pseudorandom generators

A pseudorandom generator is a way to turn a small number of uniformly random bits into a larger number of bits which look, for all intents and purposes, like they’re uniformly random, where “all intents and purposes” means “to polynomial-time algorithms trying to tell the difference”. More precisely, $g : \zo^n \to \zo^{l}$ is a pseudorandom generator if $l > n$[1] and:

  • $g$ is computable in polynomial time;
  • $g(x)$ is computationally indistinguishable from uniform: that is, for all PPT algorithms $A$,
\[ |\Pr_{x \sim \zo^n}[A(g(x)) = 1] - \Pr_{y \sim \zo^l}[A(y) = 1]| \leq \negl(n), \]

where $\negl(n)$ means $n^{-\omega(1)}$ (smaller than any polynomial).

Pseudorandom generators are great! They can take any application that normally requires a large number of random bits and make it require far fewer. One such application is private-key encryption. If you XOR a message with a uniformly random string of the same length, the result will be uniformly random. This means that, as long as you’re willing to exchange encryption keys that have the same length as your messages, you can get perfect security: this is the “one-time pad”. Pseudorandom generators allow you to drastically reduce the size of the encryption key by turning short truly-random keys into very long pseudorandom “keys” that work just as well as uniform against PPT adversaries.
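To make the interface concrete, here is a minimal Python sketch of one-time-pad-style encryption with a stretched key. The `prg` below is a hash-based stand-in I’m making up purely for illustration; it is not the construction of this note and no security claim is intended. It just shows the shape: short truly-random seed in, long pseudorandom stream out.

```python
import hashlib

def prg(seed: bytes, out_len: int) -> bytes:
    """Illustrative stand-in for a PRG: stretch a short seed to out_len bytes
    by hashing a counter. (NOT the construction from this note.)"""
    out = b""
    counter = 0
    while len(out) < out_len:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:out_len]

def xor_encrypt(key_stream: bytes, message: bytes) -> bytes:
    """One-time-pad-style encryption: XOR the message with the key stream.
    Decryption is the exact same operation."""
    return bytes(k ^ m for k, m in zip(key_stream, message))

seed = b"16 random bytes!"                 # short truly-random shared key
msg = b"a much longer message than the seed"
stream = prg(seed, len(msg))               # stretched "key"
ct = xor_encrypt(stream, msg)
assert xor_encrypt(stream, ct) == msg      # XOR twice recovers the message
```

The point of the whole note is that, if `prg` were built from a one-way permutation as below, `ct` would be indistinguishable (to PPT adversaries) from a true one-time-pad ciphertext.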

What we have: one-way permutations

Suppose you have a one-way permutation: a bijective function $f : \zo^n \to \zo^n$ such that computing $f$ is easy, but computing its inverse $f^{-1}$ is hard on average.

Exponentiation modulo $p$ is a good candidate: fix some prime $p$ and some generator $g \in \Z_p^*$,[2] and let $f(x) \coloneqq g^x$. This is a bijection between $\set{0, \ldots, p-2}$ (the set of exponents) and $\Z_p^*$ (the set of non-zero remainders), not quite $\zo^n$ and $\zo^n$, but ehh close enough.[3] More importantly:

  • $f$ is computable in polynomial time: if you know $x$, you can compute $g^x$ (using repeated squaring);
  • $f^{-1}$ seems hard to compute: given $g^x$, it’s not clear how to compute $x$.

The discrete log hypothesis states that no PPT algorithm $A$ has a significant chance to compute $x$ from $g^x$ on average. That is, for all PPT $A$, $\Pr_x[A(g^x) = x] \leq \negl(n)$ (here $n$ is the input length, which is $\Theta(\log p)$).
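For concreteness, here is the candidate in Python; the built-in three-argument `pow` implements modular exponentiation by repeated squaring. The specific values are illustrative choices ($7$ is a primitive root modulo the Mersenne prime $2^{31}-1$):

```python
p = 2**31 - 1   # a Mersenne prime (illustrative; real uses take p much larger)
g = 7           # a primitive root mod p, so x -> g^x mod p hits all of Z_p^*

def f(x: int) -> int:
    # Easy direction: repeated squaring, O(log p) multiplications mod p.
    return pow(g, x, p)

y = f(123_456_789)
# Hard direction: recovering x from y is the discrete log problem; no algorithm
# polynomial in log p is known, and brute force scans up to p - 1 exponents.
```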

The middle man: hardcore bits

Hardcore bits

Suppose you had a string $x$ of $n$ truly random bits, and wanted to (deterministically) turn it into a string of $n+1$ bits that still looks random. Perhaps the most natural way to try this is to first write out the $n$ bits of $x$ as is (so at least we know that those look random), then add some additional bit $b$ that is “hard to guess” given $x$: $g(x) \coloneqq x \circ b(x)$.

The issue, though, is that the only way to make $b(x)$ “hard to guess” given $x$ is to make $b$ a hard function. But $b$ is a part of $g$, and we want $g$ to be easy! So that’s no good for us.

The reason why the above can’t work is that we’re giving $x$ away in $g(x)$: any opponent who knows $g(x)$ would then be on equal footing with us in terms of computing $b(x)$. So instead, let’s “hide” $x$ using our one-way permutation $f$: let $g(x) \coloneqq f(x) \circ b(x)$. Since $f$ is a permutation, the first $n$ bits are still uniformly random. And now, it’s hard for the adversary to figure out what $x$ is, so we can at least hope to find some $b$ such that $b(x)$ is easy to compute from $x$ but hard to compute if all you know is $f(x)$.

This leads us to the notion of a “hardcore bit”. A hardcore bit for a one-way permutation $f$ is a function $b: \zo^n \to \zo$ such that no PPT algorithm does significantly better than random chance at guessing $b(x)$ given $f(x)$ on average: for all PPT $A$, $\Pr_x[A(f(x)) = b(x)] = 1/2 \pm \negl(n)$.

Hardcore bits are good enough

We claim that a hardcore bit is all we need to get pseudorandom generators: that is, if $f$ is a one-way permutation and $b$ is a hardcore bit for $f$, then the resulting $g(x) \coloneqq f(x) \circ b(x)$ is a pseudorandom generator.

This will work via a hybrid argument. Denote by $U_n$ the uniform distribution on $n$ bits, and consider the following three distributions:

\[f(x) \circ b(x) \quad f(x) \circ U_1 \quad U_{n+1}.\]

We claim that the first two are computationally indistinguishable, and so are the last two. Therefore $f(x) \circ b(x)$ is indistinguishable from uniform, as desired.
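Concretely, the “therefore” is just the triangle inequality applied to distinguishing advantages: for any PPT algorithm $A$,

\[\abs{\Pr_x[A(f(x) \circ b(x)) = 1] - \Pr_y[A(y) = 1]} \leq \abs{\Pr_x[A(f(x) \circ b(x)) = 1] - \Pr_{x,b'}[A(f(x) \circ b') = 1]} + \abs{\Pr_{x,b'}[A(f(x) \circ b') = 1] - \Pr_y[A(y) = 1]},\]

so if both terms on the right-hand side are $\negl(n)$, then so is $A$’s advantage at telling $g(x) = f(x) \circ b(x)$ from uniform.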

First, let’s show that $f(x) \circ b(x)$ is indistinguishable from $f(x) \circ U_1$. To justify this, we claim that if there were some algorithm $A$ that distinguishes between $f(x) \circ b(x)$ and $f(x) \circ U_1$, then we can get some algorithm $A'$ which guesses $b(x)$ from $f(x)$ with better than random chance (contradicting the hardcore-ness of $b$). This is somewhat believable: for example, if we have some $A$ such that $\Pr_x[A(f(x), b(x))=1]$ is significantly bigger than $\Pr_{x,b'}[A(f(x), b')=1]$, then it seems like either $A(f(x),1)$ or $\neg A(f(x),0)$ should be significantly correlated with $b(x)$. We prove this formally in Distinguish from random then guess.

Second, we need to show that $f(x) \circ U_1$ is indistinguishable from $U_{n+1}$. Actually, those two distributions are identical: since $f$ is a permutation and $x$ is uniform, $f(x)$ must be uniform too, so $f(x) \circ U_1 = U_n \circ U_1 = U_{n+1}$.

Hardcore bits exist

Looking for something unguessable

Suppose you have some one-way permutation $f$. What could be hard to guess about $x$ given $f(x)$?

One very reasonable proposal would be that any bit of $x$ is hard to guess. For example, we could try out $b(x) \coloneqq x_1$. While this might work for some permutations $f$, there is nothing in the definition of one-way permutation that forces input bits to be hardcore. Indeed, even for our candidate $f(x) \coloneqq g^x \pmod{p}$, the parity of $x$ is actually easy to guess given $g^x$: just check whether $(g^x)^{(p-1)/2} \equiv 1 \pmod{p}$. If this is the case, then $x$ is even; otherwise it’s odd. And in general, one can easily fabricate one-way permutations where the $i\th$ bit is easy to guess: just take some one-way permutation $f$ on $n-1$ bits and let $f'(x) \coloneqq f(x_{[n]\setminus \set{i}}) \circ x_i$.
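The parity leak described above is easy to check numerically. A small sketch with a toy prime $p = 23$ and generator $g = 5$ (Euler’s criterion: $g^x$ is a quadratic residue mod $p$ exactly when $x$ is even):

```python
p, g = 23, 5        # toy prime and a generator of Z_23^*

def exponent_parity(gx: int) -> int:
    """Guess the parity of x from g^x alone: quadratic residues mod p are
    exactly the values y with y^((p-1)/2) = 1 (mod p)."""
    return 0 if pow(gx, (p - 1) // 2, p) == 1 else 1

# the "hard to invert" function leaks the last bit of its input on every input:
for x in range(p - 1):
    assert exponent_parity(pow(g, x, p)) == x % 2
```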

A second natural attempt would be to take the parity of all bits $b(x) \coloneqq x_1 \xor \cdots \xor x_n$. Indeed, at least one bit has to be somewhat hard to guess (otherwise we would have some success in guessing $x$ itself). But this can also fail in some cases: again, taking a one-way permutation $f$ on $n-1$ bits, you can make $f'(x) \coloneqq f(x_{[n-1]}) \circ (x_1 \xor \cdots \xor x_n)$.[4]

Intuitively, this last attempt failed because we picked a specific parity, which can be revealed by the one-way function if it’s feeling contrarian. But it seems like a random parity (a parity of a random subset of the variables $x_i$) might be hard to guess? What does this even mean?

The way to make this intuition formal is to cheat a little bit. We won’t give a hardcore bit for $f$ itself, but for a variant of it. Concretely, split the input into two halves $x \in \zo^n$ and $y \in \zo^n$. If $f$ is a one-way permutation on $n$ bits, we can form the one-way permutation

\[f'(x,y) \coloneqq f(x) \circ y\]

(even though we’re giving the last $n$ bits away, this is secure because $x$ is hard to guess from $f(x)$). We can now define

\[b(x,y) \coloneqq x \cdot y,\]

where $x \cdot y = x_1y_1 \xor \cdots \xor x_ny_n$ is the inner product between $x$ and $y$. Intuitively, $y$ describes which indices of $x$ should be included in the parity (this will be known to the adversary), and $b(x,y)$ computes that parity.[5]
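With bit strings represented as Python ints, this hardcore bit is one line (`inner_product_bit` is just my name for $b$ here):

```python
def inner_product_bit(x: int, y: int) -> int:
    # <x, y> mod 2: the mask y selects which bits of x enter the parity
    return bin(x & y).count("1") % 2

assert inner_product_bit(0b1011, 0b0001) == 1  # parity of just the lowest bit of x
assert inner_product_bit(0b1011, 0b0100) == 0  # that bit of x happens to be 0
assert inner_product_bit(0b1011, 0b1111) == 1  # parity of all the bits of x
```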

How can we show that $x \cdot y$ is hard to guess given $f(x)$ and $y$? As is usual in crypto, the solution is to prove the contrapositive by giving an algorithm that breaks our assumption. Suppose you have some adversary $A$ that can compute $x \cdot y$ given $f(x)$ and $y$ (when $x,y$ are drawn uniformly); we will build some adversary $A'$ that can recover $x$ given $f(x)$.

Attempt 1: wishfully hoping that things go right

Now, some of the difficulty in this is that $A$ cannot really “compute” $x \cdot y$ given $f(x)$ and $y$. All it can do is make a pretty mediocre guess. But for now, assume that it can make a pretty good guess: let’s say that for this $x$, it gives the right answer on a $\geq 1-1/n^2$ fraction of values of $y$. Then here is a strategy:

  • receive $f(x)$;
  • draw $2n$ different values for $y$ uniformly at random: $y^{(1)}, \ldots, y^{(2n)}$;
  • run $A$ on inputs $(f(x), y^{(1)}), \ldots, (f(x), y^{(2n)})$;
  • if everything went well, this gives us $x \cdot y^{(1)}, \ldots, x \cdot y^{(2n)}$;
  • based on this “system” of $2n$ “equations”, find $x$ using Gaussian elimination.

For this, two things need to work well.

  • $y^{(1)}, \ldots, y^{(2n)}$ need to span all of $\zo^n$. This is true with high probability (I think something like $\geq 1-2^{-\Omega(n)}$).
  • $A$ needs to succeed on all $2n$ queries that were made to it. Since all $y^{(j)}$ are uniform, each query has probability at least $1 - 1/n^2$ to succeed, so by a union bound, they all succeed simultaneously with probability at least $1-2/n$.

Overall, we get probability of success at least $1-2^{-\Omega(n)} - 2/n$, which is significantly bigger than the $1/\poly(n)$ we were aiming for.
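Here is the wishful strategy as a Python sketch, with strings stored as ints and an oracle that always answers correctly (the $\geq 1 - 1/n^2$ case is morally the same); `solve_gf2` does Gaussian elimination over $\F_2$:

```python
import random

def solve_gf2(equations, n):
    """Gaussian elimination over F_2. `equations` is a list of pairs (y, b)
    asserting <x, y> = b (mod 2); returns x if the y's span {0,1}^n, else None."""
    rows = list(equations)
    pivots = []
    for bit in range(n - 1, -1, -1):
        r = len(pivots)
        for i in range(r, len(rows)):
            if (rows[i][0] >> bit) & 1:
                rows[i], rows[r] = rows[r], rows[i]
                py, pb = rows[r]
                for j in range(len(rows)):       # eliminate this bit everywhere else
                    if j != r and (rows[j][0] >> bit) & 1:
                        rows[j] = (rows[j][0] ^ py, rows[j][1] ^ pb)
                pivots.append(bit)
                break
    if len(pivots) < n:
        return None
    x = 0
    for (y, b), bit in zip(rows, pivots):        # each row is now (e_bit, x_bit)
        if b:
            x |= 1 << bit
    return x

random.seed(0)
n = 16
x = random.getrandbits(n)                        # the secret hidden behind f(x)
eqs, x_rec = [], None
while x_rec is None:                             # ~2n random equations suffice w.h.p.
    y = random.getrandbits(n)
    eqs.append((y, bin(x & y).count("1") % 2))   # A's (here: always correct) answer
    x_rec = solve_gf2(eqs, n)
assert x_rec == x
```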

Attempt 2: getting strategic

If $A$ only guesses $x\cdot y$ with probability slightly bigger than $1/2$, then we cannot hope that all the guesses $x \cdot y^{(j)}$ will be true simultaneously, so we cannot guarantee that all “equations” in the “system” will be correct, and Gaussian elimination cannot be guaranteed to work.

But we still have the guarantee that most equations will be correct. Could we still recover $x$ from a random set of noisy equations? Well, yes and no. Yes in the sense that if you’re given enough noisy equations,[6] you can recover $x$ with high confidence. But no in the sense that this seems to require a lot of time and equations.[7] :(

So what do you do? Well, if we can’t count on the problem being solvable given a random set of equations, we need to be more strategic about which equations we ask for. But we can’t just ask arbitrary equations and expect to get the right answer most of the time, because we only have the guarantee that $A$ will give the right answer on most values of $x,y$. So we have to walk a fine line between being strategic about which queries we ask $A$, and making sure that we’re still sticking to a uniform distribution most of the time.

More precisely, we’ll assume that for this $x$, $A$ gives the right answer for $90\%$ of the $y$’s. Let’s be modest and let’s say we just want to figure out a specific coordinate $x_i$. Suppose we draw some string $y$ at random, then call $A$ on $(f(x),y)$ and $(f(x), y \xor e_i)$ (where $e_i$ is the $i\th$ unit vector). If both queries succeed, we can then get $x_i$ by computing

\[A(f(x),y) \xor A(f(x),y \xor e_i) = x \cdot y \xor x \cdot (y \xor e_i) = x \cdot y \xor x \cdot y \xor x \cdot e_i = x \cdot e_i = x_i.\]

Note that $y$ and $y \xor e_i$ are both (separately) uniform in $\zo^n$, so each query will output the right answer with probability $\geq 90\%$. Thus, by a union bound, there is a $\geq 80\%$ chance that both queries return the right answer. This means that if we do this enough times and take the majority answer, we can find $x_i$ with high confidence!
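A quick simulation of this two-query trick. One caveat: the simulated adversary below flips its answer independently $10\%$ of the time, which is an idealization; in the real argument the errors depend on $y$, but the union bound above holds either way.

```python
import random

random.seed(1)
n = 8
x = random.getrandbits(n)       # the secret

def A(y: int) -> int:
    # guesses <x, y> mod 2, but answers wrong 10% of the time
    truth = bin(x & y).count("1") % 2
    return truth ^ (random.random() < 0.10)

def guess_bit(i: int, trials: int = 500) -> int:
    votes = 0
    for _ in range(trials):
        y = random.getrandbits(n)
        # y and y ^ e_i are each uniform; both answers correct w.p. >= 80%,
        # in which case the XOR below equals x_i
        votes += A(y) ^ A(y ^ (1 << i))
    return int(votes > trials // 2)  # majority vote

assert all(guess_bit(i) == (x >> i) & 1 for i in range(n))
```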

But if $A$ only gives the right answer for $3/4$ of the strings $y$ or less, then this doesn’t give anything. :(

Attempt 3: a bit of magical assistance

The previous attempt (along with all naive extensions I was able to come up with) is fundamentally limited by the fact that it needs two queries to $A$ to simultaneously give the right answer in order to work. If only we could get rid of one of them, and get a decent guess at $x_i$ from a single query to $A$!

Let’s say that an angel descends from the sky and presents us with $m$ random strings $y^{(1)}, \ldots, y^{(m)}$, along with the corresponding parities $x \cdot y^{(1)}, \ldots, x \cdot y^{(m)}$. Ignoring for a moment the fact that we could then use Gaussian elimination to recover all of $x$ directly, let’s think about how this could help us recover $x_i$ with the technique from before, but now with $A$ only giving the right answer on a $1/2 + 1/\poly(n)$ fraction of strings $y$.

In particular, say $A$ gives the correct answer $1/2 + 1/q(n)$ fraction of the time, where $q(n) = \poly(n)$. Then we can query $A$ on $(f(x), y^{(1)} \xor e_i), \ldots, (f(x), y^{(m)} \xor e_i)$, and from each of those, get an independent guess for $x_i$ which has a $1/2 + 1/q(n)$ chance to be correct. Then as long as $m$ is on the order of $q(n)^2$, we can recover $x_i$ with high probability. And if we push $m$ up to roughly $q(n)^2 \log n$, we could even recover all bits of $x$ correctly by repeating the process and applying a union bound!

Unfortunately, there is no magic in the world. If we tried to just randomly guess the values of those $m$ parities, we would only have a $2^{-m}$ chance of succeeding, which is even worse than guessing $x$ directly.

But is there a more “efficient” way to guess a lot of parities (maybe sacrificing a little bit of their independence)? It would never have occurred to me to ask this question, but it turns out the answer is yes! What you can do is generate a small number of random strings $s^{(1)}, \ldots, s^{(\log m)}$, guess the parities $s^{(1)} \cdot x, \ldots, s^{(\log m)} \cdot x$, then let $y^{(1)}, \ldots, y^{(m)}$ be all the linear combinations of those $\log m$ strings. This means we can correctly guess the values of all $m$ parities with probability $1/m$ instead of $2^{-m}$!

Now, those strings $y^{(j)}$ aren’t fully independent anymore, but they are still (individually) uniform and they’re pairwise independent. This means we can’t apply Chernoff bounds to recover $x_i$ anymore, but we can still apply Chebyshev! In particular, by setting something like $m \coloneqq 100 \cdot n \cdot q(n)^2$, we can recover each bit $x_i$ with probability $\geq 1-1/(2n)$, and thus (by a union bound over the $n$ bits) recover all of $x$ with probability $\geq 1/2$.
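A sketch of the seed trick in Python. Here we compute the $\log m$ seed parities directly instead of guessing them (in the real reduction they are guessed, and all $2^k$ guesses are simultaneously correct with probability $2^{-k} = 1/m$), just to check the bookkeeping: every linear combination’s parity comes for free by linearity.

```python
import random

random.seed(2)
n, k = 8, 4                     # k = log m seeds, giving m = 2^k derived strings
x = random.getrandbits(n)       # the secret

seeds = [random.getrandbits(n) for _ in range(k)]
# in the real reduction these k parities are guessed; here we compute them
seed_par = [bin(x & s).count("1") % 2 for s in seeds]

for mask in range(1, 1 << k):   # every nonempty subset of the seeds
    y, guess = 0, 0
    for j in range(k):
        if (mask >> j) & 1:
            y ^= seeds[j]         # y^(mask) = XOR of the chosen seeds
            guess ^= seed_par[j]  # its parity = XOR of the seed parities
    # linearity of the inner product: <x, s XOR s'> = <x, s> XOR <x, s'>
    assert guess == bin(x & y).count("1") % 2
```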

Overall, this gives us a way to recover $x$ from $f(x)$ with probability $\geq (1/m) \cdot (1/2) \geq 1/\poly(n)$, which completes the proof!

  1. Think something like $l = n^{10}$, but here we’ll only prove $l=n+1$ (which is still nontrivial and cool, but not as useful). 

  2. That is, $\set{g^0, g^1, \ldots, g^{p-2}} = \set{1, \ldots, p-1}$ modulo $p$. 

  3. In practice, you can just choose some $p$ that’s very close to (but less than) a power of $2$, and map any inputs greater than $p-1$ to themselves. Since there will be very few of those, it won’t undermine the one-wayness of the function by much. 

  4. Since $x_{n}$ is independent of the previous bits, the parity $x_1 \oplus \cdots \oplus x_{n}$ gives no information at all about $x_{[n-1]}$. So all we know about $x_{[n-1]}$ is $f(x_{[n-1]})$, and therefore it is hard to guess $x_{[n-1]}$. 

  5. Note that this technically only works for extending even-length strings by one. But given a pseudorandom generator $g:\zo^n \to \zo^{n+1}$, it’s very easy to make a pseudorandom generator $g' : \zo^{n+1} \to \zo^{n+2}$: just add a “dummy bit” that gets copied from the input to the output: $g'(x) \coloneqq g(x_{[n]}) \circ x_{n+1}$. It’s clear that if you can distinguish $g'(U_{n+1})$ from $U_{n+2}$, then you can also distinguish $g(U_n)$ from $U_{n+1}$. 

  6. And assuming that the noise is chosen independently in each equation, which we shouldn’t assume here: we only know that $A$ succeeds with probability greater than $1/2$ on average, not independently over each input (whatever that means). 

  7. In fact, the natural generalization of this problem to $\F_q$ instead of $\F_2$ is called “learning with errors” and many cryptographic schemes assume that it is hard! Also, please don’t quote me on this, I’ve only done some shallow googling; this Wikipedia article seems relevant if you want to look into it.