Given a convex function $f$, its convex conjugate $f^*(a)$ gives the (negative) intercept of the tangent to $f$ of slope $a$:

$$f^*(a) = \sup_t \big(at - f(t)\big)$$

#figure

This gives an alternate way to describe $f$, not as a series of points $(t, f(t))$ drawing its curve, but as a series of lines $\{at - f^*(a) \mid t \in \mathbb{R}\}$, one for each slope $a$, hugging its curve.
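As a quick sanity check of the definition, here is a minimal sketch that approximates the supremum by a maximum over a finite grid. The example function $f(t) = t^2$, the grid bounds, and the helper name `conjugate_on_grid` are all assumptions chosen for illustration; for this $f$ the conjugate has the closed form $f^*(a) = a^2/4$, which the grid value should match closely.

```python
def conjugate_on_grid(f, a, ts):
    """Approximate f*(a) = sup_t (a*t - f(t)) by a maximum over the grid ts."""
    return max(a * t - f(t) for t in ts)

def f(t):
    # Example convex function (an assumption for this sketch).
    return t * t

# Grid on [-10, 10] with step 0.001, wide enough to contain the maximizer t = a/2.
ts = [k / 1000 for k in range(-10000, 10001)]
slopes = [-4.0, -1.0, 0.0, 0.5, 3.0]

# Largest gap between the grid approximation and the closed form a^2 / 4.
max_error = max(abs(conjugate_on_grid(f, a, ts) - a * a / 4) for a in slopes)
print(max_error)
```

The grid only works when the maximizing $t$ lies inside it, so the bounds have to be widened for steeper slopes.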

Duality

Put another way, $f^*$ is minimized under the constraint that

$$f(t) + f^*(a) \ge at, \tag{$*$}$$

which shows by symmetry that $f$ is also $f^*$'s convex conjugate: indeed, there is also no way to make $f$ smaller without lowering one of its tangents, which would increase $f^*(a)$, so $f$ is also minimized under constraint $(*)$.
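The symmetry can be checked numerically. The sketch below again assumes the example pair $f(t) = t^2$ and $f^*(a) = a^2/4$ (the closed form is taken as given, not derived here). It verifies that the constraint holds on a grid of $(t, a)$ pairs, and that conjugating $f^*$ a second time gives back $f$ at a few sample points.

```python
def f(t):
    return t * t

def f_star(a):
    # Closed-form conjugate of t^2, assumed here rather than derived.
    return a * a / 4

grid = [k / 100 for k in range(-600, 601)]   # [-6, 6] with step 0.01

# Constraint (*): f(t) + f*(a) - a*t should never be negative.
smallest_gap = min(f(t) + f_star(a) - a * t for t in grid[::20] for a in grid)

# Conjugating f* again: sup_a (a*t - f*(a)) should give back f(t).
points = [-2.0, -0.5, 0.0, 1.0, 2.5]
biconjugate_error = max(
    abs(max(a * t - f_star(a) for a in grid) - f(t)) for t in points
)
print(smallest_gap, biconjugate_error)
```

The gap reaches zero exactly when the line of slope $a$ is tangent at $t$, which for this example means $a = 2t$.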

Derivatives

For a moment, suppose that $f$ is differentiable and strictly convex, which means $f'(t)$ exists and is continuous and strictly increasing.

Consider the tangent of slope $a$, and suppose it touches $f$ at $t$ (which means $\frac{df}{dt}(t) = a$). Then intuitively, if we slightly increase the slope $a$, gliding the tangent further along $f$, the intercept will decrease at rate $t$, since it is a horizontal distance $t$ away from the touching point, meaning that $\frac{df^*}{da}(a) = t$. That is, the derivatives of $f$ and $f^*$ are inverse functions of each other, and in particular this shows that $f^*$ is also convex.

#figure

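Slightly more formally, since $f$ is smooth, we have $f(t') \approx at' - f^*(a)$ for $t'$ close to $t$, so for a small slope increase $\epsilon$, if we take the point $t'$ such that $\frac{df}{dt}(t') = a + \epsilon$, we have

$$f^*(a + \epsilon) = (a + \epsilon)t' - f(t') \approx f^*(a) + \epsilon t' \approx f^*(a) + \epsilon t,$$

which shows that $\frac{df^*}{da}(a) = t$ as desired.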
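To see the inverse relationship on a concrete case, the sketch below assumes $f(t) = e^t$, whose tangent of slope $a > 0$ touches at $t = \ln a$, giving $f^*(a) = a \ln a - a$. It estimates $\frac{df^*}{da}$ with a central finite difference (the step size `eps` is an arbitrary choice) and compares it to the touching point $t$.

```python
import math

def f(t):
    # Example strictly convex function (an assumption for this sketch).
    return math.exp(t)

def f_star(a):
    # The tangent of slope a touches f where f'(t) = exp(t) = a, i.e. t = log(a),
    # so f*(a) = a*t - f(t) = a*log(a) - a (valid for a > 0).
    t = math.log(a)
    return a * t - f(t)

def slope_of_f_star(a, eps=1e-6):
    # Central finite difference approximating df*/da at a.
    return (f_star(a + eps) - f_star(a - eps)) / (2 * eps)

touch_points = [-1.0, 0.0, 0.5, 2.0]
# For each touching point t, the slope there is a = f'(t) = exp(t);
# the derivative of f* at that slope should return t.
max_error = max(abs(slope_of_f_star(math.exp(t)) - t) for t in touch_points)
print(max_error)
```

Here $f'(t) = e^t$ and $(f^*)'(a) = \ln a$ are visibly inverse functions, and since $\ln$ is increasing, $f^*$ is convex on $a > 0$.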