
A few months back I managed to find an article presenting the Legendre transform in a satisfactory way, which in essence says:

The Legendre transform $f^*$ of a (strictly) convex $C^2$ function $f$ is a "reinterpretation" of the function in terms of its slope at every point, possible due to injectivity of the convex function's derivative.

Since then I have barely touched any related material, and a few days ago I tried to reconstruct the definition on my own. Assuming $f\in C^2,f''>0$, it didn't take me long to pin the description to the following formula:

$$f^*(s)=f\left((f')^{-1}(s)\right),$$

which turned out to be wrong (I use $s$ instead of $p$ to remind myself it's the slope).

I can easily deduce involutivity and $(f^*)'=(f')^{-1}$ from the correct formula $f^*(s)=x(s)\cdot s-f(x(s))$ where $x(s)=(f')^{-1}(s)$, but I fail to see how these properties determine what the correct formula should be.
I also understand the geometric picture where $f^*(s)$ is the y-intercept of the tangent to $f$ at $x$, but that's just a visual representation of the correct formula and does not explain it any further (it is not clear why we should take the y-intercept instead of e.g. the x-intercept or the intersection with any other line).

Here are my questions. Assume $f\in C^2,f''>0$ for simplicity throughout the question; the generalization to $\max_x(sx-f(x))$ then follows easily.

  1. Why/how is my wrong formula wrong? I actually managed to partially answer this question myself, and share it as an answer below.
  2. Is there a natural way to fix the wrong formula? What reasoning can I use to deduce that I should subtract my formula specifically from $x(s)\cdot s$ to make it involutive (and not, say, multiply it by $\exp(f'(x))$ or whatever instead)?
  3. Alternatively, if there's no easy way to fix it, we can start afresh - is there really only one Legendre transform? Can we deduce its formula only from the two properties, involutivity and $(f^*)'=(f')^{-1}$? In fact, are these two really defining properties for the transform?

I'm chasing classical mechanics with a look at QM and just want to have a perfectly clear view of what exactly physicists do when they change variables and transform the Lagrangian into a Hamiltonian. The paper where I read about the "reinterpretation" of $f$ is here: https://arxiv.org/abs/0806.1147. Another paper with a similar line of thought is this one: https://www.andrew.cmu.edu/course/33-765/pdf/Legendre.pdf, which even spends a paragraph on the wrong formula, but (sadly) then directly reveals the correct answer.


My gut feeling is that my formula does some horizontal squish-&-stretch on the graph of $f$, destroying convexity (I checked with $x\ln x$, it transforms into $(s-1)\exp(s-1)$ which is concave for $s<-1$). And somehow subtracting it from $s\cdot x(s)$ is a way to bring convexity back. Is it the only way?
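The $x\ln x$ check can be reproduced numerically. Below is a minimal Python sketch, not a proof: the helper names (`wrong_transform`, `legendre`, `second_derivative`), the step size and the sample points are illustrative choices of mine, and the second derivative is only estimated by a central finite difference.

```python
import math

def f(x):
    # f(x) = x ln x on x > 0, so f'(x) = ln x + 1
    return x * math.log(x)

def x_of_s(s):
    # x(s) = (f')^{-1}(s): solve ln x + 1 = s for x
    return math.exp(s - 1.0)

def wrong_transform(s):
    # the incorrect formula f((f')^{-1}(s))
    return f(x_of_s(s))

def legendre(s):
    # the correct formula x(s) * s - f(x(s))
    return x_of_s(s) * s - f(x_of_s(s))

def second_derivative(g, s, h=1e-4):
    # central finite-difference estimate of g''(s) (an approximation only)
    return (g(s + h) - 2.0 * g(s) + g(s - h)) / (h * h)

for s in (-3.0, -2.0, 0.0, 1.0):
    print(s, second_derivative(wrong_transform, s), second_derivative(legendre, s))
```

At the sample points with $s<-1$ the estimate for the wrong formula should come out negative, while the estimate for the correct formula should stay positive at every sample point.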

Al.G.
  • 1,802

2 Answers


For $f\in C^{2}\left[a,b\right],f''>0$, define $$f^{*}(s)=f\left(\left(f'\right)^{-1}(s)\right).$$ We'll find which $f$ satisfy $(f^*)^*=f$.

Define the slope function $s(x):=f'(x)$ and its inverse $x(s):=\left(f'\right)^{-1}(s)$. From now on, $x$ or $s$ directly followed by parentheses denotes the function; otherwise it denotes the variable. Thus $f^{*}(s)=f(x(s))$, and by the chain rule $\left(f^{*}\right)'(s)=f'(x(s))\cdot x'(s)=s\cdot x'(s)$ (we will need this later).

Now we'll find when $f^{**}=f$. Evaluate

$$f^{**}(x) =f^{*}\left(\left(\left(f^{*}\right)'\right)^{-1}(x)\right) =f\left(x\left(\left(\left(f^{*}\right)'\right)^{-1}(x)\right)\right).$$

We want to find when the argument passed to $f$ in the last expression equals $x$ for all $x$. Transform

$$\begin{align} x\left(\left(\left(f^{*}\right)'\right)^{-1}(x)\right) &=x\\ \left(\left(f^{*}\right)'\right)^{-1}(x) &=s(x)\qquad\text{(act with $s(x)$)}\\ \text{(act with $\left(f^{*}\right)'$)}\qquad x &=\left(\left(f^{*}\right)'\right)(s(x))\\ \text{(let $x=x(s)$)}\qquad x(s) &=\left(\left(f^{*}\right)'\right)(s)\\ x(s) &=s\cdot x'(s) \end{align}$$

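That is, $\dfrac{x'(s)}{x(s)}=\dfrac{1}{s}$, with solutions $x(s)=Cs$, where $C>0$ because $x'(s)=1/f''(x(s))>0$. Inverting, $s(x)=f'(x)=x/C$, and integrating gives $f(x)=\dfrac{x^{2}}{2C}+D$. Renaming the constants, $$f(x)=Cx^{2}+D,\qquad C>0.$$

That is, by the argument above the wrong formula is involutive only on the set of positive scalar multiples of $x^2$, up to an additive constant.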
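This conclusion can be spot-checked numerically. The Python sketch below is only an illustration under assumptions of mine: the helper names `bisect_increasing` and `double_wrong_transform`, the interval $[0.5,2]$, and the test functions $3x^2$ and $x^4$ are not from the derivation above. It inverts $f'$ and $(f^*)'$ by bisection, using $(f^*)'(s)=s\cdot x'(s)=s/f''(x(s))$, and assumes $(f^*)'$ is increasing on the range of slopes.

```python
def bisect_increasing(g, y, lo, hi, iters=200):
    """Solve g(t) = y by bisection, assuming g is increasing on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def double_wrong_transform(f, df, d2f, x_lo, x_hi):
    """Return t -> f**(t) for the wrong formula f*(s) = f((f')^{-1}(s))."""
    s_lo, s_hi = df(x_lo), df(x_hi)

    def x_of_s(s):
        # x(s) = (f')^{-1}(s)
        return bisect_increasing(df, s, x_lo, x_hi)

    def f_star(s):
        # f*(s) = f(x(s))
        return f(x_of_s(s))

    def df_star(s):
        # (f*)'(s) = s * x'(s) = s / f''(x(s)); assumed increasing in s
        return s / d2f(x_of_s(s))

    def f_star_star(t):
        # f**(t) = f*(((f*)')^{-1}(t))
        return f_star(bisect_increasing(df_star, t, s_lo, s_hi))

    return f_star_star

quad = double_wrong_transform(lambda x: 3 * x * x, lambda x: 6 * x,
                              lambda x: 6.0, 0.5, 2.0)
quart = double_wrong_transform(lambda x: x ** 4, lambda x: 4 * x ** 3,
                               lambda x: 12 * x * x, 0.5, 2.0)

t = 0.6
print(quad(t), 3 * t * t)   # the two numbers should agree
print(quart(t), t ** 4)     # the two numbers should differ
```

For $f(x)=x^4$ a hand computation gives $f^{**}(x)=81x^4$, so the quartic is not a fixed point of the doubled wrong formula, in line with the result above.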

Al.G.

I found a (not too arbitrary) way to derive the Legendre transform, starting with the following assumptions: for a function $f(x)$ with an invertible derivative $p(x)=f'(x)$ (so that $x(p)$ is well-defined), find a function $g(p)$ having $g'(p)=x(p)$.

To put stress on the symmetry, we want to transform $$\array{f(x)\\p(x)=f'(x)} \Longrightarrow \array{g(p)\\x(p)=g'(p)}.$$ This way of writing it was inspired by a physics book where the goal is to transform the Lagrangian $L(q,\dot q)$ with $p=\frac{\partial L}{\partial \dot q}$ into $H(q,p)$ so that $\dot q=\frac{\partial H}{\partial p}$.

Now, given $f$, an expression for $g(p)$ would somehow depend on $f$ and $p$. Put $$g(p)=\varphi\left(f(x(p)),p\right),$$ and apply the condition $g'(p)=x(p)$. Calculate $$\begin{align} g'(p)&=\frac{d}{dp}\varphi(f(x(p)),p)\\ &=\varphi_f'\cdot f'(x(p)) \cdot \frac{dx}{dp}+\varphi'_p\\ &=\varphi_f'\cdot p\cdot \frac{dx}{dp}(p)+\varphi_p' \end{align}$$ For the last equality we used $f'(x(p))=p$; and $\varphi'_p,\varphi'_f$ denote the partial derivatives of $\varphi$ evaluated at $(f(x(p)),p)$. Applying $g'(p)=x(p)$ we get \begin{align} \varphi'_p &= x(p)-\varphi'_f\cdot p\cdot\frac{dx}{dp}\\ &= x(p)-\varphi'_f\cdot\left(\frac{d(x\cdot p)}{dp}-x(p)\right)\\ &= (1+\varphi'_f)x(p)-\frac{d(x\cdot p)}{dp} \varphi'_f. \end{align}

We'd like to integrate the RHS with respect to $p$ to get $\varphi$ (actually only its dependence on $p$). The $(x\cdot p)'_p$ term is ready to integrate, so we're tempted to put $\varphi'_f=-1$, which eliminates the first term. The above equation then reduces to just $\varphi'_p= (x(p)\cdot p)'_p$. Hence $\varphi(f(x(p)),p)$ should contain a $-f(x(p))$ term and an $x(p)\cdot p$ term, so (ignoring possible constants of integration) $$g(p)=x(p)\cdot p-f(x(p)),$$ which is the traditional Legendre transform (for functions with an invertible derivative). In this sense, the Legendre transform is the simplest (to derive) function $g(p)$ having $x(p)=g'(p)$.
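As a sanity check on the resulting formula, here is a small Python sketch. It is an illustration only: the helper names `bisect_increasing` and `legendre`, the test function $f(x)=x\ln x$ and the interval $[0.1,10]$ are assumptions of mine. It builds $g(p)=x(p)\cdot p-f(x(p))$ with $x(p)$ found by bisection, compares a finite-difference estimate of $g'(p)$ with $x(p)$, and then applies the same construction to $g$ (using $g'=x(p)$) to see whether $f$ comes back.

```python
import math

def bisect_increasing(g, y, lo, hi, iters=200):
    """Solve g(t) = y by bisection, assuming g is increasing on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def legendre(f, df, x_lo, x_hi):
    """Return (g, x_of_p) with x(p) = (f')^{-1}(p) and g(p) = x(p)*p - f(x(p))."""
    def x_of_p(p):
        return bisect_increasing(df, p, x_lo, x_hi)

    def g(p):
        x = x_of_p(p)
        return x * p - f(x)

    return g, x_of_p

f = lambda x: x * math.log(x)      # test function, convex on x > 0
df = lambda x: math.log(x) + 1.0   # its derivative p(x) = f'(x)

g, x_of_p = legendre(f, df, 0.1, 10.0)

# Compare a central-difference estimate of g'(p) with x(p).
p, h = 1.0, 1e-5
print((g(p + h) - g(p - h)) / (2 * h), x_of_p(p))

# Transform g in the same way, taking g'(p) = x(p) as its derivative.
g2, _ = legendre(g, x_of_p, df(0.1), df(10.0))
print(g2(2.0), f(2.0))             # the two numbers should agree
```

For this particular $f$ the transform has the closed form $g(p)=e^{p-1}$, which makes the property $g'(p)=x(p)$ easy to confirm by hand as well.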

Al.G.