
The Legendre transform of convex functions $\mathbb R^n\to\mathbb R$ can be given a nice geometric interpretation as a way to characterise a function via the set of tangent hyperplanes to its graph.

The Wikipedia page mentions that this idea can be generalised to define the Legendre transform of smooth functions $L:E\to\mathbb R$ for a generic vector bundle $\pi:E\to M$ with $M$ a smooth manifold. I'm trying to understand how (or if?) this definition connects with the geometric one mentioned above.

In the Wiki page, they define the Legendre transform of $L:E\to \mathbb R$ as the smooth morphism $$\mathbf FL:E\to E^*,$$ such that, for all $v\in E$, $$\mathbf FL(v) \equiv \mathrm d(L|_{E_x})(v),$$ where $L|_{E_x}:E_x\to\mathbb R$ is the restriction of $L$ to the fiber $E_x$ over $x\in M$ such that $x=\pi(v)$ (so that $v\in E_x$). As pointed out in the Wikipedia page, this means that $$(\mathbf FL(v)) (w) = (\mathrm d(L|_{E_x})(v))(w) = \partial_t|_0 L(v + tw)\in\mathbb R.$$

There is then some more explanation about what this looks like in local coordinates: $$\mathbf FL(x;v_1,...,v_r) = (x; p_1,...,p_r), \quad\text{where}\quad p_i = \frac{\partial L}{\partial v_i}(x; v_1,...,v_r).$$
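
To make the two formulas above concrete, here is a minimal numerical sketch. The function `L` on the trivial bundle $\mathbb R\times\mathbb R^2$ is a hypothetical example, not one taken from the Wikipedia page, and derivatives are approximated by central differences. It checks that $\partial_t|_0 L(v+tw)$ agrees with $\sum_i p_i w_i$ where $p_i=\partial L/\partial v_i$.

```python
def L(x, v):
    """Hypothetical smooth function on the trivial bundle R x R^2 -> R."""
    v1, v2 = v
    return 0.5 * (1.0 + x * x) * (v1 * v1 + v2 * v2) + v1 * v2

def directional_derivative(x, v, w, h=1e-6):
    """Approximate d/dt L(x; v + t w) at t = 0; the base point x is held fixed."""
    plus = [vi + h * wi for vi, wi in zip(v, w)]
    minus = [vi - h * wi for vi, wi in zip(v, w)]
    return (L(x, plus) - L(x, minus)) / (2.0 * h)

def fiber_derivative(x, v):
    """Fiber coordinates p_i = dL/dv_i of FL(x; v), using unit directions e_i."""
    n = len(v)
    basis = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return [directional_derivative(x, v, e) for e in basis]

x, v, w = 1.0, [1.0, 2.0], [3.0, -1.0]
p = fiber_derivative(x, v)
pairing = sum(pi * wi for pi, wi in zip(p, w))
print(p, pairing, directional_derivative(x, v, w))
```

Only the fiber variable $v$ is varied; the base point $x$ enters as a parameter, which is the sense in which the derivative is taken "along the fiber".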

However, I can't see how this relates (if it does at all) to the intuition from the simple case of tangent planes to the graph of a function. Sure, given $f:\mathbb R\to\mathbb R$, to find $\max_x (px-f(x))$ we solve $p=f'(x_0)$ for $x_0$, but the Legendre transform is then obtained by substituting $x_0=x_0(p)$ into $px_0-f(x_0)$, which does not seem to happen here at all.

glS
  • 7,963

1 Answer


There's a slight mismatch in terminology as you have observed. The term "Legendre transform" is being used for two things with different meanings.

Say $X,Y$ are two vector bundles over $Q$ and $f:X\to Y$ is a smooth fiber-preserving map (not necessarily fiberwise linear). Then, we can define the object $\mathbf{F}f:X\to \text{Hom}(X,Y)$, by setting for each $x\in Q$ and $v\in X_x$, \begin{align} (\mathbf{F}f)(v):= D\left(f|_{X_x}\right)(v)\in \text{Hom}(X_x,Y_x)=\text{Hom}(X,Y)_x \end{align} One can show this is a smooth fiber-bundle morphism. I would prefer to call this object the Fiber Derivative. This is really the appropriate terminology because we're restricting $f$ to the fiber to get the mapping $f|_{X_x}:X_x\to Y_x$ between vector spaces, and we're taking the usual derivative of such an object.

Now, to such a morphism $f:X\to Y$, we can define another morphism $E_f:X\to Y$ as \begin{align} E_f(v):= (\mathbf{F}f(v))(v) - f(v) \end{align} The reason for the symbol $E$ is that it's kind of like the "energy mapping associated to $f$".

Now, suppose that the fiber derivative $\mathbf{F}f:X\to \text{Hom}(X,Y)$ is a fiber-bundle isomorphism (for which it is necessary that $Y$ have one-dimensional vector spaces as its fibers, so that $X_x$ and $\text{Hom}(X,Y)_x$ have the same vector space dimension). In this case, we can consider the mapping $\lambda_f:= E_f\circ (\mathbf{F}f)^{-1}:\text{Hom}(X,Y)\to Y$. Classically, it is this mapping $\lambda_f$ which is called the Legendre-transform of $f$. So, given the function $f$, we consider its energy ($E_f$), and then change variables (compose with $(\mathbf{F}f)^{-1}$).
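
As a sanity check on a single one-dimensional fiber, the following sketch compares $\lambda_f=E_f\circ(\mathbf{F}f)^{-1}$ with the convex-analysis definition $\sup_v\,(pv-f(v))$, and with the intercept of the tangent line of slope $p$. The choice $f=\cosh$, the finite-difference derivative, the bisection bounds and the grid are all assumptions made for illustration.

```python
import math

def fiber_derivative(f, v, h=1e-6):
    """Central-difference approximation of (Ff)(v) = f'(v) on a 1-dimensional fiber."""
    return (f(v + h) - f(v - h)) / (2.0 * h)

def energy(f, v):
    """E_f(v) = (Ff(v))(v) - f(v); on a 1-dimensional fiber the pairing is a product."""
    return fiber_derivative(f, v) * v - f(v)

def invert_fiber_derivative(f, p, lo=-10.0, hi=10.0, iters=100):
    """Solve Ff(v) = p by bisection, assuming Ff is strictly increasing on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if fiber_derivative(f, mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def legendre_transform(f, p):
    """lambda_f(p) = E_f((Ff)^{-1}(p))."""
    return energy(f, invert_fiber_derivative(f, p))

def sup_definition(f, p, lo=-10.0, hi=10.0, n=200_001):
    """Grid maximum of p*v - f(v), the convex-analysis definition."""
    step = (hi - lo) / (n - 1)
    return max(p * (lo + i * step) - f(lo + i * step) for i in range(n))

def tangent_intercept(f, v0):
    """Height at v = 0 of the tangent line to the graph of f at v0."""
    return f(v0) - fiber_derivative(f, v0) * v0

f = math.cosh                       # hypothetical example; Ff = sinh is a bijection of R
p = 2.0
v0 = invert_fiber_derivative(f, p)  # the point whose tangent line has slope p
print(legendre_transform(f, p), sup_definition(f, p), -tangent_intercept(f, v0))
```

The three printed numbers should agree up to discretisation error. The last one is the geometric picture: the tangent line of slope $p$ meets the vertical axis at height $-\lambda_f(p)$, so "energy, then change of variables" encodes the same information as the family of tangent lines to the graph.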


As a special case, suppose $L:TQ\to \Bbb{R}$ is a smooth function (the Lagrangian, which we can trivially think of as a fiber-bundle map $TQ\to Q\times \Bbb{R}$, $v\mapsto (x,L(v))$ for $v\in T_xQ$, hence everything above can be applied). Then the fiber derivative is $\mathbf{F}L:TQ\to T^*Q$; in terms of bundle coordinates, it is \begin{align} (x^1,\dots, x^n,v^1,\dots, v^n)\mapsto \left(x^1,\dots, x^n; \frac{\partial L}{\partial v^1}(x,v),\dots, \frac{\partial L}{\partial v^n}(x,v)\right) \end{align} Now, the energy function is $E=E_L:TQ\to\Bbb{R}$, which by unwinding the definitions, can be written in coordinates as \begin{align} (x,v)\mapsto v^i\frac{\partial L}{\partial v^i}(x,v)-L(x,v) \end{align}

Or if we resort to the classical notation $(q^1,\dots, q^n,\dot{q}^1,\dots, \dot{q}^n)$ for the coordinates on $TQ$, then \begin{align} E&=\dot{q}^i\frac{\partial L}{\partial \dot{q}^i}-L, \end{align} which is the formula for the energy of a Lagrangian system that appears in the first few pages of Landau and Lifshitz, for example.

Now, if we assume that the fiber derivative $\mathbf{F}L:TQ\to T^*Q$ is a diffeomorphism (in which case $L$ is typically called a hyperregular Lagrangian), then we can consider the function $H:= E\circ (\mathbf{F}L)^{-1}:T^*Q\to\Bbb{R}$, and this is what we call the Hamiltonian function associated to the Lagrangian $L$. It is this function $H$ (defined on a completely different space) that is usually referred to as "the Legendre transform of $L$" (in coordinates, people often write $H=\dot{q}^ip_i-L$, with $\dot{q}$ understood as a function of $(q,p)$).
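
For a concrete instance, take the mechanical Lagrangian $L(q,v)=\tfrac12 m v^2-V(q)$ on $TQ$ with $Q=\Bbb{R}$. The sketch below, in which the mass `M` and the pendulum-like `potential` are hypothetical choices, builds $H$ exactly as $E\circ(\mathbf{F}L)^{-1}$ and compares it with the familiar $p^2/(2m)+V(q)$.

```python
import math

M = 2.0  # hypothetical mass

def potential(q):
    """Hypothetical pendulum-like potential V(q)."""
    return 1.0 - math.cos(q)

def lagrangian(q, v):
    """L(q, v) = (1/2) M v^2 - V(q)."""
    return 0.5 * M * v * v - potential(q)

def fiber_derivative(q, v):
    """FL(q, v) = (q, dL/dv) = (q, M v); the base point q is unchanged."""
    return q, M * v

def inverse_fiber_derivative(q, p):
    """(FL)^{-1}(q, p) = (q, p / M); exists because FL is linear and invertible on fibers."""
    return q, p / M

def energy(q, v):
    """E(q, v) = v * dL/dv - L(q, v)."""
    _, p = fiber_derivative(q, v)
    return p * v - lagrangian(q, v)

def hamiltonian(q, p):
    """H = E o (FL)^{-1}, a function on the cotangent bundle."""
    return energy(*inverse_fiber_derivative(q, p))

q, p = 0.7, 3.0
print(hamiltonian(q, p), p * p / (2.0 * M) + potential(q))
```

Here $\mathbf{F}L$ only moves points within a fiber ($v\mapsto mv$ at fixed $q$), while $H$ is a different function living on $T^*Q$; the code keeps the two objects in separate functions for that reason.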

So, there are two things to distinguish: the first is the fiber derivative, the second is the Legendre transformation (which is the composition of the "energy" with the inverse of the fiber derivative). Often though, people use "Legendre transform" to mean either of these things.

peek-a-boo
  • 65,833
  • Thanks. I realise this was also in my question, but I'm now not sure what is meant by the differential in this context. More specifically, here $f|_{X_x}:X_x\to Y$. Taking the differential, I'd expect $D(f|_{X_x}):TX_x\to TY$ to be the map between the tangent spaces. Though I suppose this might not be well-defined if the total space isn't a manifold itself? Is there a special definition of the differential for vector bundles? – glS Sep 01 '21 at 14:41
  • @glS If $V,W$ are Banach spaces and $\phi:V\to W$ is a map, we can talk about its Frechet derivative ($\phi$ is Frechet differentiable at a point $a\in V$ if there exists a continuous linear map $T:V\to W$ such that $\frac{\|\phi(a+h)-\phi(a)-T(h)\|}{\|h\|}\to 0$ as $h\to 0$. In which case, we can prove $T$ is unique and thus denote it as $D\phi_a$ or $D\phi(a)$ or any one of a billion other symbols). Here, $f|_{X_x}:X_x\to Y_x$ is a smooth map between Banach spaces, hence we can consider its Frechet derivative $D(f|_{X_x})_{\xi}\in \text{Hom}(X_x,Y_x)$. – peek-a-boo Sep 01 '21 at 14:54
  • Also, as a side remark: since $X_x$ and $Y_x$ are vector spaces, their tangent spaces at any point are canonically isomorphic to the vector space itself: $T_{\xi}(X_x)\cong X_x$ and likewise for $Y_x$ (this is why the tangent bundle of an open subset of a vector space is a trivial vector bundle). If we let $\iota_1:T_{\xi}(X_x)\to X_x$ and likewise $\iota_2:T_{f(\xi)}(Y_x)\to Y_x$ be the canonical isomorphisms, then by denoting $T_{\xi}(f|_{X_x}):T_{\xi}(X_x)\to T_{f(\xi)}(Y_x)$ to be the tangent map between tangent spaces, it follows by unwinding the definitions that (cont) – peek-a-boo Sep 01 '21 at 14:57
  • the tangent map is isomorphically-related to the Frechet derivative as follows: $T_{\xi}(f|_{X_x})= \iota_2^{-1}\circ D(f|_{X_x})_{\xi}\circ \iota_1$ (i.e. a certain diagram commutes). Also, I just noticed you wrote $f|_{X_x}:X_x\to Y$; but note that since $f$ is a fiber-preserving map, we can actually shrink the target space to $Y_x$ as well (a fact which I heavily invoked). Smoothness of $f:X\to Y$ implies that of the restriction $f|_{X_x}:X_x\to Y_x$ because the fibers are embedded submanifolds of the total space (so smoothness of restrictions is a standard theorem). – peek-a-boo Sep 01 '21 at 14:59
  • (and one other remark: every Banach space can be considered as a manifold, modelled over itself, simply by considering the maximal atlas containing the identity chart. So, things like $\Bbb{R}^n$, $M_{m\times n}(\Bbb{R})$, etc are all examples of smooth manifolds. Furthermore, smoothness in the sense of Frechet, in the Banach space context is equivalent to smoothness in the manifold sense, simply because the manifold structure is defined using the identity chart... so everything fits together nicely). – peek-a-boo Sep 01 '21 at 15:08
  • ah, I understand now. Thanks for the clarifications. Another thing; do you know if, in all this, the geometric interpretation in the original formulation is preserved? I mean if there is a meaningful way to interpret this in terms of tangent planes to some graph. – glS Sep 01 '21 at 16:14
  • @glS whatever intuition you have in the $\Bbb{R}^n\to \Bbb{R}$ case, just imagine that the domain is an arbitrary vector space $V$. Now for the vector bundle case, we're essentially doing the basic Legendre-transform fiber by fiber, so just apply your intuition fiber by fiber. – peek-a-boo Sep 01 '21 at 22:10