
Firstly excuse any sloppiness here -- I'm not a mathematician by training so I've had a difficult time formalizing my question and tracking down relevant material.

Consider a point in a smooth manifold, $p \in M$. Linear approximations to functions passing through that point span the first-order jet space, $J_p$, which can be interpreted as linear maps on the tangent space at that point, $T_p M$. Composition of these functions induces a straightforward compositional algebra of jets: in local coordinates, the corresponding Jacobians are multiplied together.

Multiplying a compositional Jacobian by an element of the tangent space gives directional derivatives. This is exactly the action of first-order forward-mode automatic differentiation. We can also take the adjoint of this compositional Jacobian and multiply it by an element of the cotangent space to give gradients. This is similarly the action of first-order reverse-mode automatic differentiation. Everything at first order is wonderful.
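To make the first-order picture concrete, here is a minimal sketch in plain Python. The maps `g` and `h` are hypothetical examples chosen only for illustration (they are not from the nomad manual), and the Jacobians are written out by hand rather than produced by an AD library. Forward mode pushes a tangent vector through the Jacobians in composition order; reverse mode pulls a covector back through the transposed Jacobians in the opposite order.

```python
import math

# Hypothetical example maps (assumptions for illustration only):
#   g: R^2 -> R^2, g(x0, x1) = (x0 * x1, sin(x0))
#   h: R^2 -> R,   h(y0, y1) = y0**2 + y1

def g(x):
    return [x[0] * x[1], math.sin(x[0])]

def jac_g(x):
    # Hand-written Jacobian of g at x.
    return [[x[1], x[0]], [math.cos(x[0]), 0.0]]

def jac_h(y):
    # Hand-written Jacobian of h at y (a single row).
    return [[2.0 * y[0], 1.0]]

def matvec(A, u):
    # A u: Jacobian acting on a tangent vector.
    return [sum(a * b for a, b in zip(row, u)) for row in A]

def vecmat(w, A):
    # w^T A: transposed Jacobian acting on a covector.
    return [sum(w[i] * A[i][j] for i in range(len(A))) for j in range(len(A[0]))]

def forward(x, u):
    # Forward mode: push u through g, then through h.
    return matvec(jac_h(g(x)), matvec(jac_g(x), u))

def reverse(x, w):
    # Reverse mode: pull w back through h, then through g.
    return vecmat(vecmat(w, jac_h(g(x))), jac_g(x))

x = [0.5, 2.0]
u = [1.0, -1.0]
w = [1.0]

directional = forward(x, u)  # J u, a directional derivative of h∘g
gradient = reverse(x, w)     # J^T w, the gradient of h∘g when w = [1]

# The two modes are adjoint to each other: <w, J u> = <J^T w, u>.
pairing_fwd = sum(a * b for a, b in zip(w, directional))
pairing_rev = sum(a * b for a, b in zip(gradient, u))
```

The equality of `pairing_fwd` and `pairing_rev` is the first-order adjoint relationship that the rest of the question tries to generalize.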

My question concerns what happens for higher-order approximations to functions and the compositional structure of the corresponding higher-order jets.

At higher orders the partial derivatives of each order will in general mix whenever two jets are composed, and the composite object then acts on a direct product of higher-order tangent bundles. For example, at second order the composite map eats two elements of the first-order tangent space, $u_1, u_2 \in T_p M$, and one element $v \in T^{2}_p M$, and returns two first-order directional derivatives, $J \cdot u_1$ and $J \cdot u_2$, and one second-order directional derivative, $u_1^{T} \cdot H \cdot u_2 + J \cdot v$, where $H$ is the Hessian. There is clearly some interesting structure here, where the image of the second-order jets seems to decompose into $T_p M \otimes T_p M \otimes T^{2}_p M$, but my math isn't good enough to work out the general theory or find the right references. Does anyone know what kind of algebraic structure is at work here, either for the jets or the action of the jets, or can anyone suggest appropriate references?
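As a rough sketch of the second-order action described above, the following plain-Python snippet implements the map $(u_1, u_2, v) \mapsto (J u_1, J u_2, u_1^T H u_2 + J v)$ in coordinates and checks that applying it map by map agrees with applying it once using the chain-rule Jacobian and Hessian of the composite. The function names `push2` and `compose2` and all numeric arrays are made up for illustration; for vector-valued maps the Hessian is stored as `H[k][i][j]`, one matrix per output component.

```python
def push2(J, H, u1, u2, v):
    """Second-order push-forward in coordinates.

    J[k][i] holds first partials and H[k][i][j] second partials of a map
    R^n -> R^m. Returns (J u1, J u2, u1^T H u2 + J v).
    """
    m, n = len(J), len(J[0])
    Ju1 = [sum(J[k][i] * u1[i] for i in range(n)) for k in range(m)]
    Ju2 = [sum(J[k][i] * u2[i] for i in range(n)) for k in range(m)]
    second = [
        sum(H[k][i][j] * u1[i] * u2[j] for i in range(n) for j in range(n))
        + sum(J[k][i] * v[i] for i in range(n))
        for k in range(m)
    ]
    return Ju1, Ju2, second

def compose2(Jo, Ho, Ji, Hi):
    """Jacobian and Hessian of outer∘inner via the chain and product rules.

    (Jo, Ho) are the derivatives of the outer map at inner(p); (Ji, Hi) are
    the derivatives of the inner map at p.
    """
    p, m, n = len(Jo), len(Ji), len(Ji[0])
    J = [[sum(Jo[a][k] * Ji[k][i] for k in range(m)) for i in range(n)]
         for a in range(p)]
    H = [[[sum(Jo[a][k] * Hi[k][i][j] for k in range(m))
           + sum(Ho[a][k][l] * Ji[k][i] * Ji[l][j]
                 for k in range(m) for l in range(m))
           for j in range(n)] for i in range(n)] for a in range(p)]
    return J, H

# Made-up integer derivative data: inner map R^2 -> R^2, outer map R^2 -> R.
Ji = [[1, 2], [3, 4]]
Hi = [[[1, 0], [0, 2]], [[0, 1], [1, 0]]]
Jo = [[2, -1]]
Ho = [[[1, 1], [1, 3]]]

u1, u2, v = [1, 0], [0, 1], [1, 1]

# Route 1: push through the inner map, then through the outer map.
stepwise = push2(Jo, Ho, *push2(Ji, Hi, u1, u2, v))

# Route 2: build the composite derivatives first, then push once.
direct = push2(*compose2(Jo, Ho, Ji, Hi), u1, u2, v)
```

Both routes give the same triple, which suggests that the triple $(u_1, u_2, v)$ transforms functorially under composition even though the Jacobian and Hessian themselves mix.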

The immediate follow-up is then how this structure admits adjoints. If the above gives higher-order forward-mode automatic differentiation, how would we take the adjoint of various subspaces of jets (or jet tangent products) to enable higher-order reverse-mode automatic differentiation?

Using higher-order dual numbers I worked out the second- and third-order behavior heuristically in Chapter 1 of https://github.com/stan-dev/nomad/tree/master/manual (again please excuse the mathematical sloppiness or poor notation) but I would love to have a better geometric/algebraic understanding of what's going on.

Thanks!

1 Answer


After fixing a coordinate system, $k$-jets can be identified with (tuples of) polynomials of degree at most $k$ - the $k$-jet of $f$ is represented by the $k$-th order Taylor polynomial of $f$. The composition of $k$-jets is simply the composition of the corresponding polynomials mod $(X^i)^{k+1},$ i.e. $(P \circ Q)(X)$ is just $P(Q(X))$ truncated to order $|X|^k.$

As an example, suppose we want to differentiate a composition $M \overset{f}\to M\overset \phi \to\mathbb R$ at a fixed point of $f$, working in coordinates centered at that point. The chain and product rules tell us we should end up with \begin{align} \partial_i(\phi \circ f) &=\partial_k \phi\,\partial_i f^k, \\ \partial_i \partial_j(\phi \circ f)&=\partial_k \phi\,\partial_i\partial_jf^k+ \partial_k \partial_l \phi\,\partial_if ^k\,\partial_jf^l. \end{align} (I'm using the Einstein summation convention.) The second-order Taylor polynomials of these two maps give us the coordinate representatives of their 2-jets: \begin{align}P(X)=j^2\phi(X) &= \phi_0 + \partial_k \phi \, X^k+\frac 1 2 \partial_k \partial_l \phi\,X^kX^l,\\ Q(X)=j^2 f(X)^k &= \partial_i f^k\,X^i+\frac 1 2 \partial_i \partial_j f^k\,X^iX^j. \end{align}

Now we can just substitute $Q$ into $P$, yielding \begin{multline}P(Q(X)) = \phi_0 + \partial_k \phi\,(\partial_i f^k X^i+\frac 1 2 \partial_i \partial_j f^k X^i X^j)\\ + \frac 1 2\partial_k \partial_l \phi\,(\partial_i f^k X^i+\frac 1 2 \partial_i \partial_j f^k X^i X^j)(\partial_a f^l X^a+\frac 1 2 \partial_a \partial_b f^l X^a X^b). \end{multline} Expanding, truncating to order $|X|^2$ and collecting coefficients of $X^i$ we find $$ j^2(\phi \circ f)(X)=\phi_0 + \partial_k \phi\, \partial_i f^k\,X^i + \frac12(\partial_k \phi\,\partial_i\partial_jf^k + \partial_k\partial_l\phi\,\partial_i f^k \partial_j f^l)X^iX^j.$$ Reading off the coefficients of this Taylor polynomial we see that we have arrived at the correct answer.
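The substitute-and-truncate recipe is easy to check numerically. Below is a minimal one-variable sketch in plain Python, assuming polynomials are stored as coefficient lists `[c0, c1, ..., ck]` and that the inner polynomial has zero constant term (the fixed-point, centered-coordinates situation above). The helper names `mul_trunc` and `compose_trunc` and the numeric derivative values are made up for illustration.

```python
def mul_trunc(a, b, k):
    # Product of two coefficient lists, discarding all terms above X^k.
    out = [0] * (k + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= k:
                out[i + j] += ai * bj
    return out

def compose_trunc(P, Q, k):
    # P(Q(X)) mod X^(k+1), evaluated by Horner's scheme.
    # Assumes Q has zero constant term, i.e. Q fixes the origin.
    assert Q[0] == 0
    result = [0] * (k + 1)
    for c in reversed(P):
        result = mul_trunc(result, Q, k)
        result[0] += c
    return result

# Made-up derivative data at the fixed point.
phi0, dphi, d2phi = 5, 3, 4   # phi, phi', phi''
df, d2f = 2, 6                # f', f''

# Second-order Taylor polynomials (note the 1/2 on the quadratic terms).
P = [phi0, dphi, d2phi // 2]
Q = [0, df, d2f // 2]

composite = compose_trunc(P, Q, 2)

# Read the derivatives of phi∘f back off the Taylor coefficients.
first = composite[1]
second = 2 * composite[2]
```

With these numbers `first` equals `dphi * df` and `second` equals `dphi * d2f + d2phi * df**2`, matching the one-variable case of the chain-rule formulas above.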

This "truncated polynomial algebra" is a little weird - there's no easy law to expand the composition $P\circ(Q+R),$ so it's not a ring or anything quite so nice. (Indeed, in the context of general manifolds, addition/multiplication of these polynomials does not translate to any meaningful operation on jets.) I've only ever seen it studied in the context of jets/Taylor polynomials. Lots of people care about jets, though, so there is no shortage of reading material! For a comprehensive reference, see section 12 (and maybe 13) of the freely available book Natural Operations in Differential Geometry.

You're right that we can represent any jet space by tensors: for example we have \begin{align}J^2_p(M,N)_q &\simeq L(T_pM,T_qN) \oplus L(\mathrm{Sym}^2(T_p M),T_qN) \\ &\subset (T_p M^* \otimes T_q N)\oplus(T_pM^* \otimes T_pM^* \otimes T_qN).\end{align} Of course, this identification obscures the composition structure and is coordinate-dependent, so while it's certainly useful to know, it's often more productive to be thinking about polynomials. (The polynomial representation is also coordinate-dependent, of course; but you can get a precise understanding of how they transform using jet groups.)

Regarding adjoints, I'm unsure exactly what you want - perhaps if you explain what you're looking for more thoroughly I could point you in the right direction. The relationship between velocities and covelocities is not so simple at higher order - when $k>1$ (and $\dim M>1$) the spaces $J^k(\mathbb R,M)$ and $J^k(M, \mathbb R)$ do not even have the same dimension, so there's no direct analog of the isomorphism $$J^1_0(\mathbb R, M)=TM \simeq T^*M = J^1(M,\mathbb R)_0.$$
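The dimension mismatch can be seen with a quick count. The sketch below assumes $\dim M = n$ and compares fibers with both source and target points fixed: a $k$-velocity is a curve's first $k$ derivatives, each in $\mathbb R^n$, while a $k$-covelocity is a polynomial of degree at most $k$ in $n$ variables with no constant term. The function names are made up for illustration.

```python
from math import comb

def dim_velocities(n, k):
    # k-jets at 0 of curves R -> M through a fixed point:
    # k derivatives, each with n components.
    return n * k

def dim_covelocities(n, k):
    # k-jets at a fixed point of functions M -> R with fixed value:
    # monomials of degree 1..k in n variables.
    return comb(n + k, k) - 1

# (velocity dimension, covelocity dimension) for dim M = 3 and k = 1, 2, 3.
table = [(dim_velocities(3, k), dim_covelocities(3, k)) for k in (1, 2, 3)]
```

The two counts agree at first order and then diverge. They also agree for every $k$ when $n = 1$, so the mismatch only appears for $\dim M > 1$.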

  • I'm with you all through the kth-order jet spaces having the algebraic structure of $\mathbb R[z] / (z^{k + 1})$. This is also how I worked out the second- and third-order results heuristically, using higher-order dual numbers to implement the truncation. But where did you get the identification of $J^{2}_p(M, N)$ with tensors? I have to admit that I did attempt to get through Kolar et al last year with limited success -- any pointers to relevant material are hugely appreciated! – Michael Betancourt Jan 29 '18 at 04:58
  • Regarding adjoints, I know we lose the isomorphism at higher orders but there seems to be some notion of invariant subspaces. For example, if you hit a higher-order jet with the right combination of elements of $T_p M$, $T^{2}_p M$, etc., then you get linear maps that can be transposed to define operations on the various order cotangent spaces. – Michael Betancourt Jan 29 '18 at 05:04