
What kind of object is the (total) second derivative of a multivariate function? Is it best seen intuitively as a multilinear map, or as the algebraic form associated with such a map?

More specifically, if we have a scalar function of two variables $f(x,y)$, we might associate its second derivative with the matrix

$$\mathbf H = \begin{bmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} \\ \frac{\partial^2 f}{\partial x \partial y} & \frac{\partial^2 f}{\partial y^2} \end{bmatrix}.$$

We can either interpret this as representing the bilinear map $B(\mathbf u,\mathbf v)$ given by $B(\mathbf u,\mathbf v) = \mathbf u^\top \mathbf H \mathbf v$, or we can focus more on the quadratic form $Q(\mathbf u) = B(\mathbf u,\mathbf u)$.
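As a concrete numerical sketch (the function $f(x,y)=x^3+xy^2$ and the point $(1,2)$ are my own illustrative choices, not part of the question), here are both interpretations side by side, with the Hessian entries computed by hand:

```python
import numpy as np

# Example function (an assumption for illustration): f(x, y) = x**3 + x*y**2
# Second partials: f_xx = 6x, f_xy = f_yx = 2y, f_yy = 2x.
def hessian(x, y):
    return np.array([[6*x, 2*y],
                     [2*y, 2*x]], dtype=float)

H = hessian(1.0, 2.0)          # H = [[6, 4], [4, 2]]

def B(u, v):                   # bilinear map B(u, v) = u^T H v
    return u @ H @ v

def Q(u):                      # quadratic form Q(u) = B(u, u)
    return B(u, u)

u = np.array([1.0, 0.0])
v = np.array([0.0, 1.0])
print(B(u, v))                 # 4.0 -- the mixed partial f_xy at (1, 2)
print(Q(u))                    # 6.0 -- the pure partial f_xx at (1, 2)
```

Note that feeding $B$ two *different* standard basis vectors picks out a mixed partial, while $Q$ only ever sees repeated inputs.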

The Wikipedia entry on total derivatives in higher dimensions seems to favor the view that the $k$th derivative is a $k$-linear map. But how then do we interpret the vector inputs (in this case, two of them)? It makes more sense to me to think about the Taylor expansion $$f\left(x+\Delta x,\, y+\Delta y\right) \approx f(x,y) + \frac{\partial f}{\partial x}\Delta x+\frac{\partial f}{\partial y}\Delta y+\frac{\partial^2 f}{\partial x^2}\frac{\Delta x^2}{2!}+\frac{\partial^2 f}{\partial x\partial y}\Delta x\Delta y+\frac{\partial^2 f}{\partial y^2}\frac{\Delta y^2}{2!},$$

and to see the second derivative as the quadratic form at the end (the last three terms), which takes only a single vector input. This generalizes the one-dimensional case nicely: there, the second derivative can be seen as giving the closest second-order approximation to a function when we perturb its input by a small number (here, a small vector).
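To make the "closest second-order approximation" reading concrete, here is a quick numerical check of the Taylor expansion above (the function $f(x,y)=x^3+xy^2$, expansion point $(1,2)$, and step sizes are illustrative assumptions, not from the question):

```python
import numpy as np

# Illustrative example: f(x, y) = x**3 + x*y**2, expanded around (1, 2).
def f(x, y):
    return x**3 + x*y**2

x0, y0 = 1.0, 2.0
fx, fy = 3*x0**2 + y0**2, 2*x0*y0       # first partials: 7, 4
fxx, fxy, fyy = 6*x0, 2*y0, 2*x0        # second partials: 6, 4, 2

def taylor2(dx, dy):
    # second-order Taylor polynomial, matching the expansion in the question
    return (f(x0, y0) + fx*dx + fy*dy
            + fxx*dx**2/2 + fxy*dx*dy + fyy*dy**2/2)

dx, dy = 0.01, -0.02
exact = f(x0 + dx, y0 + dy)
approx = taylor2(dx, dy)
print(abs(exact - approx))   # 5e-06: the third-order remainder
```

Since this $f$ is a cubic, the error is exactly the third-order remainder term, which is cubic in the perturbation; halving $(\Delta x, \Delta y)$ shrinks it by a factor of 8.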

Is the bilinear map (multilinear map) of any use apart from the quadratic form (algebraic form) it gives rise to? The Wikipedia page says the $k$th derivative is

the "best" (in a certain precise sense) $k$-linear approximation to $f$ at that point.

What does this mean intuitively? (Please aim answers at the level of a multivariate calculus/linear algebra student.) Are we now perturbing the input by multiple different vectors simultaneously? Or by a multivector of some kind?

On a related note, what about the first (total) derivative of a multivariate function? Is it a linear map or a linear form? I'm aware that for a function like $f:\mathbb R^2\to\mathbb R$ they amount to the same thing, but linear forms, quadratic forms, cubic forms, etc., make sense to me as the best first-, second-, third-order, etc. approximations to the function. On the other hand, I don't know what a bilinear, trilinear, etc. map is supposed to mean intuitively.

  • This should hopefully answer some of the questions. btw the difference between linear map vs linear form isn’t that significant (linear forms are just linear maps whose target space is the underlying field… $\Bbb{R}$ in this case). – peek-a-boo Nov 05 '22 at 01:16
  • @peek-a-boo Thank you for the comment. I read your post but I feel I already understand what is written there. I am interested in how to interpret the multilinear maps. You say the second derivative eats a pair of vectors and spits out a number (in the case of a scalar function) - but what do these vectors mean and what does the spat out number mean? In practice it seems like these maps are only ever used with the same input vector repeated, what you write as $(h)^j$. Are the maps ever actually given different input vectors? – chaturanga Nov 05 '22 at 14:53
  • If $f:A\subset V\to W$, then $Df$ is a map $A\to\text{Hom}(V,W)$. So, each time you apply $D$, the target space gets larger. The second derivative is defined recursively, $D^2f:=D(Df):A\to \text{Hom}(V,\text{Hom}(V,W))$. So, $D^2f_a(h_1)\in \text{Hom}(V,W)$ is a linear map which approximates to linear order, the difference $Df_{a+h_1}-Df_a$. If you evaluate $D^2f_a(h_1)$ on another vector $h_2$, then this approximates $Df_{a+h_1}(h_2)-Df_a(h_2)$. And so on. Roughly speaking, you're approximating differences of differences. – peek-a-boo Nov 06 '22 at 02:27
  • For example, for $f:\Bbb{R}^n\to\Bbb{R}$, you can show (after using a canonical isomorphism to identify $D^2f_a\in \text{Hom}(\Bbb{R}^n,\text{Hom}(\Bbb{R}^n,\Bbb{R}))$ with a bilinear map $D^2f_a\in\text{Hom}^2(\Bbb{R}^n;\Bbb{R})$) that $D^2f_a(e_i)(e_j)\equiv D^2f_a(e_i,e_j)=\frac{\partial^2f}{\partial x^i\partial x^j}(a)$ is the mixed $i,j$ partial. Similarly for higher derivatives, $D^kf_a(e_{i_1},\dots, e_{i_k})=\frac{\partial^kf}{\partial x^{i_1}\dots\partial x^{i_k}}(a)$. In fact $D^kf_a$ is a symmetric multilinear map, so the ordering isn't important. – peek-a-boo Nov 06 '22 at 02:31
  • For the purposes of Taylor expansion, we only care about the same input vectors, but for other purposes, plugging in different inputs gives you other useful information (e.g the various mixed partials, as mentioned above). – peek-a-boo Nov 06 '22 at 02:32
  • @peek-a-boo Thank you!! That is exactly the intuition I was looking for! So it's like mixed second partial derivatives but with arbitrary vector changes. If you want to write up an answer I'll accept – chaturanga Nov 06 '22 at 18:41
  • Glad that was helpful; you should write up your own answer to see if you fully understood it. – peek-a-boo Nov 06 '22 at 20:36
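A quick numerical sketch of the "differences of differences" idea from the comments above, with two *different* input vectors (the example function, base point, and step sizes are my own assumptions, not from the thread): $D^2f_a(h_1, h_2)$ should match $Df_{a+h_1}(h_2)-Df_a(h_2)$ up to an error quadratic in $|h_1|$.

```python
import numpy as np

# Illustrative example: f(x, y) = x**3 + x*y**2.
# Df_a(h) = grad f(a) . h, and D^2 f_a(h1, h2) = h1^T H(a) h2.
def grad(a):
    x, y = a
    return np.array([3*x**2 + y**2, 2*x*y])

def hess(a):
    x, y = a
    return np.array([[6*x, 2*y],
                     [2*y, 2*x]])

a  = np.array([1.0, 2.0])
h1 = np.array([1e-4, -2e-4])    # small first perturbation
h2 = np.array([3.0, 1.0])       # second direction; need not be small

# "difference of derivatives" vs the bilinear second derivative
diff_of_derivs = grad(a + h1) @ h2 - grad(a) @ h2
bilinear_value = h1 @ hess(a) @ h2
print(abs(diff_of_derivs - bilinear_value))   # small: O(|h1|^2)
```

Shrinking `h1` by a factor of 10 shrinks the discrepancy by roughly a factor of 100, which is the "approximates to linear order" claim in the comments.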
