
I'm trying to work out the derivatives in Taylor's theorem for a specific case, using the general multi-index framework. Wikipedia actually covers my case, but I think I'm being sloppy with the dimensions and I can't see how to write it out.

We have Taylor's theorem for $f: \mathbb{R}^n \rightarrow \mathbb{R}$, in multi-index notation: $$f(\mathbf{x}) = \sum_{|\alpha| \le k} \frac{D^{\alpha}f(\mathbf{a})}{\alpha !}(\mathbf{x} - \mathbf{a})^{\alpha} + \sum_{|\alpha| = k} h_{\alpha}(\mathbf{x})(\mathbf{x} - \mathbf{a})^{\alpha}.$$
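
To keep track of the dimensions, here is how I read that sum for $n = 2$, $k = 2$ (writing $\Delta x = x - a_1$, $\Delta y = y - a_2$; I'm hoping I'm unpacking the multi-index notation correctly): $$\sum_{|\alpha| \le 2} \frac{D^{\alpha}f(\mathbf{a})}{\alpha !}(\mathbf{x} - \mathbf{a})^{\alpha} = f(\mathbf{a}) + f_x(\mathbf{a})\,\Delta x + f_y(\mathbf{a})\,\Delta y + \tfrac{1}{2}f_{xx}(\mathbf{a})\,\Delta x^2 + f_{xy}(\mathbf{a})\,\Delta x\,\Delta y + \tfrac{1}{2}f_{yy}(\mathbf{a})\,\Delta y^2.$$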

Here's my specific case: $g: \mathbb{R}^2 \rightarrow \mathbb{R}$, $g(x, y) = z$. Then we have $Dg(\mathbf{x}) = [g_x \ g_y]$. But isn't $Dg$ also just a function $Dg: \mathbb{R}^2 \rightarrow \mathbb{R}$? And wouldn't that make $D(Dg): \mathbb{R}^2 \rightarrow \mathbb{R}$ as well, with $D(Dg)(\mathbf{x}) = [g_{xx} \ g_{yy}]$? I don't see how we get the Hessian ($\in \mathbb{R}^{2 \times 2}$) here.
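
To make the dimension-counting concrete, here is what I get for a made-up example, $g(x, y) = x^2 y$ (chosen only to check shapes): $$Dg(\mathbf{x}) = \begin{bmatrix} g_x & g_y \end{bmatrix} = \begin{bmatrix} 2xy & x^2 \end{bmatrix}, \qquad g_{xx} = 2y, \quad g_{xy} = g_{yx} = 2x, \quad g_{yy} = 0.$$ Each entry of $Dg$ is itself a function $\mathbb{R}^2 \rightarrow \mathbb{R}$, so differentiating again seems to give me four second partials, and I don't see which array ($1 \times 2$? $2 \times 2$?) they are supposed to be collected into.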

ChatGPT is telling me to use this "quadratic form": $$g(\mathbf{x}) \approx g(\mathbf{x_0}) + \nabla g(\mathbf{x_0})^T(\mathbf{x} - \mathbf{x_0}) + \frac{1}{2}(\mathbf{x} - \mathbf{x_0})^T H_g(\mathbf{x_0})(\mathbf{x} - \mathbf{x_0})$$
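
Expanding that quadratic term entrywise (assuming the mixed partials agree, $g_{xy} = g_{yx}$), I get $$\frac{1}{2}(\mathbf{x} - \mathbf{x_0})^T H_g(\mathbf{x_0})(\mathbf{x} - \mathbf{x_0}) = \frac{1}{2}\left(g_{xx}\,\Delta x^2 + 2g_{xy}\,\Delta x\,\Delta y + g_{yy}\,\Delta y^2\right), \qquad \begin{bmatrix}\Delta x \\ \Delta y\end{bmatrix} = \mathbf{x} - \mathbf{x_0},$$ with the partials evaluated at $\mathbf{x_0}$. This matches the second-order terms of the multi-index sum above (taking $\mathbf{a} = \mathbf{x_0}$), so the two forms at least agree numerically; my question is really about where the $2 \times 2$ arrangement comes from.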

Is there a relation here to the Hessian being the transpose of the Jacobian of the gradient? Also, since the quadratic form is a scalar, it seems we could transpose it anyway and get the Jacobian of the gradient in the middle instead.

Why do we use the Jacobian (total derivative) of the gradient (or, for the Hessian, the transpose of the Jacobian of the gradient), rather than the Jacobian of the gradient's transpose (i.e., of the first total derivative)? Would that even make sense? I couldn't find answers online that tie all of this together. (I saw some discussion of notational differences and matrix conventions, but it felt unsatisfying and I think I'm still missing something.) All in all, I feel like I'm missing some basic concepts here, and maybe also some more advanced generalization that unifies them.
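
Concretely, the shapes I think I'm comparing are (using the column-vector convention, if I have it right): $$\nabla g(\mathbf{x}) = \begin{bmatrix} g_x \\ g_y \end{bmatrix} \in \mathbb{R}^{2 \times 1}, \qquad Dg(\mathbf{x}) = \nabla g(\mathbf{x})^T = \begin{bmatrix} g_x & g_y \end{bmatrix} \in \mathbb{R}^{1 \times 2}.$$ Taking the Jacobian of the column $\nabla g$ would give a $2 \times 2$ matrix, but I don't see what shape the Jacobian of the row $Dg = (\nabla g)^T$ is even supposed to have, or whether it's a sensible object at all.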

(Aside: Is there a generalization of Taylor's theorem to functions $g: \mathbb{R}^n \rightarrow \mathbb{R}^m$?)
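
The first-order version I'd expect (if I'm reading the vector-valued case correctly) is $$g(\mathbf{x}) \approx g(\mathbf{x_0}) + J_g(\mathbf{x_0})(\mathbf{x} - \mathbf{x_0}), \qquad J_g(\mathbf{x_0}) \in \mathbb{R}^{m \times n},$$ but I don't see what the higher-order terms should look like.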

Thanks.

rudinable
    Don't use ChatGPT, take a look at any real analysis book – Sine of the Time Jan 28 '25 at 22:24
  • Yeah I've taken a look at Rudin but in my analysis class we only did the first part (Not the later general differentiation section). I thought it might just be helpful to include for that point about the Hessian. – rudinable Jan 28 '25 at 22:36
  • I.e. we didn't do chapter 9 onwards. I'm hoping there might be a way to clarify a few of these things without diving back into Rudin (Which would take a little bit more time than I have at the moment). – rudinable Jan 28 '25 at 22:44
  • "isn't this also just a function $Dg: \mathbb{R}^2 \rightarrow \mathbb{R}$?" No: $Dg:\Bbb R^2\to L(\Bbb R^2,\Bbb R)\cong\Bbb R^2$. – Anne Bauval Jan 28 '25 at 23:03
  • yes, there’s a very natural generalization of Taylor’s theorem to arbitrary Banach spaces, see here and all the sublinks. – peek-a-boo Jan 28 '25 at 23:18
  • You could start by exploring that $(\mathbf{x} - \mathbf{a})^{\alpha}$ is a symmetric tensor product if $\mathbf{x}$ is a vector. If you are past that milestone, the other questions may become simpler or trivial. – Lutz Lehmann Jan 28 '25 at 23:23

0 Answers