
In univariate optimization, we use the first derivative test to identify stationary points and the second derivative test to classify them as minima or maxima (or leave them unclassified when the test is inconclusive). When both $f'(x)$ and $f''(x)$ are zero, we turn to the higher-order derivative test to determine the nature of the stationary point. In multivariate optimization, however, we don't go beyond the second derivative (Hessian) test.
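For concreteness, here is a small sympy sketch of the univariate procedure I mean (my own toy example; $x^4$ is just a stand-in for a function whose lower derivatives vanish at the stationary point):

```python
import sympy as sp

x = sp.symbols('x')
f = x**4          # f'(0) = f''(0) = f'''(0) = 0, but f''''(0) = 24 > 0

# Walk up the derivative orders until one is nonzero at the stationary point
k, point = 1, 0
while sp.diff(f, x, k).subs(x, point) == 0:
    k += 1
value = sp.diff(f, x, k).subs(x, point)
print(k, value)   # 4 24 -- even order, positive value: local minimum
```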

Why?

1 Answer


The key to this is simply Taylor's formula.

Assume, for simplicity, that $f:\mathbb{R}^n\rightarrow \mathbb{R}$ is smooth ($C^\infty$).

The kind of reasoning you are referring to is based on the fact (which you will find in one form or another in any textbook on real analysis) that, for any $k\in \mathbb{N}$ and any vector $h\in \mathbb{R}^n$, $$f(x+h) = \sum\limits_{|\alpha|\le k}\frac{D^\alpha f(x)}{\alpha !}h^\alpha + \sum\limits_{|\alpha| = k+1}\frac{D^\alpha f(x+\theta h)}{\alpha !}h^\alpha $$ Here, $\alpha =(\alpha_1,\ldots, \alpha_n)$ is a multi-index, $|\alpha| = \sum_i \alpha_i$, $\,\,\, \alpha ! = \alpha_1! \cdots \alpha_n!$, $\,\,\, h^\alpha = h_1^{\alpha_1}\cdots h_n^{\alpha_n}$,
$$D^\alpha f(x) = \frac{\partial^{|\alpha|}f}{\partial x_1^{\alpha_1}\cdots\partial x_n^{\alpha_n}}(x)$$

and $\theta$ is some number in $(0,1)$. (As in the one-dimensional case, there are several different ways to express the remainder.) If the notation is strange to you or the formula unfamiliar, I suggest you look it up in a textbook. If you know higher-dimensional calculus and the Taylor formula in one dimension, it is actually rather straightforward to prove.
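To make the notation concrete, here is a small sympy sketch (my own illustration; the test function $e^{x_1}\sin x_2$ is arbitrary) that assembles the degree-$k$ Taylor polynomial from the multi-index sum for $n = 2$:

```python
import sympy as sp
from itertools import product
from math import factorial

x1, x2, h1, h2 = sp.symbols('x1 x2 h1 h2')
f = sp.exp(x1) * sp.sin(x2)        # an arbitrary smooth test function
point = {x1: 0, x2: 0}
k = 3

# Sum over all multi-indices alpha = (a1, a2) with |alpha| <= k
taylor = sum(
    sp.diff(f, x1, a1, x2, a2).subs(point)
    / (factorial(a1) * factorial(a2)) * h1**a1 * h2**a2
    for a1, a2 in product(range(k + 1), repeat=2)
    if a1 + a2 <= k
)
print(sp.expand(taylor))  # h2 + h1*h2 + h1**2*h2/2 - h2**3/6
```

The printed polynomial agrees with the familiar one-dimensional series of $e^{h_1}$ and $\sin h_2$ multiplied out and truncated at degree $3$.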

Now the general reasoning is straightforward in principle: assume that all derivatives of $f$ of order less than $k$ vanish at $x$, and that $k$ is the first order for which some derivative does not vanish.

Then, by the formula, the behavior of $f$ near $x$ is, at least in some directions, governed by the $k$-th order derivatives.
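A minimal sketch of this, using the classical "monkey saddle" $f(x,y) = x^3 - 3xy^2$ as my own example: all first- and second-order derivatives vanish at the origin, so the first nonvanishing order is $3$.

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**3 - 3*x*y**2          # the "monkey saddle"
point = {x: 0, y: 0}

# Find the first order k at which some partial derivative is nonzero at 0
for k in range(1, 6):
    derivs = [sp.diff(f, x, a, y, k - a).subs(point) for a in range(k + 1)]
    if any(d != 0 for d in derivs):
        print(k, derivs)     # 3 [0, -6, 0, 6] -- cubic terms govern f near 0
        break
```

Since the cubic form changes sign ($f(t,0) = t^3$), the origin is neither a minimum nor a maximum here.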

One of the problems here is the little restriction 'in some directions'. It may well be (in the simplest case) that some directional derivative of order $k$ is nonzero while some other directional derivative of that order vanishes -- then the behavior in that direction is governed by even higher-order derivatives. And this does not even take into account the combinatorial problems which arise from combinations of directions.
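A toy example of exactly this (mine, not from the question): for $f(x,y) = x^2 + y^4$ the second-order derivatives decide the behavior along $e_1$, but along $e_2$ everything up to order three vanishes and one must consult order four.

```python
import sympy as sp

x, y, t = sp.symbols('x y t')
f = x**2 + y**4   # order 2 decides along e1, order 4 along e2

# Restrict f to a line through 0 and inspect the one-dimensional derivatives
for d, name in [((1, 0), 'e1'), ((0, 1), 'e2')]:
    g = f.subs({x: t * d[0], y: t * d[1]})
    print(name, [sp.diff(g, t, k).subs(t, 0) for k in range(1, 5)])
# e1 [0, 2, 0, 0]   -> decided at order 2
# e2 [0, 0, 0, 24]  -> only decided at order 4
```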

So, in principle, yes, you can say something about extreme points by using the same approach as in one dimension, but the resulting formulae are too complicated to state as a general result. Of course you could say that, if all derivatives of order less than $k$ vanish at $x$ and if, say, $$ \sum\limits_{|\alpha|= k}\frac{D^\alpha f(x)}{\alpha !}h^\alpha \ge |h|^{k} $$ for each small nonzero $h$, then you have a minimum. But this is a situation you will rarely encounter, and if you do, you can just refer to Taylor directly.
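For completeness, a sketch checking that condition on my own example $f(x,y) = x^4 + y^4$ (the bound holds with the constant $\tfrac12$ in place of $1$, which is just as good after rescaling $f$):

```python
import sympy as sp
from itertools import product
from math import factorial

x, y, h1, h2, s = sp.symbols('x y h1 h2 s', real=True)
f = x**4 + y**4
point = {x: 0, y: 0}

# All derivatives of order 1..3 vanish at the origin
lower = {sp.diff(f, x, a, y, b).subs(point)
         for a, b in product(range(4), repeat=2) if 0 < a + b < 4}
print(lower)                 # {0}

# The order-4 Taylor form sum_{|alpha|=4} D^alpha f(0)/alpha! h^alpha
form = sum(sp.diff(f, x, a, y, 4 - a).subs(point)
           / (factorial(a) * factorial(4 - a)) * h1**a * h2**(4 - a)
           for a in range(5))
print(sp.expand(form))       # h1**4 + h2**4

# On the unit circle the form equals cos(4s)/4 + 3/4 >= 1/2; by homogeneity
# form >= |h|**4 / 2 for every h, so the origin is a local minimum.
print(sp.simplify(form.subs({h1: sp.cos(s), h2: sp.sin(s)})))
```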

Thomas
  • Okay, I did not understand all of it, but I get the point you are trying to make. Thank you for taking the time to answer. – Prometheus Sep 11 '14 at 20:30