Is a path which decreases a function in the quickest way a gradient flow?

Question

Let $U \subseteq \mathbb{R}^n$ be open, and let $F:U \to \mathbb{R}$ be a smooth function. Fix a point $p \in U$, and suppose that $\nabla F(p) \neq 0$.

Let $\alpha(t)$ be a $C^{\infty}$ path starting at $p$. Suppose that $\alpha$ "beats" all $C^{\infty}$ paths starting at time $t=0$ for a short time in the following sense: For every $C^{\infty}$ path $\beta(t)$ starting at $p$ that satisfies $\|\dot \beta(t)\|=\|\dot \alpha(t)\|$, we have $F(\alpha(t)) \le F(\beta(t))$ for sufficiently small $t>0$.

(The "sufficiently small" here might depend on the path $\beta$).

Question: Must $\alpha$ be a reparametrization of the negative gradient flow of $F$, i.e. $$ \alpha(0)=p, \, \, \dot \alpha(t)=c(t)\cdot \big(-\nabla F(\alpha(t))\big) \,\, \text{where } c(t)>0 \,\,?$$

It is not hard to see that we must have $\dot \alpha(0)=-\nabla F(p)$ (up to a positive rescaling).
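
A sketch of why, assuming $\dot\alpha(0)\neq 0$ (so that $\|\dot\alpha(t)\|$ is smooth near $t=0$): put $u=-\nabla F(p)/\|\nabla F(p)\|$ and compare $\alpha$ with the straight path $\beta(t)=p+u\int_0^t\|\dot\alpha(s)\|\,ds,$ which has the same speed as $\alpha.$ Expanding to first order,

$$\begin{aligned} F(\alpha(t))&=F(p)+t\,\nabla F(p)\cdot\dot\alpha(0)+O(t^2),\\ F(\beta(t))&=F(p)-t\,\|\nabla F(p)\|\,\|\dot\alpha(0)\|+O(t^2). \end{aligned}$$

So $F(\alpha(t))\le F(\beta(t))$ for small $t>0$ forces $\nabla F(p)\cdot\dot\alpha(0)\le-\|\nabla F(p)\|\,\|\dot\alpha(0)\|,$ and by Cauchy-Schwarz this can hold only when $\dot\alpha(0)$ is a positive multiple of $-\nabla F(p).$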

If we could show that $\alpha$ locally "beats" all $C^{\infty}$ paths starting at $\alpha(t)$ for sufficiently small $t>0$, then the same logic would imply the required claim.

I don't know how to "propagate" this optimality criterion from time $t=0$ to a time $t>0$.

Here is my naive attempt:

Assume for contradiction that $\alpha$ does not beat all paths on any interval $[0,\epsilon)$. Then there exists a decreasing sequence $t_n \to 0$ which demonstrates the non-optimality of $\alpha|_{[t_n,\cdot)}$ as a path starting at $\alpha(t_n)$. This means that there exist smooth paths $\beta_n:[t_n,\cdot) \to U$ with $\beta_n(t_n)=\alpha(t_n)$, and times $s_n>t_n$ with $s_n-t_n$ arbitrarily small, such that $F(\alpha(s_n)) > F(\beta_n(s_n))$.

Now, I guess I should somehow take the limit of the $\beta_n$ or "glue" them together to obtain a path which starts at $p=\alpha(0)$, and that beats $\alpha$.

I am not sure how to do that.

If $\alpha(t)$ is optimal for all $t < \tau$, then the optimal path starting at say $\alpha(\tau/2)$ still has to be $\alpha(t+\tau/2)$ because otherwise $\alpha$ would not be optimal until $\tau$. — quarague, Oct 22 '19 at 08:19

Dap · Answer 1 · 2019-10-30T19:42:08.290

The method of proof will be to consider a displaced path $\alpha+\delta$ where $\delta$ is a suitable normal vector field, i.e. $\dot\alpha\cdot\delta=0.$

$\nabla F$ may have zeros, so the conclusion will be that $\dot\alpha(t)$ is of the form $-c(t)\nabla F(\alpha(t))$ for sufficiently small $t$ whenever $\nabla F(\alpha(t))\neq 0.$

Reparameterize $\alpha$ by arc length so $\|\dot\alpha\|=1.$

I'll use $n$ as an index, not the dimension of the space $\alpha(t)$ lives in.

Step 1: for small $t,$ either $F(\alpha(t))$ is constant or $\alpha$ must be a straight line

Suppose not for contradiction.

For a certain decreasing sequence $t_n\to 0,$ we will try to slightly shorten $\alpha$ for $t\in (t_{n+1},t_n).$ This will depend on curvature, a bit like curve-shortening flow.

I will deal with the case that $F\circ \alpha$ is not locally non-decreasing at zero i.e. for every $\tau>0$ there is $0<t<\tau$ such that $F(\alpha(t))<F(p).$ In this case we certainly have $F(\alpha(t))=F(p)-\mu$ for some $t>0$ and some $\mu>0.$ By looking at the first time $F(\alpha(t))$ hits $F(p)-\mu/n$ we get a strictly decreasing sequence $t_n\to 0$ such that $F(\alpha(t))>F(\alpha(t_n))$ for all $n$ and $0\leq t<t_n.$ For the case where $F\circ\alpha$ is locally non-decreasing at zero, I think you can run the following argument with reversed signs. (I can try to give more detail if this case is important.)

For each $n$ define $\delta$ on $(t_{n+1},t_n)$ as follows. If $\ddot\alpha$ is identically zero in $(t_{n+1},t_n),$ set $\delta(t)=0$ in $(t_{n+1},t_n).$ Otherwise, pick a smooth function $\psi$ that is positive on $(t_{n+1},t_n)$ and whose derivatives of all orders (including $\psi$ itself) tend to zero at the endpoints, and take $\delta(t)$ to be $\ddot\alpha(t)\psi(t)\epsilon$ where $\epsilon>0$ is small enough to ensure:

the length of $\alpha+\delta$ from $t_{n+1}$ to $t_n$ is less than $t_n-t_{n+1},$ and
$\left\|\frac{d^k}{dt^k}\big(\ddot\alpha(t)\psi(t)\big)\right\|\epsilon<1/n$ for all $t\in(t_{n+1},t_n)$ and all $0\leq k\leq n,$ and
$\dot\alpha(t)+\dot\delta(t)\neq 0$ for all $t\in(t_{n+1},t_n).$

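To see that the first condition holds for small $\epsilon,$ note that $\dot\alpha\cdot\delta=0$ (because $\|\dot\alpha\|=1$ gives $\dot\alpha\cdot\ddot\alpha=0$), and differentiate this to get $\dot\alpha\cdot\dot\delta=-\ddot\alpha\cdot\delta.$ Hence $\|\dot\alpha+\dot\delta\|^2=1+2\dot\alpha\cdot\dot\delta+\|\dot\delta\|^2=1-2\ddot\alpha\cdot\delta+O(\epsilon^2),$ and the length $L_n$ of $\alpha+\delta$ from $t_{n+1}$ to $t_n$ is therefore $$L_n=\int_{t_{n+1}}^{t_n}\sqrt{\|\dot\alpha+\dot\delta\|^2}\,dt=\int_{t_{n+1}}^{t_n} \left(1-\ddot\alpha\cdot\delta+O(\epsilon^2)\right)dt\tag{1}$$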

which is strictly less than $t_n-t_{n+1}$ for small $\epsilon,$ unless $\ddot\alpha$ is identically zero on $(t_{n+1},t_n),$ in which case $L_n=t_n-t_{n+1}.$ This ensures that unless $\ddot\alpha$ is identically zero on $(0,t_n)$ (in which case $\alpha$ is a straight line there), then the length of $\alpha+\delta$ up to time $t_n$ is less than $t_n.$
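
As a numerical illustration of (1), here is a minimal sketch under assumptions that are not part of the argument above: $\alpha$ is taken to be an arc of the unit circle (so $\ddot\alpha=-\alpha$ and $\|\ddot\alpha\|=1$), $\psi$ is the standard bump $\exp(-1/(s(1-s)))$ rescaled to the interval, and the interval $(0.2,0.8)$ and $\epsilon=10^{-2}$ are arbitrary choices. The helper names `bump` and `polygon_length` are hypothetical.

```python
import numpy as np

def bump(t, a, b):
    """Smooth bump: positive on (a, b), zero elsewhere; all derivatives vanish at a and b."""
    s = (t - a) / (b - a)
    out = np.zeros_like(t)
    inside = (s > 0.0) & (s < 1.0)
    out[inside] = np.exp(-1.0 / (s[inside] * (1.0 - s[inside])))
    return out

def polygon_length(points):
    """Length of the polygon through the sampled points (rows of `points`)."""
    return float(np.sum(np.linalg.norm(np.diff(points, axis=0), axis=1)))

# Assumed example curve: unit circle, which is parameterized by arc length.
a, b = 0.2, 0.8
t = np.linspace(a, b, 20001)
alpha = np.stack([np.cos(t), np.sin(t)], axis=1)
alpha_ddot = -alpha  # second derivative of the unit circle

# Displacement delta = alpha'' * psi * eps, as in the construction above.
eps = 1e-2
psi = bump(t, a, b)
delta = eps * psi[:, None] * alpha_ddot

length_alpha = polygon_length(alpha)
length_displaced = polygon_length(alpha + delta)

# First-order prediction from (1): saving ~ eps * integral of ||alpha''||^2 * psi,
# with ||alpha''|| = 1 here; a Riemann sum suffices because psi vanishes at both ends.
predicted_saving = eps * float(np.sum(psi)) * (t[1] - t[0])

print(length_alpha - length_displaced, predicted_saving)
```

The displaced arc comes out shorter, and the saving agrees with the first-order term of (1) up to the $O(\epsilon^2)$ correction.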

The second condition ensures that all derivatives of $\delta$ tend to zero at $0.$

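Take $\beta$ to be the path $\alpha+\delta,$ reparameterized by arc length. Write $\ell_n$ for the length of $\alpha+\delta$ from $0$ to $t_n$ (the sum of the lengths $L_m$ over $m\geq n$), so that $\ell_n<t_n$ by the above and $\beta(\ell_n)=\alpha(t_n)$ because $\delta(t_n)=0.$ Then $F(\beta(\ell_n))=F(\alpha(t_n))<F(\alpha(\ell_n))$ (using the definition of $t_n$). Since $\ell_n\to 0,$ $\alpha$ does not beat $\beta,$ a contradiction.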

Step 2: $\alpha$ follows negative gradients where possible, for small $t$

Suppose not for contradiction. By the previous step, $\alpha$ is a straight line for small $t.$

We will move $\alpha$ slightly in the direction of the negative gradient of $F.$

Let $t_n$ be a strictly decreasing sequence tending to $0$ such that $\nabla F(\alpha(t_n))\neq 0$ and $\dot \alpha(t_n)$ is not of the form $-c(t_n)\nabla F(\alpha(t_n)).$

By induction on $n$ construct values $\xi_n>0$ and a function $\delta_n$ supported on $(t_{2n+2},t_{2n})$ as follows. Pick a smooth function $\psi$ that is positive on $(t_{2n+2},t_{2n})$ and whose derivatives of all orders (including $\psi$ itself) tend to zero at the endpoints, and take $\delta_n(t)$ to be

$$\delta_n(t)=(-\nabla F(\alpha(t))+\dot\alpha(t)(\dot\alpha(t)\cdot\nabla F(\alpha(t))))\psi(t)\epsilon$$

where $\epsilon$ will be chosen later. Let $\beta_n$ be the arc-length parameterized version of $\alpha+\sum_{m=1}^n\delta_m.$ Choose $\epsilon$ small enough to ensure:

$\xi_n:=F(\alpha(t_{2n+1}))-F(\beta_n(t_{2n+1}))>0$
$F(\alpha(t_{2m+1}))-F(\beta_n(t_{2m+1}))>\xi_m/2$ for $1\leq m<n$
$\left\|\frac{d^k\delta_n}{dt^k}(t)\right\|<1/n$ for all $t\in(t_{2n+2},t_{2n})$ and all $0\leq k\leq n$ (note that $\delta_n$ already contains the factor $\epsilon$), and
$\dot\alpha(t)+\dot{\delta_n}(t)\neq 0$ for all $t\in(t_{2n+2},t_{2n})$ (so $\beta_n$ actually makes sense).

The first condition holds for small enough $\epsilon$ because the change in arc-length is $O(\epsilon^2),$ using the same calculation as in (1) and $\ddot\alpha=0,$ while the change in $F(\alpha(t)+\delta_n(t))$ is $\Theta(\epsilon).$ The second condition is "open" so holds in an open neighborhood of $\epsilon=0.$
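
The $\Theta(\epsilon)$ claim can be checked by a first-order expansion (a sketch, using $\|\dot\alpha\|=1$ and the formula for $\delta_n$ given in this step):

$$\begin{aligned}F(\alpha+\delta_n)&=F(\alpha)+\nabla F(\alpha)\cdot\delta_n+O(\epsilon^2)\\&=F(\alpha)-\epsilon\,\psi\left(\|\nabla F(\alpha)\|^2-(\dot\alpha\cdot\nabla F(\alpha))^2\right)+O(\epsilon^2).\end{aligned}$$

By Cauchy-Schwarz the bracket is non-negative, and it is strictly positive wherever $\nabla F(\alpha(t))$ is not parallel to $\dot\alpha(t).$ The same formula for $\delta_n$ gives $\dot\alpha\cdot\delta_n=0,$ so $\delta_n$ is a normal displacement and the length calculation in (1) applies.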

$\alpha$ will then not beat the arc-length parameterized version of $\alpha+\sum_{n=1}^\infty \delta_n,$ a contradiction.
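
The mechanism of this step (a first-order gain in $F$ against a second-order cost in arc length) can be seen numerically in a toy case. This is a minimal sketch under assumptions chosen only for illustration: $F$ is linear with constant gradient $g=(1,1)$, $\alpha$ is the unit-speed line in direction $v=(-1,0)$, a single bump on $(0.2,0.8)$ replaces the infinite sum, and $\epsilon=10^{-2}$. The helper name `bump` is hypothetical.

```python
import numpy as np

def bump(t, a, b):
    """Smooth bump: positive on (a, b), zero elsewhere; all derivatives vanish at a and b."""
    s = (t - a) / (b - a)
    out = np.zeros_like(t)
    inside = (s > 0.0) & (s < 1.0)
    out[inside] = np.exp(-1.0 / (s[inside] * (1.0 - s[inside])))
    return out

# Assumed example: linear F with constant gradient g.
g = np.array([1.0, 1.0])

def F(x):
    return x @ g

# alpha: unit-speed straight line from the origin, not along -g.
v = np.array([-1.0, 0.0])
t = np.linspace(0.0, 1.0, 100001)
alpha = t[:, None] * v

# Displacement: part of -g normal to v, times the bump and a small eps.
eps = 1e-2
normal = -g + v * (v @ g)
delta = eps * bump(t, 0.2, 0.8)[:, None] * normal
gamma = alpha + delta

# Cumulative arc length of gamma, used to reparameterize it by arc length.
segments = np.linalg.norm(np.diff(gamma, axis=0), axis=1)
arc = np.concatenate([[0.0], np.cumsum(segments)])

# beta(s) is gamma at arc length s; compare F along beta and alpha at s = 0.5.
s = 0.5
F_beta = float(np.interp(s, arc, F(gamma)))
F_alpha = float(F(s * v))

print(F_alpha, F_beta)
```

In this toy case the displaced, arc-length parameterized path has the smaller value of $F$ at the same time $s,$ which is the inequality the conditions on $\xi_n$ are designed to preserve.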

Hi, I have (finally) come back to thoroughly examine the details of your approach, and I must admit that I am a bit confused about what your general strategy is. Are you first trying to establish that if the optimal path is not the negative gradient flow, then it must be a straight line? — Asaf Shachar, Jan 23 '20 at 14:01
