The method of proof will be to consider a displaced path $\alpha+\delta$ where $\delta$ is a suitable normal vector field, i.e. $\dot\alpha\cdot\delta=0.$
$\nabla F$ may have zeros, so the conclusion will be that $\dot\alpha(t)$ is of the form $-c(t)\nabla F(\alpha(t))$ for sufficiently small $t$ whenever $\nabla F(\alpha(t))\neq 0.$
Reparameterize $\alpha$ by arc length so $\|\dot\alpha\|=1.$
I'll use $n$ as an index, not the dimension of the space $\alpha(t)$ lives in.
Step 1: for small $t,$ either $F(\alpha(t))$ is constant or $\alpha$ must be a straight line
Suppose, for contradiction, that neither holds.
For a certain decreasing sequence $t_n\to 0,$ we will try to slightly shorten $\alpha$ for $t\in (t_{n+1},t_n).$ This will depend on curvature, a bit like curve-shortening flow.
I will deal with the case that $F\circ \alpha$ is not locally non-decreasing at zero, i.e. for every $\tau>0$ there is $0<t<\tau$ such that $F(\alpha(t))<F(p).$ In this case we certainly have $F(\alpha(t))=F(p)-\mu$ for some $t>0$ and some $\mu>0.$ By looking at the first time $F(\alpha(t))$ hits $F(p)-\mu/n,$ we get a strictly decreasing sequence $t_n\to 0$ such that $F(\alpha(t))>F(\alpha(t_n))$ for all $n$ and all $0\leq t<t_n.$
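One way to make the choice of $t_n$ explicit is sketched here. It assumes $\alpha(0)=p$ and that $F\circ\alpha$ is continuous; neither is restated above, so treat both as assumptions.

$$t_n:=\inf\{t\geq 0 : F(\alpha(t))\leq F(p)-\mu/n\}.$$

The set is non-empty because $F\circ\alpha$ takes the value $F(p)-\mu\leq F(p)-\mu/n,$ and $t_n>0$ because $F(\alpha(0))=F(p).$ Continuity gives $F(\alpha(t_n))=F(p)-\mu/n$ and $F(\alpha(t))>F(p)-\mu/n$ for $0\leq t<t_n.$ Since $F(\alpha(t_n))<F(p)-\mu/(n+1),$ the point $t_n$ cannot lie in $[0,t_{n+1}],$ so $t_{n+1}<t_n.$ Finally, given $\tau>0,$ pick $0<t<\tau$ with $F(\alpha(t))<F(p)$; once $\mu/n<F(p)-F(\alpha(t))$ we get $t_n<t<\tau,$ so $t_n\to 0.$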
For the case where $F\circ\alpha$ is locally non-decreasing at zero, I think you can run the following argument with reversed signs. (I can try to give more detail if this case is important.)
For each $n$ define $\delta$ on $(t_{n+1},t_n)$ as follows.
If $\ddot\alpha$ is identically zero in $(t_{n+1},t_n),$ set $\delta(t)=0$ in $(t_{n+1},t_n).$ Otherwise, pick a smooth function $\psi$ that is positive on $(t_{n+1},t_n)$ and that, together with all of its derivatives, tends to zero at the endpoints, and take $\delta(t)$ to be $\ddot\alpha(t)\psi(t)\epsilon$ where $\epsilon>0$ is small enough to ensure:
- the length of $\alpha+\delta$ from $t_{n+1}$ to $t_n$ is less than $t_n-t_{n+1},$ and
- $\left\|\frac{d^k}{dt^k}\bigl(\ddot\alpha(t)\psi(t)\bigr)\right\|\epsilon<1/n$ for all $t\in(t_{n+1},t_n)$ and all $0\leq k\leq n,$ and
- $\dot\alpha(t)+\dot\delta(t)\neq 0$ for all $t\in(t_{n+1},t_n).$
To see that the first condition holds for small $\epsilon,$ differentiate $\dot\alpha\cdot\delta=0$ to get $\dot\alpha\cdot\dot\delta=-\ddot\alpha\cdot\delta.$
The length $L_n$ of $\alpha+\delta$ from $t_{n+1}$ to $t_n$ is therefore
$$\int_{t_{n+1}}^{t_n}\sqrt{\|\dot\alpha+\dot\delta\|^2}\,dt=\int_{t_{n+1}}^{t_n} \bigl(1-\ddot\alpha\cdot\delta+O(\epsilon^2)\bigr)\,dt\tag{1}$$
which is strictly less than $t_n-t_{n+1}$ for small $\epsilon,$ unless $\ddot\alpha$ is identically zero on $(t_{n+1},t_n),$ in which case $L_n=t_n-t_{n+1}.$
Consequently, unless $\ddot\alpha$ is identically zero on $(0,t_n)$ (in which case $\alpha$ is a straight line there), the length of $\alpha+\delta$ up to time $t_n$ is less than $t_n.$
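The expansion behind (1) can be written out as follows. This is a sketch using only $\|\dot\alpha\|=1$ and the choice $\delta=\epsilon\psi\ddot\alpha$; the $O(\epsilon^2)$ constants depend on $n.$ Differentiating $\dot\alpha\cdot\dot\alpha=1$ gives $\dot\alpha\cdot\ddot\alpha=0,$ so this $\delta$ is indeed normal to $\dot\alpha.$ Then

$$\begin{aligned}
\|\dot\alpha+\dot\delta\|^2 &= 1+2\,\dot\alpha\cdot\dot\delta+\|\dot\delta\|^2 = 1-2\,\ddot\alpha\cdot\delta+O(\epsilon^2) = 1-2\epsilon\psi\|\ddot\alpha\|^2+O(\epsilon^2),\\
L_n &= (t_n-t_{n+1})-\epsilon\int_{t_{n+1}}^{t_n}\psi\,\|\ddot\alpha\|^2\,dt+O(\epsilon^2).
\end{aligned}$$

Since $\psi>0$ on the interval, the coefficient of $\epsilon$ is strictly negative whenever $\ddot\alpha$ is not identically zero there, which gives $L_n<t_n-t_{n+1}$ for small $\epsilon>0.$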
The second condition ensures that all derivatives of $\delta$ tend to zero at $0.$
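To spell this out (a sketch, with $\delta(0):=0$ as an assumed convention): fix an order $k$ and take $t\in(t_{n+1},t_n)$ with $n\geq k.$ The second condition gives

$$\left\|\frac{d^k\delta}{dt^k}(t)\right\|<\frac{1}{n},$$

and $n\to\infty$ as $t\to 0$ because $t_n\to 0.$ At the junction points $t_n$ themselves, $\delta$ and all of its derivatives vanish by the choice of $\psi,$ so the pieces glue together smoothly.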
Take $\beta$ to be the path $\alpha+\delta,$ reparameterized by arc length. Then $F(\beta(L_n))=F(\alpha(t_n))<F(\alpha(L_n))$ (using the definition of $t_n$). So $\alpha$ does not beat $\beta,$ a contradiction.
Step 2: $\alpha$ follows negative gradients where possible, for small $t$
Suppose not, for contradiction. By the previous step, $\alpha$ is a straight line for small $t.$
We will move $\alpha$ slightly in the direction of the negative gradient of $F.$
Let $t_n$ be a strictly decreasing sequence tending to $0$ such that $\nabla F(\alpha(t_n))\neq 0$ and $\dot \alpha(t_n)$ is not of the form $-c(t_n)\nabla F(\alpha(t_n)).$
By induction on $n$ construct values $\xi_n>0$ and a function $\delta_n$ supported on $(t_{2n+2},t_{2n})$ as follows.
Pick a smooth function $\psi$ that is positive on $(t_{2n+2},t_{2n})$ and that, together with all of its derivatives, tends to zero at the endpoints, and take $\delta_n(t)$ to be
$$\delta_n(t)=(-\nabla F(\alpha(t))+\dot\alpha(t)(\dot\alpha(t)\cdot\nabla F(\alpha(t))))\psi(t)\epsilon$$
where $\epsilon$ will be chosen later.
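A quick check that $\delta_n$ is a normal field, using only $\|\dot\alpha\|=1$:

$$\dot\alpha\cdot\delta_n=\epsilon\psi\left(-\dot\alpha\cdot\nabla F(\alpha)+\|\dot\alpha\|^2\,\bigl(\dot\alpha\cdot\nabla F(\alpha)\bigr)\right)=0.$$

In other words, $\delta_n$ is $\epsilon\psi$ times the component of $-\nabla F(\alpha)$ orthogonal to $\dot\alpha.$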
Let $\beta_n$ be the arc-length parameterized version of $\alpha+\sum_{m=1}^n\delta_m.$ Choose $\epsilon$ small enough to ensure:
- $\xi_n:=F(\alpha(t_{2n+1}))-F(\beta_n(t_{2n+1}))>0$
- $F(\alpha(t_{2m+1}))-F(\beta_n(t_{2m+1}))>\xi_m/2$ for $1\leq m<n$
- $\left\|\frac{d^k\delta_n}{dt^k}(t)\right\|<1/n$ for all $t\in(t_{2n+2},t_{2n})$ and all $0\leq k\leq n$ (recall that $\delta_n$ already contains the factor $\epsilon$), and
- $\dot\alpha(t)+\dot{\delta_n}(t)\neq 0$ for all $t\in(t_{2n+2},t_{2n})$ (so $\beta_n$ actually makes sense).
The first condition holds for small enough $\epsilon$ because the change in arc-length is $O(\epsilon^2),$ using the same calculation as in (1) and $\ddot\alpha=0,$ while the change in $F(\alpha(t)+\delta_n(t))$ is $\Theta(\epsilon).$ The second condition is "open" so holds in an open neighborhood of $\epsilon=0.$
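The first-order change in $F$ can be made explicit. This is a sketch assuming $F$ is at least $C^2$ near $p,$ and reading "not of the form $-c\nabla F$" as "$\dot\alpha$ is not a scalar multiple of $\nabla F$" (an assumption about the intended meaning of $c$):

$$F(\alpha+\delta_n)-F(\alpha)=\nabla F(\alpha)\cdot\delta_n+O(\epsilon^2)=-\epsilon\psi\left(\|\nabla F(\alpha)\|^2-\bigl(\dot\alpha\cdot\nabla F(\alpha)\bigr)^2\right)+O(\epsilon^2).$$

By Cauchy–Schwarz with $\|\dot\alpha\|=1,$ the bracket is non-negative and vanishes only when $\nabla F(\alpha)$ is a scalar multiple of $\dot\alpha.$ At $t=t_{2n+1}$ we have $\psi>0,$ $\nabla F\neq 0$ and, under the reading above, non-parallel vectors, so the linear term is strictly negative there.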
$\alpha$ will then not beat the arc-length parameterized version of $\alpha+\sum_{n=1}^\infty \delta_n,$ a contradiction.