
Let $x^*$ be a local minimum of a differentiable function $f(x)$, i.e. there exists $r>0$ such that for all $y \in B_n(x^*,r)$ we have $f(y)\ge f(x^*)$. Since $f$ is differentiable we have $$f(y)=f(x^*)+\langle f'(x^*),y-x^* \rangle + o(\parallel y-x^* \parallel) \ge f(x^*).$$ Thus, for all $\parallel s\parallel =1,$ we have $\langle f'(x^*),s\rangle =0$.

I don't understand how we get this final statement. We basically have $\langle f'(x^*),y-x^* \rangle + o(\parallel y-x^* \parallel) \ge 0$, but how does this give for all $\parallel s \parallel =1,$ we have $\langle f'(x^*),s\rangle =0$?

irchans

1 Answer

Part 1. I'd like to suggest a cleaner notation. Write $y = x^{*} + \Delta x$ with $\parallel \Delta x \parallel < r$, so that

$$f(x^{*} +\Delta x)=f(x^{*})+\langle f'(x^{*}),\Delta x \rangle + o(\parallel \Delta x \parallel) \ge f(x^{*}) \tag{1}$$

and symmetrically

$$f(x^{*} -\Delta x)=f(x^{*})+\langle f'(x^{*}),-\Delta x \rangle + o(\parallel \Delta x \parallel) \ge f(x^{*}) \tag{2}$$

In $(2)$, I use the fact that $\parallel -\Delta x \parallel=\parallel \Delta x \parallel$.


Part 2. Using the fact that $\langle x,\lambda y \rangle =\lambda\langle x, y \rangle$, for $(1)$ with $\Delta x \ne 0$ we have

$$\begin{aligned} &\langle f'(x^{*}),\Delta x \rangle + o(\parallel \Delta x \parallel) \ge 0 \\ \iff\ &\left\langle f'(x^{*}),\parallel \Delta x \parallel\frac{\Delta x}{\parallel \Delta x \parallel} \right\rangle + o(\parallel \Delta x \parallel) \ge 0 \\ \iff\ &\left\langle f'(x^{*}),\frac{\Delta x}{\parallel \Delta x \parallel} \right\rangle + \frac{o(\parallel \Delta x \parallel)}{\parallel \Delta x \parallel} \ge 0 \end{aligned}$$

The last step divides by $\parallel \Delta x \parallel > 0$, which does not change the direction of the inequality. Set $s=\frac{\Delta x}{\parallel \Delta x \parallel}$, so that $\parallel s \parallel=1$. As $\Delta x \rightarrow 0$ we have $\frac{o(\parallel \Delta x \parallel)}{\parallel \Delta x \parallel} \rightarrow 0$ by the definition of $o(\cdot)$. Taking the limit (a non-strict inequality is preserved in the limit) gives

$$\left\langle f'(x^{*}), s \right\rangle \ge 0 \tag{3}$$

Similarly, for $(2)$ we obtain

$$-\left\langle f'(x^{*}), s \right\rangle \ge 0 \tag{4}$$

Combining $(3)$ and $(4)$,

$$\left\langle f'(x^{*}), s \right\rangle = 0$$

Note: "by taking the limit" above means shrinking $\Delta x$ along its own direction, i.e. $\Delta x_\alpha = \alpha \Delta x$ with $\alpha >0$ and $\alpha \rightarrow 0$. In this case $\frac{\Delta x_\alpha}{\parallel \Delta x_\alpha \parallel}= \frac{\alpha \Delta x}{\parallel \alpha \Delta x \parallel}= \frac{\alpha \Delta x}{|\alpha| \parallel \Delta x \parallel}= \frac{\Delta x}{\parallel \Delta x \parallel}=s$ (using $|\alpha| = \alpha$ for $\alpha > 0$), so the unit vector $s$ stays fixed while the limit is taken.
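The argument can also be checked numerically. The sketch below is only an illustration, not part of the proof: the quadratic `f`, its minimizer `x_star = (1, -2)`, and the helper names `inner` and `quotient` are all assumptions chosen for this example. It evaluates the quotient $\frac{f(x^{*}+\alpha s)-f(x^{*})}{\alpha}$ for $s$ and $-s$ as $\alpha$ shrinks; both stay non-negative and tend to $\langle f'(x^{*}), s\rangle$, which is therefore $0$.

```python
import math

# Hypothetical example function with a minimum at x* = (1, -2).
def f(x):
    return (x[0] - 1.0) ** 2 + 3.0 * (x[1] + 2.0) ** 2

# Its gradient, written out by hand.
def grad_f(x):
    return (2.0 * (x[0] - 1.0), 6.0 * (x[1] + 2.0))

def inner(u, v):
    """Euclidean inner product <u, v>."""
    return sum(a * b for a, b in zip(u, v))

def quotient(x, s, alpha):
    """(f(x + alpha*s) - f(x)) / alpha for alpha > 0."""
    shifted = [xi + alpha * si for xi, si in zip(x, s)]
    return (f(shifted) - f(x)) / alpha

x_star = (1.0, -2.0)
# A few unit vectors s = (cos t, sin t).
directions = [(math.cos(t), math.sin(t)) for t in (0.3, 1.7, 4.0)]

for s in directions:
    minus_s = tuple(-c for c in s)
    for alpha in (1e-1, 1e-3, 1e-5):
        q_plus = quotient(x_star, s, alpha)
        q_minus = quotient(x_star, minus_s, alpha)
        # Both quotients are >= 0 because x_star is a minimizer,
        # and both shrink toward <f'(x*), s> as alpha -> 0.
        print(f"alpha={alpha:.0e}  q(+s)={q_plus:.3e}  q(-s)={q_minus:.3e}")
    print("<f'(x*), s> =", inner(grad_f(x_star), s))
```

At a point that is not a minimizer, the two quotients would approach $\langle f'(x), s\rangle$ and $-\langle f'(x), s\rangle$, which cannot both be non-negative unless the inner product vanishes; that is exactly the role of inequalities $(3)$ and $(4)$.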

rtybase
    @takecare You might just want to choose any norm 1 vector $s$, set $\Delta x_n= s/n$, then go through Rtybase's reasoning above while taking the limit as $n$ goes to infinity. – irchans Oct 24 '18 at 23:45
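
Spelled out, the sequence version suggested in this comment might look as follows. This is a sketch under the same assumptions as the answer, with $n$ taken large enough that $1/n < r$. Fix any $s$ with $\parallel s \parallel = 1$ and set $\Delta x_n = s/n$, so $\parallel \Delta x_n \parallel = 1/n$. Then local minimality and differentiability give

$$0 \le f\left(x^{*} + \tfrac{s}{n}\right) - f(x^{*}) = \tfrac{1}{n}\left\langle f'(x^{*}), s \right\rangle + o\left(\tfrac{1}{n}\right).$$

Multiplying by $n > 0$,

$$\left\langle f'(x^{*}), s \right\rangle + n \cdot o\left(\tfrac{1}{n}\right) \ge 0, \qquad n \cdot o\left(\tfrac{1}{n}\right) \rightarrow 0 \text{ as } n \rightarrow \infty,$$

so $\left\langle f'(x^{*}), s \right\rangle \ge 0$. Applying the same reasoning to $-s$, which is also a unit vector, gives $\left\langle f'(x^{*}), s \right\rangle \le 0$, hence $\left\langle f'(x^{*}), s \right\rangle = 0$.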