
Suppose an absolutely continuous curve $\mu \colon (0, \infty) \to P_2(\Omega)$, where $P_2(\Omega)$ is the 2-Wasserstein space over $\Omega$, fulfils the continuity equation $$ \label{eq:CE} \tag{CE} \partial_t \mu_t = \text{div}(\mu_t g_{\mathfrak h \mu_t}) $$ in the weak sense for almost all $t > 0$. Here $\Omega := \Theta \times \mathbb R_{\ge 0}$, $\Theta$ is a compact connected Riemannian manifold without boundary, and $$ g_{v} \colon \Omega \to T \Omega, \qquad (r, \theta) \mapsto \begin{pmatrix} 2 \alpha r \cdot J_{v}'(\theta) \\ \beta \cdot \nabla J_{v}'(\theta) \end{pmatrix} \in \mathbb R \times T_{\theta} \Theta $$ for some fixed $\alpha, \beta > 0$, where $v \in M_+(\Theta)$ is a finite non-negative Radon measure on $\Theta$ and $J_v' \colon \Theta \to \mathbb R$ is differentiable.

Show that $v_t := \mathfrak h \mu_t$ fulfils $$ \label{eq:advec} \tag{$\ddagger$} \partial_t v_t = -4 \alpha v_t J_{v_t}' + \beta \cdot \text{div}(v_t \nabla J_{v_t}') $$ in the weak sense for almost every $t > 0$ (this is Proposition 2.1 in Lénaïc Chizat's "Sparse optimization on measures with overparametrized gradient descent"), where $\mathfrak h \colon P_2(\Omega) \to M_+(\Theta)$ is defined via $$ \label{eq:h} \tag{$\dagger$} \int_{\Theta} \psi(\theta) \, \text{d}(\mathfrak h \mu)(\theta) = \int_{\Omega} \psi(\theta) r^2 \, \text{d}\mu(\theta, r) \qquad \forall \psi \in \mathcal C(\Theta; \mathbb R). $$ The metric on $\Omega$ is $$ \label{eq:omega} \tag{$\star$} \langle (r_1, \partial \theta_1), (r_2, \partial \theta_2) \rangle_{(r, \theta)} := \frac{1}{\alpha} r_1 r_2 + \frac{r^2}{\beta} \langle \partial \theta_1, \partial \theta_2 \rangle_{\theta} $$ for $x = (r, \theta) \in \Omega$ and $(r_1, \partial \theta_1), (r_2, \partial \theta_2) \in T_x \Omega \cong \mathbb R \times T_{\theta} \Theta$.

My attempt. Let $\xi \colon \Theta \to \mathbb R$ be differentiable. Then, for almost all $t > 0$, \begin{align} \frac{\text{d}}{\text{d} t} \int_{\Theta} \xi(\theta) \, \text{d}v_t(\theta) & \overset{\eqref{eq:h}}{\underset{v_t=\mathfrak h \mu_t}{=}} \frac{\text{d}}{\text{d} t} \int_{\Omega} \xi(\theta) r^2 \, \text{d}\mu_t(\theta, r) \\ & \overset{\eqref{eq:CE}}{=} - \int_{\Omega} \left\langle \begin{pmatrix} \nabla \xi(\theta) \\ 2 r \xi(\theta) \end{pmatrix}, \begin{pmatrix} \beta \cdot \nabla J_{v_t}'(\theta) \\ 2 \alpha r \cdot J_{v_t}'(\theta) \end{pmatrix} \right\rangle_{(\theta, r)} \, \text{d}\mu_t(\theta, r) \\ &\overset{\eqref{eq:omega}}{=} - \int_{\Omega} \frac{1}{\alpha} (2 r \xi(\theta)) \cdot 2 \alpha r \cdot J_{v_t}'(\theta) + \frac{r^2}{\beta} \beta \langle \nabla J_{v_t}'(\theta), \nabla \xi(\theta) \rangle_{\theta} \, \text{d}\mu_t(\theta, r) \\ & = - \int_{\Omega} r^2 \cdot \left( 4 \xi(\theta) J_{v_t}'(\theta) + \langle \nabla J_{v_t}'(\theta), \nabla \xi(\theta) \rangle_{\theta}\right) \, \text{d}\mu_t(\theta, r) \\ & \overset{\eqref{eq:h}}{\underset{v_t=\mathfrak h \mu_t}{=}} - \int_{\Theta} 4 \xi(\theta) J_{v_t}'(\theta) + \langle \nabla J_{v_t}'(\theta), \nabla \xi(\theta) \rangle_{\theta} \, \text{d}v_t(\theta), \end{align} because the weak formulation of \eqref{eq:CE} is $$ \frac{\text{d}}{\text{d} t} \int_{\Omega} \psi(x) \, \text{d}\mu_t(x) = - \int_{\Omega} \langle \nabla \psi(x), g_{\mathfrak{h} \mu_t}(x) \rangle_{x} \, \text{d}\mu_t(x) $$ for all differentiable maps $\psi \colon \Omega \to \mathbb R$.

But this doesn't match the weak formulation of \eqref{eq:advec}, which should be something like $$ \frac{\text{d}}{\text{d} t} \int_{\Theta} \xi(\theta) \, \text{d}v_t(\theta) = - \int_{\Theta} 4 \alpha \xi(\theta) J_{v_t}'(\theta) + \beta \langle \nabla \xi(\theta), \nabla J_{v_t}'(\theta) \rangle_{\theta} \, \text{d}v_t(\theta): $$ the factors $\alpha$ and $\beta$ are missing from my result. Is my calculation wrong (I am particularly unsure about the computation of the gradient of $\xi(\theta) \cdot r^2$), or did I compute the weak formulation of \eqref{eq:advec} incorrectly?

ViktorStein

1 Answer


The weak formulation is correct. The mistake is in the computation of the gradient of $\psi(\theta, r) := \xi(\theta) r^2$, which depends on the metric $(\star)$ on $\Omega$ and is given by (for a derivation see this answer) $$\nabla \psi(\theta, r) = \big(\beta \nabla \xi(\theta), 2 \alpha r\xi(\theta)\big).$$
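
Since the derivation is only linked, here is a short sketch of where the factors $\alpha$ and $\beta$ come from. It assumes the usual definition of the Riemannian gradient, $\langle \nabla \psi(x), w \rangle_x = \text{d}\psi_x(w)$ for all $w \in T_x \Omega$, and $r > 0$ (the metric $(\star)$ degenerates at $r = 0$). The tangent vector $w = (s, \eta) \in \mathbb R \times T_{\theta} \Theta$ and the unknown components $(a, b)$ of $\nabla \psi$ are notation introduced only for this sketch, listed with the $r$-part first as in $(\star)$. \begin{align} \text{d}\psi_{(\theta, r)}(s, \eta) & = 2 r \xi(\theta) \, s + r^2 \langle \nabla \xi(\theta), \eta \rangle_{\theta}, \\ \langle (a, b), (s, \eta) \rangle_{(r, \theta)} & \overset{(\star)}{=} \frac{1}{\alpha} a s + \frac{r^2}{\beta} \langle b, \eta \rangle_{\theta}. \end{align} Requiring both expressions to agree for all $(s, \eta)$ gives $\frac{1}{\alpha} a = 2 r \xi(\theta)$ and $\frac{r^2}{\beta} b = r^2 \nabla \xi(\theta)$, that is, $a = 2 \alpha r \xi(\theta)$ and $b = \beta \nabla \xi(\theta)$, which is the gradient stated above up to the order of the components.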

Hence the correct computation is \begin{align} \frac{\text{d}}{\text{d} t} \int_{\Theta} \xi(\theta) \, \text{d}v_t(\theta) & \overset{(\dagger)}{\underset{v_t=\mathfrak h \mu_t}{=}} \frac{\text{d}}{\text{d} t} \int_{\Omega} \xi(\theta) r^2 \, \text{d}\mu_t(\theta, r) \\ & \overset{\text{(CE)}}{=} - \int_{\Omega} \left\langle \begin{pmatrix} \beta \nabla \xi(\theta) \\ 2 \alpha r \xi(\theta)\end{pmatrix}, \begin{pmatrix} \beta \cdot \nabla J_{v_t}'(\theta) \\ 2 \alpha r \cdot J_{v_t}'(\theta) \end{pmatrix} \right\rangle_{(\theta, r)} \, \text{d}\mu_t(\theta, r) \\ &\overset{(\star)}{=} - \int_{\Omega} \frac{1}{\alpha} \xi(\theta) (2 r) \cdot 2 \alpha^2 r \cdot J_{v_t}'(\theta) + \frac{r^2}{\beta} \beta^2 \langle \nabla J_{v_t}'(\theta), \nabla \xi(\theta) \rangle_{\theta} \, \text{d}\mu_t(\theta, r) \\ & = - \int_{\Omega} r^2 \cdot \left( 4 \alpha \xi(\theta) J_{v_t}'(\theta) + \beta \langle \nabla J_{v_t}'(\theta), \nabla \xi(\theta) \rangle_{\theta}\right) \, \text{d}\mu_t(\theta, r) \\ & \overset{(\dagger)}{\underset{v_t=\mathfrak h \mu_t}{=}} - \int_{\Theta} 4 \alpha \xi(\theta) J_{v_t}'(\theta) + \beta\langle \nabla J_{v_t}'(\theta), \nabla \xi(\theta) \rangle_{\theta} \, \text{d}v_t(\theta), \end{align} for almost all $t > 0$, as desired.
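
For completeness, here is a formal sketch of why this right-hand side is the weak form of $(\ddagger)$. It assumes that $\xi$ is smooth and that the divergence of the measure-valued vector field $v_t \nabla J_{v_t}'$ is defined by duality, $\text{div}(v_t \nabla J_{v_t}')[\xi] := - \int_{\Theta} \langle \nabla \xi(\theta), \nabla J_{v_t}'(\theta) \rangle_{\theta} \, \text{d}v_t(\theta)$, with no boundary terms because $\Theta$ has no boundary; the bracket notation $[\xi]$ is introduced only for this sketch. Testing $(\ddagger)$ against $\xi$ then gives \begin{align} \frac{\text{d}}{\text{d} t} \int_{\Theta} \xi(\theta) \, \text{d}v_t(\theta) & = - 4 \alpha \int_{\Theta} \xi(\theta) J_{v_t}'(\theta) \, \text{d}v_t(\theta) + \beta \cdot \text{div}(v_t \nabla J_{v_t}')[\xi] \\ & = - 4 \alpha \int_{\Theta} \xi(\theta) J_{v_t}'(\theta) \, \text{d}v_t(\theta) - \beta \int_{\Theta} \langle \nabla \xi(\theta), \nabla J_{v_t}'(\theta) \rangle_{\theta} \, \text{d}v_t(\theta), \end{align} which agrees with the last line of the computation.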

ViktorStein