
For any proper, closed, convex function $f$ and any $x$, the Moreau decomposition states that
$$\operatorname{Prox}_f(x)+\operatorname{Prox}_{f^*}(x)=x,$$ where $f^*$ is the conjugate function of $f$ and $\operatorname{Prox}_f$ is the proximal operator of $f$, defined as $$\operatorname{Prox}_f(x)=\underset{v}{\arg\min}\;\frac{1}{2}\|x-v\|^2+f(v).$$ My question is whether this decomposition still holds when $f$ is not convex, assuming that $\operatorname{Prox}_f(x)$ is well-defined. I know that $f^*$ is convex regardless of the convexity of $f$; hence it should hold that $$\operatorname{Prox}_{f^{**}}(x)+\operatorname{Prox}_{f^*}(x)=x,$$ where $f^{**}$ is the biconjugate of $f$. Thus, my question reduces to whether $\operatorname{Prox}_f=\operatorname{Prox}_{f^{**}}$ when $f$ is not convex.

Thank you.
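For reference, the convex case can be checked numerically on a standard pair. The sketch below is not part of the question: it assumes $f(v)=|v|$ on the real line, whose prox is soft-thresholding and whose conjugate is the indicator of $[-1,1]$, so that $\operatorname{Prox}_{f^*}$ is the projection onto $[-1,1]$. The function names are hypothetical.

```python
def prox_abs(x):
    """Proximal operator of f(v) = |v|: soft-thresholding with threshold 1."""
    if x > 1.0:
        return x - 1.0
    if x < -1.0:
        return x + 1.0
    return 0.0


def prox_abs_conjugate(x):
    """Proximal operator of f*, the indicator of [-1, 1]: projection onto [-1, 1]."""
    return max(-1.0, min(1.0, x))


def moreau_residual(x):
    """Return |Prox_f(x) + Prox_{f*}(x) - x|, which should vanish for convex f."""
    return abs(prox_abs(x) + prox_abs_conjugate(x) - x)


if __name__ == "__main__":
    # Sample points on both sides of the thresholds -1 and 1 (arbitrary choices).
    for x in (-2.5, -0.3, 0.0, 0.7, 4.0):
        print(x, prox_abs(x), prox_abs_conjugate(x), moreau_residual(x))
```

This only illustrates the classical statement for one convex $f$; it says nothing about the nonconvex case asked about here.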

  • Your observation is indeed helpful. Also note that $f^{**}$ is the convex envelope of $f$ and equals $f$ if and only if $f$ is convex and lower semicontinuous. This does not answer the question since there are nonconvex $f$ for which the convex envelope has the same prox... – Dirk Oct 18 '20 at 09:25

1 Answer


Interesting. $\newcommand{\prox}{\mathrm{prox}}\newcommand\inner[1]{\left\langle #1 \right\rangle}$

Note that if $f$ is nonconvex then $\prox_f(x)$ might not be a singleton, thus the question of interest becomes whether the following holds: \begin{equation} x \in \prox_f(x) + \prox_{f^*}(x).\tag{*}\label{moreau} \end{equation} With some further assumptions on $f$ (and $x$), the answer is affirmative.

Denote $d_x(y) = \frac{1}{2}\|y-x\|^2$. Throughout, $\partial$ denotes the subdifferential in the sense of convex analysis, applied verbatim to possibly nonconvex functions: $u\in\partial g(z)$ means $g(y)\ge g(z)+\inner{u,y-z}$ for all $y$. We have (see \eqref{optimality} below): $$z\in \prox_f(x) \iff z\in \arg\min_{y} \left\{f(y) + d_x(y)\right\} \iff 0\in\partial(f + d_x)(z).$$ Since $f$ is nonconvex, the inclusion $\partial f(z) + \partial d_x(z) \subset \partial(f + d_x)(z)$ may be strict, e.g. when $\partial f(z) = \emptyset$. Clearly, for any $z\in \prox_f(x)$, the set $\partial(f + d_x)(z)$ is non-empty because it contains $0$. Now, if we assume that there exists $z$ such that $0$ belongs to the subset $\partial f(z) + \partial d_x(z)$, then \eqref{moreau} holds. Note that this assumption holds if $f$ is convex.
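To make the strict inclusion concrete, here is a one-dimensional illustration. The function is an assumption chosen for this sketch and does not come from the question. Take $f(y)=-\tfrac14 y^2$ on $\mathbb{R}$. Then
\begin{align}
f(y)+d_x(y) &= \tfrac14 y^2 - xy + \tfrac12 x^2, \\
\prox_f(x) &= \{2x\}, \qquad \partial(f+d_x)(2x)=\{0\}, \\
\partial f(z) &= \emptyset \quad \text{for every } z,
\end{align}
because a concave quadratic has no affine minorant. Hence $\partial f(z)+\partial d_x(z)=\emptyset\subsetneq\{0\}=\partial(f+d_x)(z)$ at $z=2x$. In this example $f^*\equiv+\infty$, so $\prox_{f^*}$ is not even defined and the assumption above cannot be met.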

The result can be proved by noticing that some convex analysis results can be extended to nonconvex functions, as follows.

Fact 1. The first-order optimality condition also holds for a nonconvex function: \begin{equation} x^* \in\arg\min_x f(x) \iff 0\in\partial f(x^*). \tag{1}\label{optimality} \end{equation}

This follows directly from the definition of the subgradient: $0\in\partial f(x^*)$ reads $f(y)\ge f(x^*)+\inner{0,y-x^*}=f(x^*)$ for all $y$.

Fact 2. The Fenchel–Young inequality also holds for a nonconvex function: \begin{equation} f(x) + f^*(u) \ge \inner{u,x} \ \forall x,u. \tag{2}\label{fenchel} \end{equation}

This follows directly from the definition of the conjugate, $f^*(u)=\sup_x\left\{\inner{u,x}-f(x)\right\}$.

Fact 3. The equality case in the Fenchel–Young inequality is the same for a nonconvex function: \begin{equation} f(x)+f^*(u) = \inner{x,u} \Longleftrightarrow u \in \partial f(x). \tag{3}\label{fenchel-equality} \end{equation}

See here for a proof.
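The argument is short enough to sketch directly. It assumes $f(x)$ is finite and uses only the global subgradient inequality and \eqref{fenchel}, not convexity:
\begin{align}
u\in\partial f(x) &\iff f(y)\ge f(x)+\inner{u,y-x}\ \ \forall y \\
&\iff \inner{u,x}-f(x)\ \ge\ \inner{u,y}-f(y)\ \ \forall y \\
&\iff \inner{u,x}-f(x)\ \ge\ f^*(u) \\
&\iff \inner{u,x}-f(x)\ =\ f^*(u),
\end{align}
where the last equivalence holds because \eqref{fenchel} always gives the reverse inequality $f^*(u)\ge\inner{u,x}-f(x)$.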

Now back to the main result. Let $z$ be such that $0\in\partial f(z) + \partial d_x(z)$. Because $\partial f(z) + \partial d_x(z) \subset \partial(f + d_x)(z)$ we have $0 \in \partial(f + d_x)(z)$ and thus $z\in\prox_f(x)$ according to \eqref{optimality}.

Denote $u=x-z$. Since $\partial d_x(z) = \{z - x\}$, we have $0\in\partial f(z) + z-x$, i.e. $\boxed{u \in \partial f(z)}$, and thus according to \eqref{fenchel-equality} we have \begin{equation} \inner{z,u} = f(z)+f^*(u). \tag{4}\label{zu} \end{equation} On the other hand, according to \eqref{fenchel}: \begin{equation} f(z) + f^*(v) \ge \inner{v,z} \ \forall v. \tag{5}\label{zv} \end{equation} Summing \eqref{zu} and \eqref{zv} and cancelling $f(z)$, we obtain: \begin{equation} f^*(v) \ge f^*(u) + \inner{z, v-u} \ \forall v, \end{equation} which means $\boxed{z\in\partial f^*(u)}$, implying \begin{align} x-u \in\partial f^*(u) \implies &0\in\partial f^*(u) + u-x \\ \implies &0 \in\partial f^*(u) + \partial d_x(u) \\ \implies &0\in \partial (f^* + d_x) (u) \\ \implies &u = \prox_{f^*}(x), \end{align} where the last step uses \eqref{optimality} together with the strong convexity of $f^* + d_x$, which makes its minimizer unique. Therefore we have proved that $x=z+u \in \prox_f(x) + \prox_{f^*}(x)$. QED
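A brute-force check of this result on a nonconvex function can be sketched as follows. Everything in it is an assumption made for illustration: the double-well $f(y)=(y^2-1)^2$, the point $x=26$, the grids, and the helper names. By hand, $z=2$ satisfies $f'(2)=24=x-z$, and $24$ is a global subgradient of $f$ at $2$, so the assumption $0\in\partial f(z)+\partial d_x(z)$ holds and one expects $u=24$.

```python
def f(y):
    """Nonconvex double-well test function (an illustrative assumption)."""
    return (y * y - 1.0) ** 2


def argmin_on_grid(objective, grid):
    """Return the grid point with the smallest objective value."""
    return min(grid, key=objective)


def conjugate_on_grid(u, y_grid):
    """Approximate f*(u) = sup_y (u*y - f(y)) by a maximum over y_grid."""
    return max(u * y - f(y) for y in y_grid)


def check_decomposition(x=26.0):
    """Return grid approximations (z, u) of prox_f(x) and prox_{f*}(x)."""
    # Grids are truncated to windows around the expected answers (assumption).
    y_grid = [i / 100.0 for i in range(-300, 301)]      # y in [-3, 3]
    u_grid = [20.0 + i / 100.0 for i in range(0, 801)]  # u in [20, 28]
    z = argmin_on_grid(lambda y: f(y) + 0.5 * (y - x) ** 2, y_grid)
    u = argmin_on_grid(
        lambda v: conjugate_on_grid(v, y_grid) + 0.5 * (v - x) ** 2, u_grid
    )
    return z, u


if __name__ == "__main__":
    z, u = check_decomposition()
    print("z =", z, "u =", u, "z + u =", z + u)
```

The grid search is only a sanity check: it approximates $f^*$ from below and restricts $u$ to a window, so it confirms the decomposition at this one point rather than proving anything.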

I would say that the above is quite straightforward. A complete answer should provide a counter-example to \eqref{moreau} (if such an example exists), or at least provide more insight into the assumption $\exists z: 0\in\partial f(z) + \partial d_x(z)$. Although I think this assumption is rather weak, I'm unable to say more.

P.S. From the proof, we have the following.

Fact 4. The following implication holds for a nonconvex function: \begin{equation} u\in\partial f(z) \implies z\in\partial f^*(u). \end{equation} If $f$ is proper, closed and convex, then the converse also holds.


Update

In the above, I immediately generalized the Moreau decomposition to the inclusion \eqref{moreau} because of the nonconvexity of $f$. However, since Regev assumed in the question that everything is well defined, a more restrictive view is to assume that $\prox_f(x)$ is a singleton (as confirmed by Regev in a recent comment), so that the equality is kept instead of an inclusion: \begin{equation} x =\prox_f(x) + \prox_{f^*}(x).\tag{**}\label{moreau-equality} \end{equation}

If we assume further that the subdifferential $\partial f(z)$ at $z=\prox_f(x)$ is non-empty (which is a very mild assumption), then \eqref{moreau-equality} actually holds.

Corollary. If $\prox_f(x)$ is a singleton and the subdifferential $\partial f(\prox_f(x))$ is non-empty, then the Moreau decomposition \eqref{moreau-equality} holds.

Proof. Denote $z = \prox_f(x)$. Because $\prox_f(x)$ is a singleton, the above reasoning is taken to give $\partial(f + d_x)(z) = \{0\}$. (Strictly speaking, uniqueness of the minimizer only guarantees $0\in\partial(f+d_x)(z)$, so this equality should be read as an additional assumption.) Hence, because $\partial f(z) \neq \emptyset$ and $\partial f(z) + \partial d_x(z) \subset \partial(f + d_x)(z) = \{0\}$, the subdifferential $\partial f(z)$ must also be a singleton, and furthermore $\partial f(z) + \partial d_x(z) = \{0\}$. This is exactly the assumption made in the previous section, and therefore we obtain \eqref{moreau-equality}.
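As a sanity check on the hypotheses, here is a small worked example. It is not part of the original argument, and the function is chosen purely for illustration. It appears to satisfy both conditions of the corollary while the equality fails, which suggests that the step $\partial(f+d_x)(z)=\{0\}$ needs more than uniqueness of the minimizer. Take $f:\mathbb{R}\to\mathbb{R}$ with $f(0)=0$ and $f(y)=1$ for $y\neq 0$, and let $x=1$. Then
\begin{align}
f(y)+d_1(y) &= \begin{cases}\tfrac12, & y=0,\\ 1+\tfrac12(y-1)^2\ \ge 1, & y\neq 0,\end{cases} \quad\text{so}\quad \prox_f(1)=\{0\},\\
\partial f(0) &= \{u : 1\ge uy\ \ \forall y\neq 0\}=\{0\}\neq\emptyset,\\
f^*(u) &= \sup_y\{uy-f(y)\}=\begin{cases}0, & u=0,\\ +\infty, & u\neq 0,\end{cases} \quad\text{so}\quad \prox_{f^*}(1)=0,\\
\prox_f(1)+\prox_{f^*}(1) &= 0\neq 1.
\end{align}
In this example $\partial(f+d_1)(0)=[-1-\sqrt2,\,-1+\sqrt2]$, which contains $0$ but is not $\{0\}$, while $\partial f(0)+\partial d_1(0)=\{-1\}$. The stronger condition $x-z\in\partial f(z)$ from the first part of the answer fails here, consistent with the result proved there.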

The answer is now complete.

f10w
  • (Reposting here just in case Regev's answer is deleted) Dealing with $\mathrm{prox}_f = \mathrm{prox}_{f^{**}}$ seems to be difficult without using subgradients. But if we use subgradients, then the mild assumption stated in the update section of my answer is already very weak, and if we adopt this assumption, then obviously $\mathrm{prox}_f = \mathrm{prox}_{f^{**}}$ is a consequence of the result. I'm not sure if a weaker assumption exists... – f10w Oct 20 '20 at 17:45
  • @Dirk Mind sharing your opinion on this? – f10w Oct 20 '20 at 21:29