
Consider a standard ADMM problem:

minimize $f(x) + g(z)$ subject to $A x + B z = c$

The scaled ADMM algorithm is (from Boyd's paper):

$$ \begin{aligned} x^{k+1} &= \underset{x}{\operatorname{argmin}} \left( f(x) + \frac{\rho}{2} \left\| A x + B z^k - c + u^k \right\|_2^2 \right) \\ z^{k+1} &= \underset{z}{\operatorname{argmin}} \left( g(z) + \frac{\rho}{2} \left\| A x^{k+1} + B z - c + u^k \right\|_2^2 \right) \\ u^{k+1} &= u^k + A x^{k+1} + B z^{k+1} - c \end{aligned}$$
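For concreteness, here is a minimal NumPy sketch of these updates. The quadratic $f$, $g$ and the random $A$, $B$, $c$ below are made-up toy data (not from Boyd's paper), chosen so that both argmin steps reduce to linear solves:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up problem data: f(x) = 0.5 x'Px + q'x and g(z) = 0.5 z'Rz + s'z,
# with random A, B, c, so both argmin steps are linear solves.
n, m, p = 4, 3, 5
P, q = np.eye(n), rng.standard_normal(n)
R, s = np.eye(m), rng.standard_normal(m)
A = rng.standard_normal((p, n))
B = rng.standard_normal((p, m))
c = rng.standard_normal(p)
rho = 1.0

x, z, u = np.zeros(n), np.zeros(m), np.zeros(p)   # u is the scaled dual variable
for k in range(300):
    # x-update: argmin_x f(x) + (rho/2) ||A x + B z - c + u||^2
    x = np.linalg.solve(P + rho * A.T @ A, -q - rho * A.T @ (B @ z - c + u))
    # z-update: argmin_z g(z) + (rho/2) ||A x + B z - c + u||^2
    z = np.linalg.solve(R + rho * B.T @ B, -s - rho * B.T @ (A @ x - c + u))
    # scaled dual update
    u = u + A @ x + B @ z - c

print("primal residual ||Ax + Bz - c||_2 =", np.linalg.norm(A @ x + B @ z - c))
```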

Define $\xi^k = (x^k, z^k, u^k)$. Each ADMM iteration defines a map $\Gamma$ taking $\xi^k$ to $\xi^{k+1}$; that is, $\xi^{k+1} = \Gamma(\xi^k)$. Suppose all the sufficient conditions hold so that the sequence $\{\xi^k\}$ generated by ADMM (i.e., by applying the map $\Gamma$ repeatedly) converges to the unique optimal solution $\xi^\star$ (which is a fixed point of $\Gamma$). I suspect $\Gamma$ is a contraction in the sense that:

$$ \left\|\Gamma(\xi) - \xi^\star \right\| \leq \alpha \left\|\xi - \xi^\star \right\|$$

for any $\xi$, with $0 < \alpha < 1$.
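As a sanity check on what this inequality asks for, the sketch below treats one ADMM sweep as the map $\Gamma$ on $\xi = (x, z, u)$ for a tiny made-up one-dimensional instance ($f(x) = \tfrac12 (x-1)^2$, $g(z) = |z|$, constraint $x = z$), and prints the ratio $\|\Gamma(\xi) - \xi^\star\| / \|\xi - \xi^\star\|$ at random points. The instance and function names are illustrative only; the printed numbers prove nothing in general.

```python
import numpy as np

rho = 1.0

def soft(v, t):
    """Soft-thresholding: the prox of t * |.| at v."""
    return np.sign(v) * max(abs(v) - t, 0.0)

def Gamma(xi):
    """One scaled-ADMM sweep for f(x) = 0.5*(x-1)^2, g(z) = |z|, constraint x - z = 0."""
    x, z, u = xi
    x = (1.0 + rho * (z - u)) / (1.0 + rho)   # x-update has a closed form
    z = soft(x + u, 1.0 / rho)                # z-update is soft-thresholding
    u = u + x - z                             # scaled dual update
    return np.array([x, z, u])

# Approximate the fixed point xi_star by iterating for a long time.
xi_star = np.zeros(3)
for _ in range(5000):
    xi_star = Gamma(xi_star)

# Sample the ratio ||Gamma(xi) - xi_star|| / ||xi - xi_star|| at random points.
rng = np.random.default_rng(1)
for _ in range(5):
    xi = rng.standard_normal(3)
    ratio = np.linalg.norm(Gamma(xi) - xi_star) / np.linalg.norm(xi - xi_star)
    print(f"ratio = {ratio:.4f}")
```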

However, I couldn't find any paper with such a result. I searched for papers on linear convergence of ADMM but couldn't find any relevant results (perhaps I didn't look carefully enough).

I would appreciate any pointer to such a result (either a direct result or relevant results that can be used to show it).


1 Answer


Let's simplify a little and assume the constraint is $x - z = 0$, that is, $A = I$, $B = -I$, and $c = 0$. Also, let's assume that $x^k, z^k, u^k \in \mathbb{R}^n$ for all $k \in \mathbb{N}$. Then the ADMM iterates become \begin{align} x^{k+1} &= \arg\min_{x} \rho^{-1} f(x) + \frac12 \| x - z^k + u^k\|_2^2 = \text{prox}_{\rho^{-1} f}(z^k - u^k) \\ z^{k+1} &= \arg\min_{z} \rho^{-1} g(z) + \frac12 \| z - x^{k+1} - u^{k}\|_2^2 = \text{prox}_{\rho^{-1} g}(x^{k+1} + u^k) \\ u^{k+1} &= u^k + x^{k+1} - z^{k+1}. \end{align}

It follows by induction that the sequence $(x^k,z^k,u^k)_{k \in \mathbb{N}}$ can be written in terms of a single-variable sequence $(w^k)_{k \in \mathbb{N}}$ (one can take $w^k = z^k + u^k$) defined by $w^{k+1}=\Phi(w^k)$, where \begin{equation} \Phi = \frac12 \text{Id} + \frac12 ( 2\text{prox}_{\rho^{-1}f} -\text{Id}) \circ ( 2\text{prox}_{\rho^{-1}g} -\text{Id}). \tag{$\star$} \end{equation} In fact, it can be shown that \begin{align} \begin{split} z^k &= \text{prox}_{\rho^{-1} g} (w^k) = \alpha(w^k) \\ u^k &= (\text{Id} - \text{prox}_{\rho^{-1} g}) (w^k) = \beta(w^k) \\ x^{k+1} &= \text{prox}_{\rho^{-1} f} \circ ( 2 \text{prox}_{\rho^{-1} g} - \text{Id})(w^k) = \gamma(w^k) \end{split}\tag{$\star\star$} \end{align} for all $k \ge 1$. For a proof of this result by induction, see Lemma 6.5 of [1].

Another way of seeing this is by writing the Douglas-Rachford splitting algorithm for the inclusion problem $0 \in \partial f(x) + \partial g(x)$, which leads to ADMM. See Section 9.1 of [4], or [5], for more information.
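To make the correspondence concrete, here is a small NumPy sketch. The specific $f(x) = \tfrac12 \|x - a\|_2^2$ and $g = \lambda \|\cdot\|_1$ are illustrative choices of mine (not from [1]), picked because both proximal operators have closed form; the sketch runs plain ADMM next to the iteration $w^{k+1} = \Phi(w^k)$ from $(\star)$ and checks the identities $(\star\star)$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, rho, lam = 5, 1.0, 0.3
a = rng.standard_normal(n)

def prox_f(v):
    """Prox of (1/rho) * 0.5 * ||x - a||_2^2."""
    return (v + a / rho) / (1.0 + 1.0 / rho)

def prox_g(v):
    """Prox of (1/rho) * lam * ||.||_1, i.e. soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)

def Phi(w):
    """The map in (star): 0.5*Id + 0.5*(2 prox_f - Id) o (2 prox_g - Id)."""
    r = 2.0 * prox_g(w) - w
    return 0.5 * w + 0.5 * (2.0 * prox_f(r) - r)

# Plain ADMM for x - z = 0, c = 0; iters[j] holds (x^{j+1}, z^{j+1}, u^{j+1}).
x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
iters = []
for _ in range(20):
    x = prox_f(z - u)
    z = prox_g(x + u)
    u = u + x - z
    iters.append((x.copy(), z.copy(), u.copy()))

# Douglas-Rachford iteration started from w^1 = z^1 + u^1, checked against (star star).
w = iters[0][1] + iters[0][2]
for k in range(1, len(iters)):
    x_k, z_k, u_k = iters[k - 1]                                  # (x^k, z^k, u^k), k >= 1
    assert np.allclose(z_k, prox_g(w))                            # z^k = prox_g(w^k)
    assert np.allclose(u_k, w - prox_g(w))                        # u^k = (Id - prox_g)(w^k)
    assert np.allclose(iters[k][0], prox_f(2.0 * prox_g(w) - w))  # x^{k+1}
    w = Phi(w)

print("ADMM iterates match the Douglas-Rachford iterates via (star star).")
```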

With this construction, and using standard facts from convex analysis, we can show that $\Phi$ as defined in $(\star)$ is not necessarily a contraction, but it is a firmly nonexpansive mapping, which is a slightly weaker property. To see this, first recall that the proximal operator of a proper convex lower semicontinuous function is firmly nonexpansive [2]. It follows that the mappings \begin{align} 2\text{prox}_{\rho^{-1} f} - \text{Id} &= (1-2) \text{Id} + 2\text{prox}_{\rho^{-1} f} \\ 2\text{prox}_{\rho^{-1} g} - \text{Id} &= (1-2) \text{Id} + 2\text{prox}_{\rho^{-1} g} \end{align} are nonexpansive (see Proposition 4.25(ii) of [3]), and hence their composition $(2\text{prox}_{\rho^{-1} f} - \text{Id}) \circ (2\text{prox}_{\rho^{-1} g} - \text{Id})$ is also nonexpansive. Finally, it follows that $\Phi$ is $1/2$-averaged, which is equivalent to being firmly nonexpansive.
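Numerically, firm nonexpansiveness of $\Phi$ means $\|\Phi w_1 - \Phi w_2\|_2^2 \le \langle \Phi w_1 - \Phi w_2,\, w_1 - w_2 \rangle$ for all $w_1, w_2$. The short sketch below probes this inequality at random pairs of points, again for the same illustrative $f$ and $g$ as above (the proximal maps are repeated so the snippet is self-contained):

```python
import numpy as np

rng = np.random.default_rng(0)
n, rho, lam = 5, 1.0, 0.3
a = rng.standard_normal(n)

def prox_f(v):   # prox of (1/rho) * 0.5 * ||x - a||_2^2
    return (v + a / rho) / (1.0 + 1.0 / rho)

def prox_g(v):   # prox of (1/rho) * lam * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)

def Phi(w):      # (star): 0.5*Id + 0.5*(2 prox_f - Id) o (2 prox_g - Id)
    r = 2.0 * prox_g(w) - w
    return 0.5 * w + 0.5 * (2.0 * prox_f(r) - r)

# Check ||Phi(w1) - Phi(w2)||^2 <= <Phi(w1) - Phi(w2), w1 - w2> at random pairs.
for _ in range(1000):
    w1, w2 = rng.standard_normal(n), rng.standard_normal(n)
    d = Phi(w1) - Phi(w2)
    assert d @ d <= d @ (w1 - w2) + 1e-12

print("the firm nonexpansiveness inequality held at every sampled pair")
```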

Assuming that $\Phi$ has at least one fixed point, the Krasnoselskii-Mann theorem gives that the sequence $(w^k)_{k \in \mathbb{N}}$ converges to a fixed point of $\Phi$. Since the three mappings in $(\star\star)$ are continuous, it follows that ADMM converges.
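As an illustration of this fixed-point viewpoint (again with the same made-up $f$ and $g$ for self-containedness), the snippet below iterates $\Phi$ and prints the two qualitative signatures of a convergent averaged-operator iteration: $\|w^k - w^\star\|$ is nonincreasing (Fejér monotonicity) and $\|w^{k+1} - w^k\| \to 0$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, rho, lam = 5, 1.0, 0.3
a = rng.standard_normal(n)

def prox_f(v):   # prox of (1/rho) * 0.5 * ||x - a||_2^2
    return (v + a / rho) / (1.0 + 1.0 / rho)

def prox_g(v):   # prox of (1/rho) * lam * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)

def Phi(w):      # (star)
    r = 2.0 * prox_g(w) - w
    return 0.5 * w + 0.5 * (2.0 * prox_f(r) - r)

# Approximate a fixed point w_star, then watch the convergence signatures.
w_star = np.zeros(n)
for _ in range(10000):
    w_star = Phi(w_star)

w = 10.0 * rng.standard_normal(n)
for k in range(10):
    w_next = Phi(w)
    print(f"k={k:2d}  ||w^k - w*|| = {np.linalg.norm(w - w_star):.6f}"
          f"  ||w^(k+1) - w^k|| = {np.linalg.norm(w_next - w):.6f}")
    w = w_next
```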

  • [1] Pravin Nair, Ruturaj G. Gavaskar, Kunal N. Chaudhury, Fixed-Point and Objective Convergence of Plug-and-Play Algorithms, 2021.
  • [2] https://math.stackexchange.com/a/1900110/370982
  • [3] Heinz H. Bauschke, Patrick L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces.
  • [4] Ernest K. Ryu, Jialin Liu, Sicheng Wang, Xiaohan Chen, Zhangyang Wang, Wotao Yin, Plug-and-Play Methods Provably Converge with Properly Trained Denoisers.
  • [5] Jonathan Eckstein, Splitting Methods for Monotone Operators with Applications to Parallel Optimization, PhD thesis, Massachusetts Institute of Technology, 1989.