Minimizing the difference of convex functions in MIP

Question

I am working on the following mixed-integer program (MIP):

$\min_{\mathbf{x}, \mathbf{y}} \ f(x_{1}) - f(x_{2}) + c_{x}^{\top}\mathbf{x} + c_{y}^{\top}\mathbf{y} \\ \text{s.t.} \\ A\mathbf{x} + B\mathbf{y} \geq b ,\\ 1 \geq x_{1} \geq x_{2} \geq 0,\\ \mathbf{x} = [x_{1}, x_{2}, ..., x_{n}]^{\top} \in \mathbb{R}^{n}, \mathbf{y} \in \{0,1\}^{m}$

where $f(z) = -(1-z)^{0.5}$ is a convex function. However, a quick Hessian check reveals that $f(x_{1}) - f(x_{2})$ is not convex, which is not good news.

My question is, how to efficiently solve this problem (perhaps casting this as an equivalent/approximate MILP)? Is there any way to exploit the "special" structure of $f(x_{1}) - f(x_{2})$ ?

My attempts:

I employed piecewise-linear approximation for $f(x_{1})$ and $f(x_{2})$ using SOS-2 constraints (using the modeling approach given here). Unfortunately, the resultant MILP turns out to be too complex. The optimality gap never goes below 2-3% even after several hours. In particular, the lower bound does not improve which could be an indication of weak formulation.
For $f(x_{1})$, I used an approximate representation $f(x_{1}) = \max_{i} \{a_{i}x_{1} + b_{i}\}$, and cast the problem as:

$\min_{\mathbf{x}, \mathbf{y},t} \ t - f(x_{2}) + c_{x}^{\top}\mathbf{x} + c_{y}^{\top}\mathbf{y} \\ \text{s.t.} \\ A\mathbf{x} + B\mathbf{y} \geq b ,\\ 1 \geq x_{1} \geq x_{2} \geq 0,\\ t \geq a_{i}x_{1} + b_{i}, \forall i\\ \mathbf{x} = [x_{1}, x_{2}, ..., x_{n}]^{\top} \in \mathbb{R}^{n}, \mathbf{y} \in \{0,1\}^{m}, t\in\mathbb{R}$

Then, I employed the convex-concave procedure (CCP), explained here [Alg. 1.1]. With CCP, each iteration is a MILP and the algorithm converges after 3-4 iterations (each iteration takes around 900 secs). The performance is much better than the 1st attempt. However, I would really wish for a faster approach (or, at least, and approach that does not work iteratively). CCP is for $f(\mathbf{x}) - g(\mathbf{x})$, where $f$ and $g$ are convex. Mine is a very special case which essentially is something like $f(e_{1}^{\top}\mathbf{x}) - f(e_{2}^{\top}\mathbf{x})$, where $e_{k}$ is the standard unit vector of direction $k$. Given this, could there be any improvement?

Some side notes (I don't know if they matter or not):

Size of the problem: $ n \geq $ 10k and $ m \geq $ 25k, number of rows $ \geq $ 40k.
I use YALMIP with GUROBI solver. Without the troublesome part $f(x_{1}) - f(x_{2})$ of the objective, the MILP is solved to optimality within few seconds.

Hmm... What in your setup prevents $x_1$ from approaching $1$ when $x_2$ stays separated from it thus sending the minimum to $-\infty$? The best strategy to use may depend upon the answer to this question. — fedja, Aug 04 '22 at 02:13
@fedja hi, sorry, the function should read $f(z) = -(1-z)^{0.5}$. So $f(x_{1}) - f(x_{2})$ is bounded between 0 and 1. I rectified the mistake. — cbs, Aug 04 '22 at 03:56
OK, thanks for the correction. One thing still strikes me as incongruous: you say that without $f(x_1)-f(x_2)$ the problem is solved "within few seconds" but with adding rather few (compared to the size of the main matrix) constraints of the type $t\ge a_ix_1+b_i$ the solving time increases to 900 seconds (100-fold). Is it really so? — fedja, Aug 07 '22 at 13:37
@fedja Well... Yes, the durations are really correct. But the actual problem is slightly more complicated than I have indicated. The convex-concave objective is actually of the form $\sum_{j=1}^{N} f(x_{1,j}) -f(x_{2,j})$, where $x_{1,j} , x_{2,j} \in \mathbb{R}$, $1 \geq x_{1,j} \geq x_{2,j} \geq 0$, and $N$ is around 400. So, there are a great number of constraints of the form $t_j \geq a_i x_{1,j} + b_i$ (around 2k). — cbs, Aug 08 '22 at 15:57
I see. Then I'm curious what will happen if you add the constraints $t_j\ge a_ix_{1,j}+b_i$ one by one instead of throwing them in immediately every time using the linearization at the point $x_{1,j}$ produced by the previous iteration and keeping the previous ones. — fedja, Aug 08 '22 at 16:32

Minimizing the difference of convex functions in MIP

0 Answers0