I am working on the following mixed-integer program (MIP):
$\min_{\mathbf{x}, \mathbf{y}} \ f(x_{1}) - f(x_{2}) + c_{x}^{\top}\mathbf{x} + c_{y}^{\top}\mathbf{y} \\ \text{s.t.} \\ A\mathbf{x} + B\mathbf{y} \geq b ,\\ 1 \geq x_{1} \geq x_{2} \geq 0,\\ \mathbf{x} = [x_{1}, x_{2}, ..., x_{n}]^{\top} \in \mathbb{R}^{n}, \mathbf{y} \in \{0,1\}^{m}$
where $f(z) = -(1-z)^{0.5}$ is a convex function. However, a quick Hessian check reveals that $f(x_{1}) - f(x_{2})$ is not convex, which is not good news.
My question is, how to efficiently solve this problem (perhaps casting this as an equivalent/approximate MILP)? Is there any way to exploit the "special" structure of $f(x_{1}) - f(x_{2})$ ?
My attempts:
I employed piecewise-linear approximation for $f(x_{1})$ and $f(x_{2})$ using SOS-2 constraints (using the modeling approach given here). Unfortunately, the resultant MILP turns out to be too complex. The optimality gap never goes below 2-3% even after several hours. In particular, the lower bound does not improve which could be an indication of weak formulation.
For $f(x_{1})$, I used an approximate representation $f(x_{1}) = \max_{i} \{a_{i}x_{1} + b_{i}\}$, and cast the problem as:
$\min_{\mathbf{x}, \mathbf{y},t} \ t - f(x_{2}) + c_{x}^{\top}\mathbf{x} + c_{y}^{\top}\mathbf{y} \\ \text{s.t.} \\ A\mathbf{x} + B\mathbf{y} \geq b ,\\ 1 \geq x_{1} \geq x_{2} \geq 0,\\ t \geq a_{i}x_{1} + b_{i}, \forall i\\ \mathbf{x} = [x_{1}, x_{2}, ..., x_{n}]^{\top} \in \mathbb{R}^{n}, \mathbf{y} \in \{0,1\}^{m}, t\in\mathbb{R}$
Then, I employed the convex-concave procedure (CCP), explained here [Alg. 1.1]. With CCP, each iteration is a MILP and the algorithm converges after 3-4 iterations (each iteration takes around 900 secs). The performance is much better than the 1st attempt. However, I would really wish for a faster approach (or, at least, and approach that does not work iteratively). CCP is for $f(\mathbf{x}) - g(\mathbf{x})$, where $f$ and $g$ are convex. Mine is a very special case which essentially is something like $f(e_{1}^{\top}\mathbf{x}) - f(e_{2}^{\top}\mathbf{x})$, where $e_{k}$ is the standard unit vector of direction $k$. Given this, could there be any improvement?
Some side notes (I don't know if they matter or not):
- Size of the problem: $ n \geq $ 10k and $ m \geq $ 25k, number of rows $ \geq $ 40k.
- I use YALMIP with GUROBI solver. Without the troublesome part $f(x_{1}) - f(x_{2})$ of the objective, the MILP is solved to optimality within few seconds.