
I am trying to prove that the following binary quadratic integer programming problem is NP-hard.

$$ \min \frac{\sum\limits_{i=1}^m(u_i-\bar u)^2}{m}\text{ , where }u=Q x,Q\in\mathbb{R}^{m\times n}\\ s.t. \begin{cases} Ax\leq b, \ A\in\mathbb{R}^{m\times n},b\in\mathbb{R}^m \\Cx=d, \ C\in \mathbb{R}^{m\times n},d\in\mathbb{R}^m\\ \forall i: x_{i}\in\{0,1\}\end{cases} $$

  • x is an n-dimensional vector of decision variables.

  • u is an m-dimensional vector, and $\bar u = \frac{\sum\limits_{i=1}^mu_i}{m}$.

  • A, C, Q are constant coefficient matrices.

  • b, d are constant coefficient vectors.

I guess the knapsack problem could be reduced to it. But because of the complexity of the objective function, I haven't figured out how to relate it to the knapsack problem.

OvinaSun

1 Answer


This problem is NP-hard. To be precise, the question currently does not specify what should happen when no $x$ satisfies all constraints. Determining whether a feasible $x$ exists is already NP-hard, as the feasibility problem for 0-1 integer linear programs is a special case of this problem. But I think you're more interested in the optimization part.

So, suppose $A,b,C,d$ are chosen such that there is always an $x$ that satisfies the constraints. Then the problem is still NP-hard. Let $\mathsf{IS'}$ be the problem of finding a maximum independent set in graphs that have an independent set containing at least half of the nodes. For every graph $G$ with $n$ vertices, we can create a graph $G'$ that is a valid input for $\mathsf{IS'}$ by adding $n$ isolated vertices to $G$: the added vertices alone form an independent set of size $n$ in the $2n$-vertex graph $G'$, and every maximum independent set of $G'$ consists of a maximum independent set of $G$ plus all added vertices. With this observation, it is easy to show that $\mathsf{IS'}$ is NP-hard via a reduction from the ordinary independent set problem.
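To make the padding step concrete, here is a minimal Python sketch. The function names and the representation of a graph as a vertex count plus an edge list are my own choices for illustration, and the brute-force routine is only meant for checking tiny examples.

```python
def pad_with_isolated_vertices(n, edges):
    """Return (n', edges') for G' = G plus n isolated vertices.

    Vertices are assumed to be numbered 0..n-1; the new vertices
    n..2n-1 have no incident edges, so the edge list is unchanged.
    """
    return 2 * n, list(edges)


def max_independent_set_size(n, edges):
    """Brute-force the size of a maximum independent set (tiny graphs only)."""
    best = 0
    for mask in range(1 << n):
        # mask encodes a vertex subset; reject it if some edge has both ends inside.
        if all(not (((mask >> i) & 1) and ((mask >> j) & 1)) for i, j in edges):
            best = max(best, bin(mask).count("1"))
    return best
```

For a triangle, the padded graph has 6 vertices, and its maximum independent set is one triangle vertex plus the three added vertices, which is at least half of the nodes.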

We can now encode $\mathsf{IS'}$ as an instance of this optimization program, which shows the problem is NP-hard. For simplicity, I'm assuming $A$ may have more rows than $Q$. We can obtain these extra rows by padding $x$ with extra values of which we can fix the sum using $C$.

Given a graph $G$ of $n$ vertices, let $x$ be a vector where the indices $i$ correspond to vertices $v_i$ in $G$.

For each edge $(v_i,v_j)$ in $G$, add the constraint $x_i+x_j\leq 1$. Also, add the constraint $\sum_{i=1}^n x_i \geq n/2$, which in the form $Ax\leq b$ reads $-\sum_{i=1}^n x_i \leq -n/2$. Let $Q$ be the identity matrix; $C$ and $d$ can be set to $0$. Note that the definition of $\mathsf{IS}'$ guarantees that there exists an $x$ that satisfies these constraints.
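As an illustration of this encoding, the following sketch builds the rows of $A$ and the entries of $b$ from an edge list. The helper names `build_constraints` and `is_feasible` are hypothetical, vertices are assumed to be numbered from 0, and the lower bound on the sum is written as a $\leq$ row by negating both sides.

```python
def build_constraints(n, edges):
    """Return (A, b) as plain lists so that A x <= b encodes the reduction."""
    A, b = [], []
    for i, j in edges:
        row = [0] * n
        row[i] = 1
        row[j] = 1
        A.append(row)   # x_i + x_j <= 1
        b.append(1)
    A.append([-1] * n)  # -(x_1 + ... + x_n) <= -n/2
    b.append(-n / 2)
    return A, b


def is_feasible(A, b, x):
    """Check A x <= b row by row for a 0/1 vector x."""
    return all(
        sum(a * xi for a, xi in zip(row, x)) <= bi
        for row, bi in zip(A, b)
    )
```

With `n = 4` and the single edge `(0, 1)`, the vector `[1, 0, 1, 1]` is feasible, while `[1, 1, 1, 1]` violates the edge constraint and `[1, 0, 0, 0]` violates the lower bound on the sum.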

Due to the edge constraints, the set of vertices $v_i$ with $x_i=1$ is an independent set in $G$. Due to the second constraint, at least half of the entries of $x$ are set to $1$. The objective function calculates the variance of $Qx=x$. If $k$ entries of $x$ are $1$, this variance is $\frac{k}{n}\left(1-\frac{k}{n}\right)$, which is strictly decreasing in $k$ for $k\geq n/2$. So, an $x$ that minimizes the objective function corresponds to a maximum independent set in $G$.
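The correspondence can be sanity-checked on small graphs by enumerating all 0/1 vectors. This is only an exponential-time sketch with names of my own choosing (`variance`, `solve_by_enumeration`), assuming $Q$ is the identity so that $u=x$; it is not an algorithm for the problem itself.

```python
from itertools import product


def variance(u):
    """Population variance, matching the objective sum((u_i - mean)^2) / m."""
    mean = sum(u) / len(u)
    return sum((ui - mean) ** 2 for ui in u) / len(u)


def solve_by_enumeration(n, edges):
    """Minimize the variance of x over all feasible 0/1 vectors (tiny n only)."""
    best_x, best_val = None, None
    for x in product((0, 1), repeat=n):
        if any(x[i] + x[j] > 1 for i, j in edges):
            continue  # violates an edge constraint
        if 2 * sum(x) < n:
            continue  # fewer than half of the entries are 1
        val = variance(x)
        if best_val is None or val < best_val:
            best_x, best_val = x, val
    return best_x, best_val
```

For a triangle on vertices 0, 1, 2 padded with isolated vertices 3, 4, 5, the minimizer sets four entries to 1 (one triangle vertex and all three isolated vertices), which is a maximum independent set of the padded graph.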

Discrete lizard