Relaxation of $\min_{H} \text{tr}(H^T P H)$

Question

Let $P \in \mathbb{R}^{N \times N}$ be a given symmetric matrix. Specifically, $P$ has all zero entries on its diagonal, and all its off-diagonal entries are positive. I want to solve $$\begin{equation} \begin{aligned} \min_{H \in \{0,1\}^{N \times K} } \quad & {\frac{1}{2} \text{tr}(H^T PH)} \\ \textrm{subject to} \quad & H_{ij} \in \{0,1\}, \text{tr}(H^T H) = N,\\ & H^T H \text{ is a diagonal matrix}. \\ \end{aligned} \end{equation}$$ Here $H$ is an $N \times K$ matrix with $K<N$, and both $K$ and $N$ are known positive integers. The last constraint is equivalent to requiring the columns of $H$ to be pairwise orthogonal (but not necessarily orthonormal).
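To make the problem statement concrete, here is a minimal sketch in plain Python. It assumes that a feasible $H$ is the indicator matrix of a partition of the $N$ items into $K$ subsets, which is how I read the three constraints together. The function names and the small matrix `P` are hypothetical and only for illustration. The sketch also checks that $\frac{1}{2}\text{tr}(H^T P H)$ equals the total weight of pairs that share a subset.

```python
from itertools import combinations

def indicator_matrix(labels, K):
    """Build the N x K 0/1 matrix H with H[i][k] = 1 iff item i is in subset k."""
    return [[1 if labels[i] == k else 0 for k in range(K)]
            for i in range(len(labels))]

def is_feasible(H):
    """Check H_ij in {0,1}, tr(H^T H) = N, and H^T H diagonal."""
    N, K = len(H), len(H[0])
    binary = all(h in (0, 1) for row in H for h in row)
    # tr(H^T H) is the sum of squared entries of H
    trace_ok = sum(h * h for row in H for h in row) == N
    # off-diagonal entries of H^T H are inner products of distinct columns
    orthogonal = all(
        sum(H[i][a] * H[i][b] for i in range(N)) == 0
        for a, b in combinations(range(K), 2)
    )
    return binary and trace_ok and orthogonal

def objective(P, H):
    """Return 0.5 * tr(H^T P H)."""
    N, K = len(H), len(H[0])
    total = 0
    for k in range(K):
        for i in range(N):
            for j in range(N):
                total += H[i][k] * P[i][j] * H[j][k]
    return 0.5 * total

def within_subset_weight(P, labels):
    """Sum of P[i][j] over unordered pairs i < j lying in the same subset."""
    return sum(P[i][j]
               for i, j in combinations(range(len(labels)), 2)
               if labels[i] == labels[j])

# Hypothetical example: N = 4, K = 2, symmetric P with zero diagonal.
P = [[0, 1, 2, 3],
     [1, 0, 4, 5],
     [2, 4, 0, 6],
     [3, 5, 6, 0]]
labels = [0, 0, 1, 1]
H = indicator_matrix(labels, K=2)
print(is_feasible(H), objective(P, H), within_subset_weight(P, labels))
```

Under this reading, minimizing the objective amounts to choosing the partition with the smallest total within-subset weight.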

Some of my ideas:

The objective resembles the Rayleigh-Ritz problem. However, Rayleigh-Ritz requires $H^T H=I.$
Rewrite the objective as $\text{tr}(H^T P H)=\text{tr}(P HH^T)= P\bullet (HH^T) = P \bullet V$, where we define $V = HH^T$. This is an inner product of $P$ and $HH^T$, where the inner product of symmetric matrices $A,B$ is defined as $A \bullet B = \text{tr}(A B)$. Thus the objective is a linear function of $V$. So I tried to convert the original problem into a convex problem:

The constraints on $H$ imply $V_{ii}=1$, because each row of $H$ contains exactly one $1$. This is convex in $V$. It also gives $\text{tr}(H^T H) = N$ $\iff \text{tr}(V)=N = \| V \|_*$, where $\| V \|_*$ is the nuclear norm of $V$ (equal to the trace because $V$ is PSD). This is convex in $V$.
Since $V$ plays the role of $HH^T$, $V$ is PSD. This is convex in $V$. This constraint also ensures that all principal submatrices of $V$ are PSD, which excludes the case where there exist distinct nodes $i,j,k \in [N]$ such that $V_{ij}=1, V_{jk}=1$ but $V_{ik}=0$.
The constraints imply $V_{ij} \in \{0,1\}$. This is not convex in $V$. I tried relaxing it to $V_{ij} \in [0,1].$
$H \in \mathbb{R}^{N \times K} \implies \text{rank}(V) \leq K$. This is not convex in $V$. Note that the usual nuclear norm penalty is not applicable here, since $\|V \|_*=N$ always.
Note that $\boldsymbol{1}^T V \boldsymbol{1} = \boldsymbol{1}^T HH^T \boldsymbol{1} = C_1^2 + \dots +C_K^2 \geq \frac{(C_1+\dots+C_K)^2}{K}=\frac{N^2}{K}$, where $C_k$ denotes the number of nodes belonging to the $k$-th subset. Since the left-hand side is an integer, this gives $\boldsymbol{1}^T V \boldsymbol{1} \geq \lceil \frac{N^2}{K} \rceil$, which is a linear (hence convex) constraint in $V$. It implies that the largest cluster size is $\geq \frac{N}{K}$, which is the pigeonhole principle.
Extending this rationale, among any $K+1$ nodes at least two must share a subset, so for every $(K+1) \times (K+1)$ principal submatrix of $V$, denoted $V_{\mathcal{U}},$ the sum of all its entries is $\geq K+3$ (the $K+1$ diagonal ones plus at least one symmetric pair of off-diagonal ones). In symbols, $\forall \mathcal{U} \subset [N]$ with $ |\mathcal{U}|=K+1, $ we require $\boldsymbol{1}^T V_{\mathcal{U}} \boldsymbol{1} \geq K+3.$
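The necessary conditions listed above can be sanity-checked numerically on an integral solution. The following is a small sketch in plain Python, assuming $V = HH^T$ comes from a partition encoded as a list of labels. The helper names and the sample partition are hypothetical. It only confirms that a true partition satisfies each listed condition, and says nothing about how tight the relaxation is.

```python
from itertools import combinations

def gram_from_labels(labels):
    """V = H H^T for a partition: V[i][j] = 1 iff i and j share a subset."""
    n = len(labels)
    return [[1 if labels[i] == labels[j] else 0 for j in range(n)]
            for i in range(n)]

def check_constraints(labels, K):
    """Evaluate the listed necessary conditions on V for a given partition."""
    N = len(labels)
    V = gram_from_labels(labels)
    sizes = [labels.count(k) for k in range(K)]   # C_1, ..., C_K
    total = sum(sum(row) for row in V)            # 1^T V 1
    return {
        "unit_diagonal": all(V[i][i] == 1 for i in range(N)),
        "trace_is_N": sum(V[i][i] for i in range(N)) == N,
        "sum_equals_sizes_squared": total == sum(c * c for c in sizes),
        # 1^T V 1 >= N^2 / K, written without division
        "sum_lower_bound": K * total >= N * N,
        # every (K+1)-subset U has 1^T V_U 1 >= K + 3
        "pigeonhole": all(
            sum(V[i][j] for i in U for j in U) >= K + 3
            for U in combinations(range(N), K + 1)
        ),
    }

# Hypothetical partition with N = 7 nodes and K = 3 subsets.
labels = [0, 1, 2, 0, 1, 0, 2]
result = check_constraints(labels, K=3)
print(result)
```

Note that the pigeonhole family contains $\binom{N}{K+1}$ constraints, so enumerating all of them is only practical for small instances.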

I do not know whether more convex constraints can be added so as to eventually arrive at a convex (relaxed, yet still suitable) problem.
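For reference, collecting the convex conditions listed above and dropping the non-convex rank constraint would give a program of roughly the following form. This is my own assembly of the pieces rather than an established formulation, with $\mathbb{S}^N$ denoting the symmetric $N \times N$ matrices: $$\begin{aligned} \min_{V \in \mathbb{S}^{N}} \quad & \tfrac{1}{2}\, P \bullet V \\ \text{subject to} \quad & V \succeq 0, \quad V_{ii} = 1 \ \ \forall i \in [N], \\ & 0 \leq V_{ij} \leq 1 \ \ \forall i \neq j, \\ & \boldsymbol{1}^T V \boldsymbol{1} \geq \left\lceil \tfrac{N^2}{K} \right\rceil, \\ & \boldsymbol{1}^T V_{\mathcal{U}} \boldsymbol{1} \geq K+3 \ \ \forall\, \mathcal{U} \subset [N],\ |\mathcal{U}| = K+1. \end{aligned}$$ The objective and all constraints except $V \succeq 0$ are linear in $V$, so this is a semidefinite program. The condition $\text{tr}(V)=N$ is omitted because it follows from $V_{ii}=1$.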

If you let $H$ be a permutation matrix, your problem is the same as the Quadratic Assignment Problem (https://en.wikipedia.org/wiki/Quadratic_assignment_problem). According to Wikipedia: "The problem is NP-hard, so there is no known algorithm for solving this problem in polynomial time, and even small instances may require long computation time. It was also proven that the problem does not have an approximation algorithm running in polynomial time for any (constant) factor, unless P = NP". — obareey, Apr 06 '24 at 13:48
@obareey Here I need $H$ to be non-square. Does it still fit the quadratic assignment problem setting? — SouthChinaSeaPupil, Apr 06 '24 at 14:18
I believe so, because you are just reducing the problem size while the core of the problem remains. I thought about your problem again, and it might be slightly less difficult than QAP because of the restrictions on $H$, but the two problems are definitely related. — obareey, Apr 06 '24 at 15:23
@南洋小學生 Ok, have you tried any numerical simulations? The problem is generally hard due to the binary nature of $H$. — GBmath, Aug 26 '24 at 15:46

SouthChinaSeaPupil · Answer 1 · 2024-08-21T08:22:38.623

It may not be entirely appropriate to answer one's own question, but I think continuing to add to the list in the question body would make the question too long.

Consider any $3 \times 3$ principal submatrix $V_{\{i,j,k\}}$. If we remove the ones on the diagonal, then in the ideal case where all entries are in $\{0,1\}$, what remains is the adjacency matrix of a graph on $3$ nodes. There are four simple graphs on $3$ nodes up to isomorphism: $P_2$ (a path of length 2), $ K_3$ (a triangle), $K_1 \cup K_2$ (an isolated node plus an edge), and $K_1 \cup K_1 \cup K_1$ (three isolated nodes). In $V=HH^T$, every $3 \times 3$ principal submatrix can correspond to any of the latter three, but not to $P_2$.

That is, we should allow $V_{\{i,j,k\}}-I_3$ to be one of the following: $$\mathcal{S} := \Big\{ \underbrace{\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0\end{bmatrix}}_{K_1 \cup K_1 \cup K_1}, \underbrace{\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0\end{bmatrix}, \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0\end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0\end{bmatrix}}_{K_1 \cup K_2}, \underbrace{\begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0\end{bmatrix}}_{K_3} \Big \}.$$ Every element of $\mathcal{S}$ must remain feasible. Hence, in the context of relaxation, we consider the convex hull of $\mathcal{S}$, which is the smallest convex set containing these feasible matrices.

Namely, for all pairwise distinct $i,j,k \in [N]$, we require $V_{\{i,j,k\}}$ to satisfy $$V_{\{i,j,k\}} - I_3 \in \text{CvxHull}(\mathcal{S}).$$ Since $V_{\{i,j,k\}}$ is symmetric and its diagonal is fixed, controlling its strictly upper-triangular part is sufficient (and necessary). Hence we simplify things to $$[V_{\{i,j,k\}}(1,2),V_{\{i,j,k\}}(1,3),V_{\{i,j,k\}}(2,3)] \in \text{CvxHull}(\{[0,0,0],[1,0,0],[0,1,0],[0,0,1],[1,1,1]\}).$$
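This hull can be written with explicit linear inequalities. The five points form a triangular bipyramid (the triangle $[1,0,0],[0,1,0],[0,0,1]$ with apexes $[0,0,0]$ and $[1,1,1]$). By my computation of its six facets, writing $x = V_{ij}$, $y = V_{ik}$, $z = V_{jk}$, the hull is $\{x,y,z \geq 0,\ x+y-z \leq 1,\ x+z-y \leq 1,\ y+z-x \leq 1\}$. The upper bounds $x,y,z \leq 1$ follow by adding pairs of the last three inequalities. The short Python sketch below (the function name `in_hull` is hypothetical) checks that, among the eight binary triples, these inequalities accept exactly the five points of $\mathcal{S}$ and reject the three labelings of $P_2$.

```python
from itertools import product

def in_hull(x, y, z, tol=1e-12):
    """Test the six facet inequalities of the triangular bipyramid."""
    return (
        min(x, y, z) >= -tol
        and x + y - z <= 1 + tol
        and x + z - y <= 1 + tol
        and y + z - x <= 1 + tol
    )

# The five allowed upper-triangular patterns (x, y, z) = (V_ij, V_ik, V_jk).
S = {(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)}

# Binary triples accepted by the inequalities.
accepted = {p for p in product((0, 1), repeat=3) if in_hull(*p)}
rejected = set(product((0, 1), repeat=3)) - accepted

print(sorted(accepted))
print(sorted(rejected))  # the three P_2 patterns
```

In terms of $V$, each triple $\{i,j,k\}$ then contributes the three constraints $V_{ij} + V_{jk} - V_{ik} \leq 1$ (and its two permutations) together with nonnegativity, all of which are linear in $V$.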
