Let $P \in \mathbb{R}^{N \times N}$ be a given symmetric matrix. Specifically, $P$ has zeros on its diagonal, and all of its off-diagonal entries are positive. I want to solve $$\begin{equation} \begin{aligned} \min_{H \in \{0,1\}^{N \times K} } \quad & {\frac{1}{2} \text{tr}(H^T PH)} \\ \textrm{subject to} \quad & H_{ij} \in \{0,1\}, \text{tr}(H^T H) = N,\\ & H^T H \text{ is a diagonal matrix}. \\ \end{aligned} \end{equation}$$ where $H \in \mathbb{R}^{N \times K}$, $K<N$, and both $K,N$ are known positive integers. The last constraint is equivalent to requiring the columns of $H$ to be pairwise orthogonal (but not necessarily orthonormal). Together with the other two constraints, this forces every row of $H$ to contain exactly one $1$, so $H$ assigns each of the $N$ nodes to exactly one of $K$ subsets.
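To make the problem concrete, here is a minimal brute-force sketch for tiny instances. It assumes NumPy is available; the names `assignment_matrix` and `brute_force` and the random test matrix are purely illustrative, and empty subsets are allowed because the constraints as stated do not forbid them. It is only meant as a reference for checking relaxations on small $N$, since it enumerates all $K^N$ assignments.

```python
import itertools
import numpy as np

def assignment_matrix(labels, K):
    """Build the binary N x K matrix H with H[i, labels[i]] = 1."""
    labels = np.asarray(labels)
    H = np.zeros((labels.size, K))
    H[np.arange(labels.size), labels] = 1.0
    return H

def brute_force(P, K):
    """Enumerate every label vector in {0..K-1}^N and keep the best one."""
    N = P.shape[0]
    best_val, best_labels = np.inf, None
    for labels in itertools.product(range(K), repeat=N):
        H = assignment_matrix(labels, K)
        val = 0.5 * np.trace(H.T @ P @ H)  # the objective from the question
        if val < best_val:
            best_val, best_labels = val, labels
    return best_val, best_labels

# Illustrative random instance: symmetric, zero diagonal, positive off-diagonal.
rng = np.random.default_rng(0)
N, K = 6, 3
A = rng.uniform(0.1, 1.0, size=(N, N))
P = (A + A.T) / 2
np.fill_diagonal(P, 0.0)

best_val, best_labels = brute_force(P, K)
print(best_val, best_labels)
```

Since the diagonal of $P$ is zero, the objective value of an assignment equals the sum of $P_{ij}$ over all pairs $i<j$ placed in the same subset.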
Some of my ideas:
- The objective looks like a Rayleigh-Ritz problem. However, Rayleigh-Ritz requires $H^T H=I$, whereas here $H^T H$ is only required to be diagonal.
- Rewrite the objective as $\text{tr}(H^T P H)=\text{tr}(P HH^T)= P\bullet (HH^T) = P \bullet V$, where we define $V = HH^T$ and the inner product of symmetric matrices $A,B$ is $A \bullet B = \text{tr}(A B)$. Thus the objective is a linear function of $V$, so I tried to convert the original problem into a convex problem in $V$:
The constraints on $H$ imply $V_{ii}=1$ for all $i$. This is convex in $V$. It also gives $\text{tr}(V) = \text{tr}(H^T H) = N$, and since $V = HH^T$ is PSD, $\text{tr}(V) = \| V \|_*$, the nuclear norm of $V$. This is convex in $V$.
Since $V$ plays the role of $HH^T$, $V$ is PSD. This is convex in $V$. This constraint also ensures that all principal submatrices of $V$ are PSD, which excludes the case where there exist distinct nodes $i,j,k \in [N]$ such that $V_{ij}=1, V_{jk}=1$ but $V_{ik}=0$ (with $V_{ii}=V_{jj}=V_{kk}=1$, the corresponding $3 \times 3$ principal submatrix has determinant $-1$).
The constraints on $H$ imply $V_{ij} \in \{0,1\}$. This is not convex in $V$, so I tried relaxing it to $V_{ij} \in [0,1].$
$H \in \mathbb{R}^{N \times K} \implies \text{rank}(V) \leq K$. This is not convex in $V$. Note that the usual nuclear norm penalty is not applicable here, since $\|V \|_*=N$ always.
Note that $\boldsymbol{1}^T V \boldsymbol{1} = \boldsymbol{1}^T HH^T \boldsymbol{1} = C_1^2 + \dots + C_K^2 \geq \frac{(C_1+\dots+C_K)^2}{K}=\frac{N^2}{K}$, where $C_k$ denotes the number of nodes belonging to the $k$-th subset. Since the left-hand side is an integer, this gives $\boldsymbol{1}^T V \boldsymbol{1} \geq \lceil \frac{N^2}{K} \rceil$, which is a linear (hence convex) constraint in $V$. It also implies that the largest cluster size is $\geq \frac{N}{K}$, which is the pigeonhole principle.
Extending this rationale, for every $(K+1) \times (K+1)$ principal submatrix of $V$, denoted $V_{\mathcal{U}},$ the sum of all its entries must be $\geq K+3$: among any $K+1$ nodes, at least two belong to the same subset, which adds two off-diagonal ones to the $K+1$ ones on the diagonal. In symbols, $\forall \mathcal{U} \subset [N]$ with $|\mathcal{U}|=K+1,$ we require $\boldsymbol{1}^T V_{\mathcal{U}} \boldsymbol{1} \geq K+3.$
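The properties of $V$ listed above can be sanity-checked numerically. The following is a small sketch, assuming NumPy; the random assignment, the sizes `N = 7`, `K = 3`, and the dictionary `checks` are illustrative choices, not part of the problem. It builds $V = HH^T$ from a random assignment, tests each property, and also confirms that the "non-transitive" $3 \times 3$ pattern is not PSD.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
N, K = 7, 3

# Random assignment of N nodes to K subsets (some subsets may be empty).
labels = rng.integers(0, K, size=N)
H = np.zeros((N, K))
H[np.arange(N), labels] = 1.0
V = H @ H.T

sizes = H.sum(axis=0)            # C_1, ..., C_K
ceil_bound = (N * N + K - 1) // K  # integer form of ceil(N^2 / K)

checks = {
    "unit_diagonal": np.allclose(np.diag(V), 1.0),
    "trace_equals_N": np.isclose(np.trace(V), N),
    "psd": np.linalg.eigvalsh(V).min() > -1e-9,
    "nuclear_norm_equals_N": np.isclose(np.linalg.norm(V, "nuc"), N),
    "rank_at_most_K": np.linalg.matrix_rank(V) <= K,
    "sum_is_sum_of_squares": np.isclose(V.sum(), (sizes ** 2).sum()),
    "sum_at_least_ceil": V.sum() >= ceil_bound,
    # Every (K+1)-node principal submatrix has entry sum >= K + 3.
    "subset_sums": all(
        V[np.ix_(U, U)].sum() >= K + 3
        for U in itertools.combinations(range(N), K + 1)
    ),
}
print(checks)

# Pattern V_ij = V_jk = 1 but V_ik = 0 with unit diagonal: not PSD.
B = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]], dtype=float)
min_eig_B = np.linalg.eigvalsh(B).min()  # equals 1 - sqrt(2), about -0.414
print(min_eig_B)
```

Note that the subset check enumerates $\binom{N}{K+1}$ index sets, so it is only practical for small $N$.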
I do not know whether more convex constraints can be added so that we eventually arrive at a convex (relaxed, yet still suitable) problem.
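For reference, collecting the items above gives the following candidate relaxation over symmetric $N \times N$ matrices $V$. This is only a sketch under one assumption of mine: the non-convex rank constraint is simply dropped. The last family contains $\binom{N}{K+1}$ constraints, so it may need to be restricted to a subset in practice.

$$\begin{equation} \begin{aligned} \min_{V = V^T \in \mathbb{R}^{N \times N}} \quad & \frac{1}{2} P \bullet V \\ \textrm{subject to} \quad & V \succeq 0, \quad V_{ii} = 1 \ \ \forall i, \quad V_{ij} \in [0,1] \ \ \forall i \neq j, \\ & \boldsymbol{1}^T V \boldsymbol{1} \geq \left\lceil \frac{N^2}{K} \right\rceil, \\ & \boldsymbol{1}^T V_{\mathcal{U}} \boldsymbol{1} \geq K+3 \quad \forall\, \mathcal{U} \subset [N],\ |\mathcal{U}| = K+1. \end{aligned} \end{equation}$$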