
I'd like to solve the following minimization problem

$$\min_{X_1,X_2} \mbox{nzc} (A+B_1X_1+B_2X_2)$$

where $\mbox{nzc} (D)$ denotes the number of non-zero columns of $D$, and where $X_i, A, B_i$ are matrices of appropriately chosen dimensions. Note that $\mbox{nzc} (D)$ is not a norm; rather, it is somewhat similar to the $\ell_0$ "norm" which one can read about here, which is also not a norm.

How should I approach this? Is there a standard technique for solving optimization problems of this form?

Mathew
    A zero column would have zero 2-norm, and, hence, the Gramian gets a zero on the main diagonal. Can you do something with this? Unfortunately, I don't have much time to think about it at the moment. – Rodrigo de Azevedo Oct 31 '22 at 19:05
  • Where is $A$ in the minimization problem? What do you know about $B_1$ and $B_2$? – Idontgetit Oct 31 '22 at 20:50
  • Regarding the Gramian, I agree one can reformulate this problem as one about the number of zeros on the Gramian's diagonal. From there we are doing an $\ell_0$ optimization over a quadratic function; however, that still leaves me wanting to know how to do $\ell_0$ optimization. – Mathew Oct 31 '22 at 20:55
  • Ideally I'd like to handle arbitrary matrices; realistically, however, my matrices are at most 10x10 with integer entries. – Mathew Nov 01 '22 at 01:59
  • A zero column would also have a zero 1-norm. – Rodrigo de Azevedo Nov 01 '22 at 16:41
  • If $D = A + B_1 X_1 + B_2 X_2 = 0$, we have that $\mbox{nzc}(D)=0$. If we restrict ourselves to square matrices, we could check the span of the column vectors of $B_1$ and $B_2$. I think this problem is equivalent to a max matrix rank problem. – Porufes Nov 04 '22 at 05:52

2 Answers


Minimizing the number of non-zero columns of $D$ is equivalent to maximizing the number of its zero columns. Due to the way matrix multiplication works, the $n$th column of $B_{1}X_{1}$ depends only on the $n$th column of $X_{1}$.

So, denoting by $M_{[n]}$ the $n$th column vector of the matrix $M$, we can split the problem into $n$ independent sub-problems, one for each column of $D$:

$$\vec{D_{[c]}} = \vec{A_{[c]}}+B_1\vec{X_{1[c]}}+B_2\vec{X_{2[c]}}\\ c\in \{1\dots n\} $$

Now the aim is to see how many of the column vectors of $D$, that is, how many $\vec{D_{[c]}}$, can be set to $0$. The equation $\vec{D_{[c]}} = 0$ has solutions if and only if $\vec{A_{[c]}}$ lies in the subspace sum of the image of $B_{1}$ and the image of $B_{2}$, i.e. $\vec{A_{[c]}}\in (Im[B_1] + Im[B_2])$. Once this is checked for all columns of $A$, the maximum is known.

More concretely, $B_{1}$, $B_{2}$ and $A$ can be considered as being "made up" of column vectors. A given column vector of $A$ can be made zero if and only if it lies in the span of the column vectors of $B_{1}$ and $B_{2}$. The minimum number of non-zero columns of $D$ is therefore the number of columns of $A$ that are linearly independent of that set.
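The column-by-column span test above can be sketched numerically with a rank check: a column of $A$ can be zeroed iff appending it to $[B_1\ B_2]$ does not increase the rank. This is a minimal illustration (the function name and tolerance are my own choices, not from the question):

```python
import numpy as np

def min_nonzero_columns(A, B1, B2, tol=1e-9):
    """Count the columns of A lying outside span([B1 B2]).

    A column A[:, c] can be driven to zero by suitable columns
    X1[:, c], X2[:, c] iff A[:, c] is in Im(B1) + Im(B2), i.e.
    iff appending it to [B1 B2] does not raise the rank.
    """
    B = np.hstack([B1, B2])
    r = np.linalg.matrix_rank(B, tol=tol)
    count = 0
    for c in range(A.shape[1]):
        augmented = np.hstack([B, A[:, [c]]])
        if np.linalg.matrix_rank(augmented, tol=tol) > r:
            count += 1  # this column cannot be zeroed
    return count
```

For example, with $B_1 = e_1$, $B_2 = e_2$ in $\mathbb{R}^3$ and $A = [e_1\ e_3]$, the column $e_1$ can be zeroed but $e_3$ cannot, so the minimum is 1. For exact integer matrices one could replace the rank test with exact arithmetic over the rationals.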

user3257842

Observe that $$ nzc(A+B_1X_1+B_2X_2) = nzr((A+B_1X_1+B_2X_2)^\intercal) $$ where $nzc$ and $nzr$ refer to the number of non-zero columns and rows respectively. Furthermore, the $\ell^{p,q}$ matrix norm for $1\leq p,q<\infty$ of an $m$ by $n$ matrix $M$ is given by $$ \|M\|_{p,q} = \bigg(\sum_{i=1}^m\bigg(\sum_{j=1}^n |M_{ij}|^p\bigg)^{\frac{q}{p}}\bigg)^{1/q}. $$ In particular, $$ \|M\|_{0,q} = |\{i \; | \; \|M_i\|_q \neq 0 \; \}| $$ where $M_i$ is the $i$-th row of $M$. Now recall that $$ \|M_i\|_q = 0 \iff M_i = 0. $$ Hence, $\|M\|_{0,q}$ counts the number of non-zero rows of $M$. Therefore, $$ \min_{X_1,X_2}nzc(A+B_1X_1+B_2X_2) = \min_{X_1,X_2}\|(A+B_1X_1+B_2X_2)^\intercal\|_{0,q}. $$
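The identity $nzc(D) = \|D^\intercal\|_{0,q}$ is easy to check numerically. A small sketch (function names are mine, chosen for illustration):

```python
import numpy as np

def nzc(D, tol=1e-12):
    # Number of non-zero columns, counted directly:
    # a column counts if any of its entries exceeds tol.
    return int(np.sum(np.any(np.abs(D) > tol, axis=0)))

def l0q_of_transpose(D, q=2, tol=1e-12):
    # ||D^T||_{0,q}: number of rows of D^T (= columns of D)
    # whose q-norm is non-zero.
    row_norms = np.linalg.norm(D.T, ord=q, axis=1)
    return int(np.sum(row_norms > tol))
```

For any matrix $D$ and any $q \geq 1$ the two counts agree, since a row has zero $q$-norm exactly when it is the zero row.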

If you want to solve $\ell^0$ optimization problems, it is best to look at resources in the field of Compressed Sensing. One common technique there is to replace the $\ell^0$ "norm" with the $\ell^1$ norm. This is because $\ell^1$ is the lower convex envelope of $\ell^0$, and minimizers of $\ell^1$ tend to be sparse, just like those of $\ell^0$. In this case that means solving $$ \min_{X_1,X_2}\|(A+B_1X_1+B_2X_2)^\intercal\|_{1,q} $$ instead. Looking into algorithms such as Bregman iterations, thresholding algorithms or ADMM should give you something that can solve this efficiently.
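For this particular (unconstrained) problem the $\ell^1$ surrogate with $q=2$ decouples over columns: minimizing $\sum_c \|A_{[c]} + B_1 X_{1[c]} + B_2 X_{2[c]}\|_2$ is just an independent least-squares problem per column, so no iterative solver is strictly needed. A sketch under that assumption (function name is hypothetical):

```python
import numpy as np

def l1_relaxed_solution(A, B1, B2, tol=1e-8):
    """Minimize ||(A + B1 X1 + B2 X2)^T||_{1,2} column by column.

    Each term min_x ||A[:, c] + B x||_2 with B = [B1 B2] is an
    ordinary least-squares problem; columns whose residual is
    (numerically) zero are the ones that can be zeroed exactly.
    """
    B = np.hstack([B1, B2])
    k1 = B1.shape[1]
    X1 = np.zeros((k1, A.shape[1]))
    X2 = np.zeros((B2.shape[1], A.shape[1]))
    residual_norms = []
    for c in range(A.shape[1]):
        # Solve B x ≈ -A[:, c] in the least-squares sense.
        x, *_ = np.linalg.lstsq(B, -A[:, c], rcond=None)
        X1[:, c], X2[:, c] = x[:k1], x[k1:]
        residual_norms.append(np.linalg.norm(A[:, c] + B @ x))
    nzc = int(np.sum(np.array(residual_norms) > tol))
    return X1, X2, nzc
```

Iterative schemes like ADMM or Bregman iterations become relevant once additional constraints or regularizers on $X_1, X_2$ couple the columns together.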