
The background of my question is that I want to determine an arbitrary square matrix $A \in \mathbb{R}^{n \times n}$ in a convex optimization problem and ensure that it is regular, i.e., that it can be inverted. For this reason, I would like to add a second, also convex, summand to the objective function or, alternatively, add a convex inequality constraint to the optimization problem. Adding, e.g., $-\log(|\det(A)|)$ to the objective function ensures invertibility, but this term is not convex.
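To make the non-convexity concrete, here is a quick numerical check (a minimal NumPy sketch; the two diagonal matrices are just one convenient counterexample): the penalty at the midpoint of two invertible matrices exceeds the average of the penalties.

```python
import numpy as np

def f(A):
    """Candidate penalty -log(|det(A)|)."""
    return -np.log(abs(np.linalg.det(A)))

A1 = np.diag([2.0, -1.0])
A2 = np.diag([-1.0, 2.0])
M = 0.5 * (A1 + A2)            # midpoint diag(0.5, 0.5), still invertible

print(f(M))                    # ~  1.386
print(0.5 * (f(A1) + f(A2)))   # ~ -0.693  ->  f(M) exceeds the chord value, so f is not convex
```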

Invertibility can be ensured in a convex way by the methods described in Bounds for determinants with positive diagonals for diagonally dominant matrices or, allowing for slightly more general matrices, Verified bounds for singular values, in particular for the spectral norm of a matrix and its inverse (e.g., Lemma 2.1). This, however, restricts the matrices to a certain structure, while general matrices would be much better suited for the optimization I have in mind.
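For completeness, here is how such a structural restriction could look; a minimal sketch (assuming the cvxpy package and a purely illustrative least-squares objective) that keeps $A$ strictly row diagonally dominant and hence nonsingular:

```python
import cvxpy as cp
import numpy as np

n = 4
eps = 1e-3                      # strict-dominance margin
A = cp.Variable((n, n))

# Strict row diagonal dominance: A[i, i] >= sum_{j != i} |A[i, j]| + eps
constraints = []
for i in range(n):
    off_diag = cp.hstack([A[i, j] for j in range(n) if j != i])
    constraints.append(A[i, i] >= cp.norm(off_diag, 1) + eps)

# Illustrative objective: stay close to some given target matrix.
A_target = np.random.randn(n, n)
prob = cp.Problem(cp.Minimize(cp.sum_squares(A - A_target)), constraints)
prob.solve()
print(A.value)
```

Each constraint bounds an affine expression (the diagonal entry) from below by a convex one (the 1-norm of the off-diagonal row plus a margin), so the problem stays convex; the price is exactly the structural restriction mentioned above.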

From my research, the most promising approach seems to be to consider the smallest singular value of $A$, which equals the reciprocal of the largest singular value of its inverse:

$\sigma_{\mathrm{min}}(A)=\frac{1}{\lVert A^{-1}\rVert_2}$,

where $\sigma_{\mathrm{min}}(A)$ denotes the smallest singular value of $A$ and $\lVert\cdot\rVert_2$ the spectral norm, i.e., the largest singular value. My key question is whether one can obtain a (concave) positive lower bound for the smallest singular value using $\sigma_{\mathrm{min}}(A)=\frac{1}{\lVert A^{-1}\rVert_2}$. Maximizing such a concave lower bound would ensure that the eigenvalues of $A$ have absolute value $>0$, see this question. The bound itself may be very loose; the main thing is that $\sigma_{\mathrm{min}}(A)>0$ is ensured. As far as I have seen, using the triangle inequality (subadditivity) and the submultiplicativity of matrix norms only leads to upper bounds for $\sigma_{\mathrm{min}}(A)$, not to lower ones. Maybe it is possible to play with subadditivity, submultiplicativity, and other matrix norm properties to obtain such a lower bound that depends on $A$ or its entries, but not on its inverse, so that it remains easy to optimize?
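As a quick numerical sanity check of this identity (a minimal NumPy sketch with a random, hence almost surely invertible, test matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))           # generic matrix, invertible with probability 1

sigma_min = np.linalg.svd(A, compute_uv=False).min()
inv_spec_norm = np.linalg.norm(np.linalg.inv(A), 2)   # largest singular value of A^{-1}

print(sigma_min, 1.0 / inv_spec_norm)     # both values agree up to rounding
```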

Answers on this specific idea (a lower bound for the minimal singular value) or on how to ensure invertibility in a convex way in general would be greatly appreciated.

Minow
  • I'm afraid that the minimum singular value of a nonsymmetric matrix is not concave. You know that, I see, but my point is that it's not supportable in practice without additional requirements. – Michael Grant Jun 20 '17 at 03:12
  • Yes, thanks, I am aware of that; I was thinking that although it is not concave, it maybe can be underestimated by some concave positive lower bound, which then could be maximized. – Minow Jun 20 '17 at 07:59
  • Are you aware of such a lower bound, which can be, e.g., obtained by using matrix norm inequalities? For example, using submultiplicativity, one can state $\lVert I \rVert_2 \leq \lVert A^{-1} \rVert_2 \lVert A \rVert_2$; rearranging gives an upper bound for the smallest singular value: $\frac{1}{\lVert A^{-1} \rVert_2} \leq \frac{\lVert A \rVert_2}{ \lVert I \rVert_2 }$ (which is convex, but that's a different issue). Maybe using such kind of manipulations would lead one closer to the solution of finding a concave lower bound? – Minow Jun 20 '17 at 07:59
  • Your comment also inspired a different thought: The Gram matrix $G=A^{\mathrm{T}}A$ of $A$ in fact is symmetric, and the square roots of its eigenvalues equal the singular values of $A$. Maybe the Gram matrix $A^{\mathrm{T}}A$ could be used instead of $A$ to indirectly enforce $\sigma_\mathrm{min}(A)>0$? – Minow Jun 20 '17 at 08:15
  • I am not optimistic. The set of singular matrices is like the holes in Swiss cheese. – Michael Grant Jun 20 '17 at 11:28

1 Answer


You may consider $\|A^{-1}\|_1$ instead of $\|A^{-1}\|_2$.

In order to approximate $\|A^{-1}\|_1$ you may use the algorithm of Hager (Hager, W. W. (1984), Condition estimates, SIAM Journal on Scientific and Statistical Computing, 5(2), 311-316), which approximates the 1-norm $\|X\|_1$ of a linear operator $X$ using matrix-vector products only. Therefore, in order to estimate $\|A^{-1}\|_1$ with this algorithm, an efficient method for solving the linear systems $Ax = b$ is required.
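For illustration, a simplified sketch of Hager's estimator applied to $A^{-1}$ (assumptions: dense $A$, SciPy's LU factorization providing the solves with $A$ and $A^{\mathrm{T}}$; the safeguards of the production implementations are omitted):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def est_inv_one_norm(A, max_iter=5):
    """Rough estimate of ||A^{-1}||_1 using (a simplified form of) Hager's
    1-norm estimator; only solves with A and A^T are required."""
    n = A.shape[0]
    lu = lu_factor(A)                       # one factorization, reused for all solves
    x = np.full(n, 1.0 / n)
    est = 0.0
    for _ in range(max_iter):
        y = lu_solve(lu, x)                 # y = A^{-1} x
        est = np.abs(y).sum()               # current estimate of ||A^{-1}||_1
        xi = np.sign(y)
        xi[xi == 0] = 1.0
        z = lu_solve(lu, xi, trans=1)       # z = A^{-T} xi  (gradient information)
        j = int(np.argmax(np.abs(z)))
        if np.abs(z[j]) <= z @ x:           # no better vertex of the unit 1-norm ball found
            break
        x = np.zeros(n)
        x[j] = 1.0                          # restart from the most promising unit vector
    return est
```

The condest and ?LACON routines mentioned below are refined, safeguarded versions of this basic iteration.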

This algorithm is implemented in MATLAB's condest function for sparse and dense matrices, and in LAPACK's ?LACON routine.
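SciPy provides an analogous block estimator, scipy.sparse.linalg.onenormest (due to Higham and Tisseur); a sketch of applying it to $A^{-1}$ through a LinearOperator that only performs solves (the shifted random sparse matrix below is merely a hypothetical stand-in for a nonsingular $A$):

```python
import numpy as np
from scipy.sparse import identity, random as sparse_random
from scipy.sparse.linalg import LinearOperator, onenormest, splu

n = 200
A = (sparse_random(n, n, density=0.05, random_state=0) + 2.0 * identity(n)).tocsc()
lu = splu(A)                                   # sparse LU factorization of A

# A^{-1} as a linear operator defined purely through solves with A and A^T.
Ainv = LinearOperator((n, n),
                      matvec=lambda b: lu.solve(b),
                      rmatvec=lambda b: lu.solve(b, trans='T'),
                      dtype=float)

print(onenormest(Ainv))                        # estimate of ||A^{-1}||_1
```

Only solves with $A$ and $A^{\mathrm{T}}$ are performed; $A^{-1}$ is never formed explicitly.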

Pawel Kowal
  • I believe that if you have a good way to solve systems with $A$, you can just as well use an algorithm that approximates $\lVert A^{-1}\rVert_2$ directly instead of $\lVert A^{-1}\rVert_1$. Special algorithms exist for $\lVert A^{-1}\rVert_1$ mainly because it is often used as a measure of the sensitivity of linear solves instead of the 2-norm. – Algebraic Pavel Jun 20 '17 at 09:15
  • @Pawel Kowal: thanks for the hint. The paper you cited in turn cites J. H. Wilkinson, The Algebraic Eigenvalue Problem, Oxford Univ. Press, London, 1965, where the so-called power method is used for approximating $\lVert A^{-1} \rVert_2$. So @AlgebraicPavel is right; however, the method for $\lVert A^{-1} \rVert_2$ is an iterative one, while solving linear equation systems sounds easier to me. Anyway, I assume that by $\lVert A^{-1} \rVert_1$ you mean the Schatten 1-norm? In that case, we could add a constraint for the minimum singular value: – Minow Jun 20 '17 at 09:50
  • $\frac{1}{\lVert A^{-1} \rVert_2}>\epsilon$, from which $\frac{1}{\epsilon}>\lVert A^{-1} \rVert_2$ follows, and because of the general inequality $\lVert B \rVert_1\geq\lVert B \rVert_2$ (for an arbitrary matrix $B$), enforcing $\frac{1}{\epsilon}>\lVert A^{-1} \rVert_1$ implies $\frac{1}{\epsilon}>\lVert A^{-1} \rVert_2$. We could then use the method of the paper Condition estimates you cited and use the approximation of $\lVert A^{-1} \rVert_1$ to enforce the constraint. Does that make sense? – Minow Jun 20 '17 at 10:01
  • However, I am not sure about the convexity of $\lVert A^{-1} \rVert_1$; matrix norms are in general convex (we need a convex function here in order to keep the optimization problem convex), but what about the norm of the inverse? Or is it not necessary to consider convexity because we use an approximation? I fear we still have to... – Minow Jun 20 '17 at 10:02