
Wikipedia claims here (at the end of "Applications") that the function $F:\mathbb {R}^{N}\rightarrow \mathbb R$, $\alpha\mapsto \alpha^{T}\left(A^{T}A + \lambda A\right)\alpha - 2\alpha^{T}Ay$, is convex, where $\lambda\in\mathbb R$, $\lambda > 0$, and $y\in\mathbb R^{N}$ is a fixed vector. The matrix $A\in \mathbb R^{N\times N}$ is positive definite (because it comes from a kernel $K$, but that's not so important here).

I am having trouble proving that $F$ is indeed convex. According to this Wikipedia article, what we need to show is that $\forall t\in\left[0, 1 \right], \forall x_1, x_2 \in \mathbb R^{N}: F\left( tx_1 + (1-t)x_2 \right) \leq tF(x_1) + (1-t)F(x_2)\quad (\star)$.

Now, I calculated the LHS of $(\star)$ and found it to be the RHS plus two additional terms: $$ (1-t)t\left( x_1^{T}A^{T}Ax_2 + x_{2}^{T}A^{T}Ax_1 \right)+ \lambda\left( 1-t \right)t\left( x_{2}^{T}Ax_1 + x_1^{T}Ax_2 \right).$$ The problem is: I don't know how to show that these additional terms are negative, which they have to be if the function $F$ is supposed to be convex.
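(For concreteness, here is the expansion of the left-hand side I am working from, writing $B := A^{T}A + \lambda A$ as a shorthand that I introduce only for this post:)
$$F\left( tx_1 + (1-t)x_2 \right) = t^{2}x_1^{T}Bx_1 + t(1-t)\left( x_1^{T}Bx_2 + x_2^{T}Bx_1 \right) + (1-t)^{2}x_2^{T}Bx_2 - 2t\,x_1^{T}Ay - 2(1-t)\,x_2^{T}Ay.$$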

Hermi
  • In this case it should be easier to use the Hessian matrix to show convexity. You already have a positive definite matrix $A$ to start with. – TomTom314 May 17 '21 at 15:25
  • @TomTom314 That depends on whether the OP is allowed to use second-order conditions in their proof. – V.S.e.H. May 17 '21 at 19:10

1 Answer


Note that the gradient of the function $x \mapsto x^TBx$ is $(B^T + B)x$. Since $A$ is symmetric, the matrix $A^TA + \lambda A$ is symmetric as well, so $$\nabla F(a) = 2(A^TA + \lambda A)a - 2Ay.$$ Thus the Hessian is $$D\nabla F(a) = 2(A^TA + \lambda A).$$ Since the Hessian is positive definite (see the edit below), $F$ is convex.

Edit: Since $A$ is positive definite and $\lambda > 0$, both $A^TA$ and $\lambda A$ are positive definite (note $x \cdot A^TAx = \lVert Ax \rVert^2 > 0$ if $x \neq 0$). The sum of positive definite matrices is positive definite, so $D \nabla F(a)$ is indeed positive definite.
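As a quick numerical sanity check (not part of the argument above), one can verify both claims for a concrete $A$. The sketch below is only illustrative: it assumes a small Gaussian-kernel Gram matrix for $A$ and an arbitrary $\lambda > 0$, checks that the eigenvalues of $2(A^TA + \lambda A)$ are positive, and tests the inequality $(\star)$ at random points.

    import numpy as np

    rng = np.random.default_rng(0)
    N, lam = 5, 0.1

    # A symmetric positive definite A, here a Gaussian-kernel Gram matrix
    # K(x_i, x_j) = exp(-||x_i - x_j||^2) built on distinct random points.
    pts = rng.normal(size=(N, 2))
    sqdist = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
    A = np.exp(-sqdist)

    y = rng.normal(size=N)
    B = A.T @ A + lam * A          # quadratic part of F

    def F(a):
        return a @ B @ a - 2 * a @ A @ y

    # Hessian 2(A^T A + lambda A): all eigenvalues should be positive.
    print(np.linalg.eigvalsh(2 * B).min() > 0)

    # Convexity inequality (star) at random points and random t in [0, 1].
    for _ in range(1000):
        x1, x2 = rng.normal(size=N), rng.normal(size=N)
        t = rng.uniform()
        assert F(t * x1 + (1 - t) * x2) <= t * F(x1) + (1 - t) * F(x2) + 1e-9
    print("inequality (star) held for all sampled points")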

Mason
  • Here are some related questions on the Hessian criterion for convexity: https://math.stackexchange.com/questions/4089006/intuition-convexity-of-multivariate-functions-and-positive-semidefiniteness-of/4090853#4090853, https://math.stackexchange.com/questions/946156/proving-convexity-of-a-function-whose-hessian-is-positive-semidefinite-over-a-co, https://math.stackexchange.com/questions/720259/f-is-convex-function-iff-hessian-matrix-is-nonnegative-definite – Mason May 18 '21 at 14:41
  • Hey Mason, I deeply apologize for getting back to you so late. You wrote that the Hessian is given by $D\nabla F(\alpha)$, and that the Hessian is positive definite. How do I see that the Hessian is indeed positive definite? So let $z\in\mathbb R^{N}\setminus\{0\}$; then we need to prove that $z^T (A^TA + \lambda A)z = \dots = z^T A^2z + \lambda z^T A z$ is $>0$. For the second term, this is obvious, since by assumption $A$ is positive definite, but what about the first term? – Hermi May 29 '21 at 20:52
  • @Hermi I added an edit to explain it. To see that $A^TA$ is positive definite, use the fact that $x \cdot B^Ty = Bx \cdot y$ for any matrix $B$ and vectors $x, y$. – Mason May 29 '21 at 21:08
  • Yup, you're right! Do you think you might be able to take a look at this question as well: https://math.stackexchange.com/questions/4155840/argmin-understanding-a-simple-calculation (I'd of course be very happy about this!) – Hermi May 30 '21 at 13:04