3

Given $y \in \mathbb R^n$ and $A \in \mathbb R^{n \times n}$, what is a way to solve

$$\min_x \| y - Ax\|$$ subject to $\|x\|=1$ and $x \geq 0$ (which means every component of $x$ is nonnegative)?

Is there any book discussing such a problem? Thanks!

Remark: The objective functions $\left\| A x - y \right\|$ and $\frac{1}{2} {\left\| A x - y \right\|}^{2}$ are equivalent (they have the same minimizers), while the latter is differentiable and easier to handle.
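For reference, the gradient of the smooth form, which several of the answers below use, is

$$\nabla_x \left( \tfrac{1}{2}\left\| A x - y \right\|^{2} \right) = A^{\top}\left( A x - y \right).$$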

Tim

6 Answers

5

You didn't specify, but I presume you mean the Euclidean norm. Your problem is nonconvex, so in general you won't be able to find an analytical expression for a solution.

The problem without the bound constraints $$ \min \|Ax-b\| \quad \text{subject to} \ \|x\| = \Delta $$ is well understood and, despite the fact that it is nonconvex, we have a characterization of its global solutions from the Gay-Dennis-Welsh theorem. See for instance Theorem 7.2.1 in http://www.ec-securehost.com/SIAM/MP01.html It may be found numerically using the method of Moré and Sorensen (described in the same book).
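To make this concrete, here is a minimal NumPy sketch (my own illustration, not from the references) that finds the norm-constrained least-squares solution by bisection on the secular equation $\|x(\lambda)\| = \Delta$ with $x(\lambda) = (A^TA+\lambda I)^{-1}A^Tb$. It assumes $A^TA$ is nonsingular, that the unconstrained solution lies outside the sphere (so $\lambda \geq 0$ and the "hard case" does not occur), and it ignores $x \geq 0$; the Moré–Sorensen method is the robust way to do this.

```python
import numpy as np

def norm_constrained_lsq(A, b, delta=1.0, lam_hi=1e8, tol=1e-10):
    """min ||Ax - b|| s.t. ||x|| = delta, via bisection on the secular equation."""
    AtA, Atb = A.T @ A, A.T @ b
    n = A.shape[1]

    def x_of(lam):
        # x(lam) = (A^T A + lam*I)^{-1} A^T b
        return np.linalg.solve(AtA + lam * np.eye(n), Atb)

    lo, hi = 0.0, lam_hi
    # ||x(lam)|| decreases in lam; assumed > delta at lam = 0 and < delta at lam_hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(x_of(mid)) > delta:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return x_of(0.5 * (lo + hi))

A = np.array([[1.0, 0.0], [0.0, 3.0]])   # small illustrative example
b = np.array([3.0, 2.0])
x = norm_constrained_lsq(A, b, delta=1.0)
print(x, np.linalg.norm(x))              # norm of x should be ~1
```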

The problem without the norm constraint is also well understood, and you'll in fact find methods for the problem where the norm constraint is an inequality instead of an equality, i.e., $$ \min \|Ax-b\| \quad \text{subject to} \ \|x\| \leq \Delta, \ x \geq 0 $$ in standard textbooks on linear least-squares problems, because this problem is closely related to the regularized least-squares problem $$ \min \|Ax-b\| + \delta \|x\| \quad \text{subject to} \ x \geq 0. $$ These are convex problems. See for instance Chapter 5 in http://www.ec-securehost.com/SIAM/ot51.html or Chapters 20-23 in http://www.ec-securehost.com/SIAM/CL15.html
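As a concrete illustration of the bound-constrained case (my own sketch; the data below are just for illustration), SciPy's `nnls` routine solves $\min \|Ax-b\|$ subject to $x \geq 0$ directly, and the Tikhonov-regularized variant with a squared penalty can be reduced to the same routine by stacking rows:

```python
import numpy as np
from scipy.optimize import nnls

A = np.array([[1.0, 0.0], [0.0, 3.0]])   # illustrative data
b = np.array([3.0, 2.0])

# min ||Ax - b||  s.t.  x >= 0   (no norm constraint)
x_nnls, resid = nnls(A, b)

# Regularized variant  min ||Ax - b||^2 + delta*||x||^2  s.t.  x >= 0,
# solved with the same routine by stacking rows of the least-squares system.
delta = 1e-2                              # illustrative regularization weight
A_aug = np.vstack([A, np.sqrt(delta) * np.eye(A.shape[1])])
b_aug = np.concatenate([b, np.zeros(A.shape[1])])
x_reg, resid_reg = nnls(A_aug, b_aug)

print(x_nnls, x_reg)
```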

As someone else mentioned, interior-point methods are of interest for the convex problem ($\|x\| \leq \Delta$) but they can also be applied to the nonconvex problem ($\|x\| = \Delta$). However, you'll have to apply a generic interior-point method that won't be able to exploit your least-squares structure. I'm not aware of a method specifically designed for your problem.

I hope this helps.

(I'm not affiliated with SIAM but they happen to have a great book collection.)

Dominique
  • Thanks! Yes, the norm in my post is the Euclidean norm. I was wondering: if the norm constraint doesn't exist, just the nonnegative bound constraint, is there an analytical solution? Is it possible to construct the solution of my original problem from the solution of this relaxed problem? – Tim Mar 16 '13 at 20:25
  • @steveO: I don't think there is an analytical solution. The optimality conditions involve complementarity conditions: $A^T (Ax-b) \geq 0$, $x \geq 0$ and componentwise products must vanish, i.e., $x_i (A^T (Ax-b))_i = 0$. – Dominique Mar 16 '13 at 20:32
  • Thanks! What are some non-analytical methods and references for this bound-only problem? (I know you pointed out references for the problem with both the bound constraint and the norm constraint, but is it easier if there is only the bound constraint?) – Tim Mar 16 '13 at 20:35
  • @steveO: do you have access to $A$ explicitly? How large is it? Do you have an idea of its conditioning? – Dominique Mar 16 '13 at 20:36
  • (1) I will have to ask my friend next week for more details. He asked me this question, and I thought this problem seemed easy and Golub's Matrix Computations might already have an answer, but reading the replies here, I realized I was wrong. (2) Meanwhile, I am wondering what the conditioning of $A$ has to do with the problem with the bound constraint only and the problem with both bound and norm constraints? – Tim Mar 16 '13 at 20:38
  • Conditioning is important even for the unconstrained least-squares problem. It may help us decide on the best way to approach your problem. In the meantime I would recommend an interior-point approach for the bound-constrained problem. You can try PDCO – Dominique Mar 16 '13 at 20:50
  • Thanks, Dominique! I am now reading the three references you recommended. (1) I don't see the reason why "the problem where the norm constraint is an inequality" is related to "the regularized least-squares problem". (2) Also, how are "the problem where the norm constraint is an inequality" and "the problem where the norm constraint is an equality" related, and why can interior-point methods for the convex problem ($\|x\|\leq\Delta$) also be applied to the nonconvex problem ($\|x\|=\Delta$)? – Tim Mar 16 '13 at 21:22
  • @steveO For (1) just compare the optimality conditions. For (2) the two problems are loosely related. Interior methods can treat equality constraints (much in the same way as SQP methods) but keep in mind that for the (nonconvex) equality-constrained problem, you may not end up with a minimizer, only with a stationary point. – Dominique Mar 17 '13 at 00:23
  • Thanks, Dominique! (1) I still can't figure out how the regularized LS problem and the constrained LS problem are equivalent; I checked, and their KKT conditions are not exactly the same. I asked in a new post here http://math.stackexchange.com/questions/335306/why-are-additional-constraint-and-penalty-term-equivalent-in-ridge-regression. (2) For the regularized least-squares problem without the nonnegativity constraint, I find from some references that there is an analytical solution. But for the regularized least-squares problem with the nonnegativity constraint, is there any reference for its solution? – Tim Mar 20 '13 at 02:06
  • I posted the second part here http://math.stackexchange.com/questions/335468/how-would-you-solve-a-tikhonov-regularized-least-squares-problem-with-nonnegativ – Tim Mar 20 '13 at 03:44
  • (3) I checked the link for PDCO. It is for linear constraints, but my original problem has the constraint $\|x\|=1$, which isn't linear? – Tim Mar 20 '13 at 16:18
3

The problem does not have an analytical solution but can be easily solved using a projected gradient method. Let us rewrite your problem in an equivalent form:

\begin{align} \min_x~& \frac{1}{2}\Vert Ax-b\Vert^2\\ s.t.~& \Vert x\Vert =1\\ & x\geq 0\end{align}

The method works as follows:

  • Let $x_0$ be a feasible point (e.g. $x_0=e_1$, the first standard basis vector; note that $x_0=0$ does not satisfy $\Vert x\Vert=1$), and let $k=1$.
  • At iteration $k$, perform a gradient step followed by a projection step. For the gradient step, let $\tilde{x}_k=x_{k-1}-\alpha_k g_{k-1}$, where $g_{k-1}=A^T(Ax_{k-1}-b)$ is the gradient of the objective function at $x_{k-1}$ and $\alpha_k$ is the step length, found for example via a line search. Then for the projection step, let $\tilde{x}_k=\max(\tilde{x}_k,0)$ (component-wise) and then $x_k=\tilde{x}_{k}/\Vert \tilde{x}_{k}\Vert$. If the stopping criterion is met, stop; otherwise let $k=k+1$ and perform the step again.

As an illustration, I have let $A=\begin{pmatrix} 1&0\\0&3\end{pmatrix}$ and $b=\begin{pmatrix} 3\\2\end{pmatrix}$ and obtained the following iterates: [figure: iterates of the projected gradient method]
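Here is a minimal NumPy sketch of this iteration on the same data (my own code, using a fixed step length rather than a line search):

```python
import numpy as np

A = np.array([[1.0, 0.0], [0.0, 3.0]])
b = np.array([3.0, 2.0])

x = np.array([1.0, 0.0])                   # feasible start: ||x|| = 1, x >= 0
alpha = 0.1                                # fixed step length (a line search would be better)

for k in range(500):
    g = A.T @ (A @ x - b)                  # gradient of 0.5*||Ax - b||^2
    x_t = np.maximum(x - alpha * g, 0.0)   # gradient step, then projection onto x >= 0
    x = x_t / np.linalg.norm(x_t)          # projection onto the unit sphere
print(x, 0.5 * np.linalg.norm(A @ x - b) ** 2)
```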

2

How about using the Lagrangian? Let $\|\cdot\|$ be the Euclidean norm; then you can define the Lagrangian as $$ L(x,\lambda,\mu)=\|y-Ax\|^{2}+\lambda\cdot(\|x\|^{2}-1)+\mu\cdot x, $$ where $\mu\in \mathbb R^{n}$ is the Lagrange multiplier for the inequality constraint and $\lambda\in \mathbb R$ is the one for the equality constraint. I squared the norms to get differentiability; this should be OK. Now you can compute derivatives, etc., to get points which are candidates for optimal points (then use constraint qualifications). Maybe this works and is helpful?
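In practice, rather than working out the multipliers by hand, one can hand exactly this constrained formulation to a general nonlinear programming solver that forms them internally. A minimal sketch (my own illustration with made-up data, using `scipy.optimize.minimize` with SLSQP; for this nonconvex problem it only returns a stationary point):

```python
import numpy as np
from scipy.optimize import minimize

A = np.array([[1.0, 0.0], [0.0, 3.0]])                   # illustrative data
y = np.array([3.0, 2.0])

obj  = lambda x: 0.5 * np.linalg.norm(A @ x - y) ** 2    # squared norm for smoothness
grad = lambda x: A.T @ (A @ x - y)
cons = [{"type": "eq", "fun": lambda x: x @ x - 1.0}]    # ||x||^2 = 1
bnds = [(0.0, None)] * A.shape[1]                        # x >= 0

res = minimize(obj, np.ones(2) / np.sqrt(2), jac=grad,
               method="SLSQP", bounds=bnds, constraints=cons)
print(res.x, res.fun)                                    # a stationary point only
```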

Alex
  • Thanks! (1) Is there an analytical optimal solution, without using numerical methods? (2) Is my problem a quadratic programming problem? (I think the constraints in a QP problem must be linear, but here $\|x\|=1$ is quadratic.) – Tim Mar 16 '13 at 00:01
  • Good question :). Without the positivity constraints I think it is very easy to calculate the solution. If we also handle the inequality constraints, it can get hard, since in a way it becomes a combinatorial question of which inequalities (we have $n$) are active. It is not a QP, but several numerical algorithms can also handle quadratic constraints (since the derivative is again linear if you square the constraints). – Alex Mar 16 '13 at 01:12
2

If $A$ is non-singular, then I think $x = A^{-1}y / ||A^{-1}y||$ is the solution (the image of the unit ball under $A$ will be convex, so the minimum is attained when $Ax \parallel y$).

If $A$ is singular, I think you can start like you would without the $||x|| = 1$ restriction - replace $y$ with the orthogonal projection of $y$ onto the column space of $A$ (call it $y'$), then solve $\min||y'-Ax||$ like in the first case (find $x'$ so $Ax' = y'$, then normalize $x'$ to find $x$).

BaronVT
  • Thanks! (1) I was wondering why, if $A$ is nonsingular, the solution is what you gave? What does $Ax\parallel y$ mean? (2) What if the constraint $\|x\|=1$ is replaced with $\|x\|=c$ for some $c >0$? – Tim Mar 15 '13 at 23:59
  • Hi @BaronVT, (1) others' replies seem to say there is no analytical or closed-form solution to the problem. So is the closed-form solution you gave when $A$ is nonsingular a correct one? (2) Is there some reason for your method when $A$ is singular? – Tim Mar 20 '13 at 12:25
  • @BaronVT It is not true that when $A$ is invertible, the constrained solution is $x = A^{-1}y/\|A^{-1}y\|$. A simple counterexample is $A = [1000, 0; 0, 1]$, $y = [1000; 1]$. The unconstrained solution is $x = [1;1]$, the normalized solution (that you suggest) is $x_N = [1;1]/\sqrt{2}$ with $\|Ax_N-y\|^2 = (10^6+1)(1-1/\sqrt{2})^2 \approx 8.6\times 10^4$, but the "unexpected" point $x_U = [1,0]$ has much lower cost: $\|Ax_U-y\|^2 = 1$. (A quick numeric check is sketched below.) – MathMax Mar 21 '25 at 16:44
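A quick NumPy check of the counterexample in the last comment (my own verification):

```python
import numpy as np

A = np.array([[1000.0, 0.0], [0.0, 1.0]])
y = np.array([1000.0, 1.0])

x_n = np.linalg.solve(A, y)                 # unconstrained solution [1, 1]
x_n = x_n / np.linalg.norm(x_n)             # normalized: [1, 1]/sqrt(2)
x_u = np.array([1.0, 0.0])                  # the "unexpected" feasible point

print(np.linalg.norm(A @ x_n - y) ** 2)     # about 8.6e4
print(np.linalg.norm(A @ x_u - y) ** 2)     # 1.0
```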
2

In explicit form the problem is $$\min_{x_1,x_2,\dots,x_n}\sqrt{\sum_{i=1}^n \Big(y_i-\sum_{j=1}^n A_{i,j}x_j\Big)^2}$$ $$s.t.\quad \sqrt{\sum_{i=1}^n x_i^2}-1=0,$$ and the Lagrangian becomes $$L=\sqrt{\sum_{i=1}^n \Big(y_i-\sum_{j=1}^n A_{i,j}x_j\Big)^2}+\lambda \bigg({\sqrt {\sum_{i=1}^n x_i^2}-1}\bigg ).$$ Taking partial derivatives, for each $k=1,2,\dots,n$, $$\frac{\partial L}{\partial x_k}=\frac{\lambda x_k}{\sqrt{\sum_{i=1}^n x_i^2}}-\frac{\sum_{i=1}^n A_{i,k}\Big(y_i-\sum_{j=1}^n A_{i,j}x_j\Big)}{\sqrt{\sum_{i=1}^n \Big(y_i-\sum_{j=1}^n A_{i,j}x_j\Big)^2}}=0,$$ and $$\frac{\partial L}{\partial \lambda}=\sqrt {\sum_{i=1}^n x_i^2}-1=0.$$ Now you have $n+1$ nonlinear equations which you can solve for the $n+1$ variables using some numerical method.

PS: I skipped the positivity of the $x_i$. You can enforce it via the Kuhn–Tucker conditions, which can make the system quite complicated; or you can check your solution set afterwards.
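A minimal sketch of this approach (my own illustration with made-up data), using the squared objective and constraint from the earlier remark to avoid the square roots, skipping positivity as in the PS, and handing the $n+1$ stationarity equations to `scipy.optimize.fsolve`. As the comment below warns, a root of this system is only a stationary point, not necessarily a minimizer:

```python
import numpy as np
from scipy.optimize import fsolve

A = np.array([[1.0, 0.0], [0.0, 3.0]])      # illustrative data
y = np.array([3.0, 2.0])
n = A.shape[1]

def stationarity(z):
    x, lam = z[:n], z[n]
    # gradient in x of 0.5*||Ax - y||^2 + 0.5*lam*(||x||^2 - 1)
    grad_x = A.T @ (A @ x - y) + lam * x
    return np.append(grad_x, x @ x - 1.0)   # last equation enforces ||x||^2 = 1

z = fsolve(stationarity, np.append(np.ones(n) / np.sqrt(n), 0.0))
x_star, lam_star = z[:n], z[n]
print(x_star, lam_star)
```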

AnilB
  • Thanks! (1) Is there an analytical optimal solution, without using numerical methods? (2) Is my problem a quadratic programming problem? (I think the constraints in a QP problem must be linear, but here $\|x\|=1$ is quadratic.) – Tim Mar 16 '13 at 00:02
  • @steveO (1) I don't think you can have an analytic solution; the equations seem to be very complicated. (2) Your problem is a kind of quadratically constrained quadratic program. – AnilB Mar 16 '13 at 00:54
  • This is what we teach optimization students not to do. Solving an optimization problem and solving the first-order optimality conditions as a set of equations and inequalities are only the same thing for convex problems. Many optimization methods also apply to nonconvex problems and encourage descent in the objective function. – Dominique Aug 21 '16 at 18:44
0

I would use Projected Gradient Descent for this case.
Though the problem isn't Convex, it will work nicely.

The algorithm is as follows:

  1. Calculate the Gradient at the current point.
  2. Update the solution $ x = x - t \left( {A}^{T} A x - {A}^{T} b \right) $, where $ {A}^{T} A x - {A}^{T} b $ is the Gradient of the Objective Function at $ x $ and $ t $ is the step size.
  3. Project the output of the previous step onto $ {\mathbb{R}}_{+} $ by $ {x}_{i} = \max \left\{ {x}_{i}, 0 \right\} $.
  4. Project the output of previous step onto the Unit Sphere by $ {x}_{i} = \frac{ {x}_{i} }{ \left\| x \right\|_{2} } $.
  5. Go back to (1) (or check the validity of the point; the KKT conditions will do even though the problem isn't Convex).

In a simple 2D example I created, it worked pretty well:

[figure: iterates of the projected gradient method in a 2D example]

The code is available at my StackExchange Mathematics Q2699867 GitHub Repository.

Remark 001
I'd even consider starting from the solution of the convex problem obtained by replacing the equality constraint $ \left\| x \right\|_{2} = 1 $ with $ \left\| x \right\|_{1} = 1 $. You can either use it as a starting point for the above algorithm or as an approximate solution by itself.

Remark 002
Another approach might be something like what I did in - Solution for $ \arg \min_{ {x}^{T} x = 1} { x}^{T} A x - {c}^{T} x $.
Yet after each iteration of updating $ \lambda $ you should also project the output $ x $ onto $ \mathbb{R}_{+} $.

Royi