
The ordinary least squares (OLS) method is very useful. It gives you the solution to the problem

$$ \arg \min_{x} {\left\| A x - b \right\|}_{2}^{2} $$

Now, if the problem is the same, but the $1$-norm is used instead:

$$ \arg \min_{x} {\left\| A x - b \right\|}_{1} $$

Is there a known (approximate or not) solution to that problem? Is there any time-efficient algorithm to reach this optimum? I've read about the Theil–Sen estimator, which should do the trick in dimension $2$, and about some multidimensional extensions of it, but their computation time increases hugely with dimension; I don't think I'd get a solution within a year if I used one of those.

Pierre
    There is no known formula for the solution, as there is for least squares problems. Minimizing $| Ax - b |_1$ is a convex problem and can be solved with standard convex optimization algorithms. An easy way to do it is using the CVX or CVXPY software package, which will let you solve small instances of this problem (with a few hundred variables perhaps) using just a few lines of readable code. How large is your problem? – littleO Jul 10 '18 at 16:19
  • Is that supposed to be $\ell_1$ rather than $N_1$ in the title? – littleO Jul 10 '18 at 16:20
  • @littleO : I will investigate this. The data size is ~20 variables and ~1000 observations, but it needs to be done in a loop, so it needs to be fast. Thanks! – Pierre Jul 10 '18 at 16:21
  • @littleO : in my maths courses, we used N1 as the name of the "norm 1", i.e. the sum of absolute values. It might differ from country to country... I'll edit – Pierre Jul 10 '18 at 16:22
  • Ah, that's a small problem. CVX or CVXPY is a good thing to try at first. If it's not fast enough, you could formulate the problem as a linear program and use some good LP solver such as Mosek. – littleO Jul 10 '18 at 16:23
  • @littleO : for this project, I cannot use external compiled code. Are the source codes and/or algorithms available? – Pierre Jul 10 '18 at 16:25
  • So that means that you don't have an LP solver available that you can call? – littleO Jul 10 '18 at 16:26
  • @littleO : precisely... Very, very complicated company -_- A linear programming engine is very easy to code, but the problem is: how do you get a decent descent (yeah, punny) direction in a problem where the gradient is $\pm 1$? – Pierre Jul 10 '18 at 16:29
  • That's too bad, because with an existing LP solver you could write code to solve this problem efficiently in like five minutes. Without an LP solver to invoke, I suppose that you could implement your own LP solver. The textbook by Boyd and Vandenberghe explains how to do this, for example. There are also first-order methods such as ADMM that you could implement (a rough sketch follows these comments) -- the implementation might be easier, but the code would not be as fast, accurate, and reliable as it would be if you used an interior point method. – littleO Jul 10 '18 at 16:32
  • There are some open source LP solvers as well. – Michal Adamaszek Jul 10 '18 at 16:32
  • @littleO I'll have a look at this, thanks! – Pierre Jul 10 '18 at 16:33
  • I agree with @MichalAdamaszek that if you are allowed to use external source code, as long as it's not compiled, you might use an open source LP solver. CVXOPT is one example in Python. – littleO Jul 10 '18 at 16:59
  • Related: https://math.stackexchange.com/q/2141801/339790 – Rodrigo de Azevedo Jul 11 '18 at 14:24
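littleO mentions ADMM above as a first-order method one could implement by hand. As a rough illustration only (the splitting, function names, and parameter values below are my own choices, not taken from the comments), an ADMM iteration for $\min_x \|Ax-b\|_1$ could look like this in NumPy, rewriting the problem as $\min_{x,z} \|z\|_1$ subject to $z = Ax - b$:

import numpy as np

def soft_threshold(v, kappa):
    # Proximal operator of kappa * ||.||_1, applied element-wise.
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def admm_l1(A, b, rho=1.0, n_iter=200):
    # ADMM with scaled dual variable u for: min ||z||_1  s.t.  z = A x - b.
    m, n = A.shape
    x, z, u = np.zeros(n), np.zeros(m), np.zeros(m)
    for _ in range(n_iter):
        # x-update: least-squares fit of A x to b + z - u.
        x = np.linalg.lstsq(A, b + z - u, rcond=None)[0]
        # z-update: soft-thresholding of the shifted residual.
        z = soft_threshold(A @ x - b + u, 1.0 / rho)
        # Dual update.
        u = u + A @ x - b - z
    return x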

3 Answers


The problem is given by:

$$ \arg \min_{x} {\left\| A x - b \right\|}_{1} $$

If we define the residual $ r = A x - b $ and introduce an auxiliary vector $ t $ that bounds it element-wise, we can rewrite the problem as:

$$\begin{align*} \arg \min_{x, t} \quad & \boldsymbol{1}^{T} t \\ \text{subject to} \quad &-t \preceq A x - b \preceq t \end{align*}$$

This can be easily solved by any Linear Programming solver.
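For instance, using CVXPY (mentioned in the comments), the problem can be stated almost verbatim and the modeling layer performs a reformulation along these lines internally. A minimal sketch, assuming $A$ and $b$ are NumPy arrays (the function name is mine):

import cvxpy as cp

def l1_regression(A, b):
    # Minimize ||Ax - b||_1 directly; CVXPY handles the reformulation
    # and passes the problem to a suitable solver.
    x = cp.Variable(A.shape[1])
    problem = cp.Problem(cp.Minimize(cp.norm(A @ x - b, 1)))
    problem.solve()
    return x.value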

You may see some more related approaches in my answer to Mathematics Q1639716 - How Can $ {L}_{1} $ Norm Minimization with Linear Equality Constraints (Basis Pursuit / Sparse Representation) Be Formulated as Linear Programming?

Royi

Not equivalent, but one approach is iteratively re-weighted least squares (IRLS), where you solve a sequence of ordinary least squares problems

$$x_{k} = \arg\min_x {\left\| W_k (A x - b) \right\|}_2$$

and re-weighting means calculating $W_{k+1}$ based on $x_k$.

Basically, it is a loop:

  1. Calculate $W_k$ based on the previous solution $x_{k-1}$.
  2. Calculate $x_k$.
  3. If $\|x_k - x_{k-1}\| > \epsilon$, loop back to 1.
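A minimal sketch of such a loop in Python, assuming the common weight choice $W_k = \operatorname{diag}\big(1/\sqrt{|Ax_{k-1}-b|+\epsilon}\big)$, which makes $\|W_k(Ax-b)\|_2^2$ approximate $\|Ax-b\|_1$; the weight choice, the damping constant $\epsilon$, and the function name are my own, not part of the answer:

import numpy as np

def irls_l1(A, b, eps=1e-6, tol=1e-8, max_iter=100):
    # Start from the ordinary least-squares solution (W_0 = I).
    x = np.linalg.lstsq(A, b, rcond=None)[0]
    for _ in range(max_iter):
        r = A @ x - b
        # Weights so that ||W r||_2^2 ~ ||r||_1; eps avoids division by zero.
        w = 1.0 / np.sqrt(np.abs(r) + eps)
        # Weighted least squares: scale each row of A and entry of b by w.
        x_new = np.linalg.lstsq(w[:, None] * A, w * b, rcond=None)[0]
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x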
mathreadler
  • Hello, sorry for the late reply. I have actually already tried this algorithm without knowing its name, and it seems to be prone to non-convergence. I'll have a look, thanks. – Pierre May 17 '19 at 10:33
  • I have gotten it to work well enough for my purposes in 5 or 10 applications or so. – mathreadler May 17 '19 at 11:20
  • OK thanks. Definitely gonna have a look (but now it's for my own pleasure, as I changed jobs ;-) ) – Pierre May 17 '19 at 13:41

The answer seems to be: the problem is not particularly hard to solve, but there is no closed-form solution the way there is for least squares. Though there are many ways to do this, I feel like @Royi's answer is the most "correct", since it reduces the problem to a simple linear programming problem, as opposed to the more general IRLS solution, which can be used to solve much harder problems. However, @Royi's answer is short and can be a bit confusing, so I wanted to elaborate a bit.

Firstly, why is $$ \arg \min_x \|Ax-b\|_1 $$ equivalent to $$ \begin{align*} \arg \min_{x, t} \quad & \boldsymbol{1}^{T} t \\ \text{subject to} \quad &-t \preceq A x - b \preceq t \end{align*} $$ ?

Because $\|Ax-b\|_1$ is the sum of the absolute values of the vector $r=Ax-b$, and, at the optimal solution, $t_{opt}=|r|$ (where I use $|\cdot|$ to denote the element-wise absolute value). It requires a little bit of thought, but you can convince yourself that $-t \preceq r \preceq t$ is equivalent to $|r| \preceq t$. So the elements of $t$ must be greater than or equal to the absolute values of the corresponding elements of $r$, but, since we are minimizing the sum of the elements of $t$, they must take the smallest values possible; therefore they are not greater than but equal to the elements of $|r|$ at the solution $t_{opt}$.
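Spelling out that element-wise equivalence (just restating the argument above in symbols):

$$ -t_i \le r_i \le t_i \iff \big( r_i \le t_i \text{ and } -r_i \le t_i \big) \iff \max(r_i, -r_i) = |r_i| \le t_i . $$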

Secondly, how do we convert this into a standard linear programming problem, where we are optimizing for $x$?

According to the problem statement, the cost function $\mathbf{1}^Tt$ should be minimized over both $x$ and $t$, but the function itself only contains $t$. For a practical programming approach, we need to stack $x$ and $t$ into a single vector $q=[t^T\quad x^T]^T$, which will be the optimization variable. We can then rewrite the objective function as $$ \arg\min_q \left[\begin{array}{c}\mathbf{1}\\ \mathbf{0}\end{array}\right]^Tq. $$

We also need to rewrite the constraints in terms of $q$. Treating the lower and upper bounds separately, we have $-t\preceq Ax-b \implies -t -Ax\preceq -b\implies [-I\quad -A]q\preceq -b$. Similarly, we have $Ax-b\preceq t\implies -t + Ax\preceq b\implies [-I\quad A]q \preceq b$. These can be combined into a single inequality $$ \left[\begin{array}{cc}-I & -A \\ -I & A\end{array}\right]q\preceq \left[\begin{array}{c}-b \\ b \end{array}\right] $$ This form of the objective function and constraints in terms of $q$ can be plugged into a linear programming solver, and the optimal $x$ can be extracted as the last $m$ elements of the optimal $q$ (where $m$ is the number of columns of $A$).

The following Python code does this using scipy.optimize.linprog (note that the variables must be declared free, since linprog constrains them to be non-negative by default):

import numpy as np
import scipy.optimize as opt

def linear_minimum_mean_absolute_error(A, b):
    n = len(b)        # number of observations (rows of A)
    m = A.shape[1]    # number of variables (columns of A)

    # Cost vector for q = [t, x]: minimize 1^T t, zero cost on x.
    c = np.zeros(n + m)
    c[0:n] = 1.0

    # Stacked inequality constraints [[-I, -A], [-I, A]] q <= [-b; b].
    B = np.concatenate((np.concatenate((-np.eye(n), -A), 1),
                        np.concatenate((-np.eye(n), A), 1)), 0)
    w = np.concatenate((-b, b))

    # All variables are free: x may be negative, and t >= |Ax - b|
    # is already enforced by the constraints.
    res = opt.linprog(c, A_ub=B, b_ub=w, bounds=(None, None))

    # The optimal x sits in the last m elements of q = [t, x].
    return res.x[n:]
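A quick sanity check, with made-up data roughly matching the sizes mentioned in the comments (~1000 observations, ~20 variables); the $\ell_1$ fit should achieve an $\ell_1$ residual no larger than the least-squares fit:

import numpy as np

np.random.seed(0)
A = np.random.randn(1000, 20)
x_true = np.random.randn(20)
b = A @ x_true + np.random.laplace(scale=0.1, size=1000)

x_l1 = linear_minimum_mean_absolute_error(A, b)
x_ls = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.abs(A @ x_l1 - b).sum(), np.abs(A @ x_ls - b).sum())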

Gutskalk