7

I understand the whole $LU$-decomposition vs Gaussian elimination argument. The fact that you can isolate the computationally expensive elimination step and re-use the $L$ and $U$ matrices for $Ax=b$ style equations with different $b$'s makes sense to me. But I can't seem to find a reason why the $L$ and $U$ matrices are preferred over an $A^{-1}$ matrix, which can also be reused for multiple $b$'s. So that's my question: why is $LU$ preferred?

Erik
  • 193

3 Answers

10

As other commenters have noted, it's $\mathcal{O}(n^3)$ operations to compute either $A^{-1}$ or an $LU$ decomposition, and it's $\mathcal{O}(n^2)$ operations to solve $Ax = b$ once either $A^{-1}$ or an $LU$ decomposition has been computed. So from this point of view, both approaches are equally difficult.

The next level of granularity for analysing matrix computations is flop counts, where we count the total number of floating point operations (additions, subtractions, multiplications, and divisions) as a function of $n$. Usually, we truncate this count to its highest-order term.

Going through the analysis, it takes $2n^3/3$ operations to compute an $LU$ factorization of $A$ and $2n^3$ operations to compute $A^{-1}$. Moreover, it costs $2n^2$ operations to solve $Ax = b$ either by triangular substitution from an $LU$ factorization or by multiplying by $A^{-1}$. So even for multiple-right-hand-side problems, where we want to solve $Ax = b$ for $m \gg n$ different values of $b$, computing $A^{-1}$ gives you no benefit over an $LU$ factorization. And for single-right-hand-side problems, you've tripled the cost ($2n^3$ vs $2n^3/3$). Tripling the cost of an operation isn't a huge deal, but if you're going to make my code run at one third the speed, you should have a good reason. (If you're willing to accept a higher cost, you might as well solve $Ax = b$ by $QR$ factorization, which has superior stability properties due to the orthogonal, and thus perfectly well-conditioned, $Q$ matrix.)
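
For concreteness, here's a minimal NumPy/SciPy sketch of that bookkeeping (my own illustration, not part of the original answer; the matrix size, right-hand-side count, and random data are arbitrary choices): both routes pay a one-time $\mathcal{O}(n^3)$ cost up front, and each additional right-hand side costs $\mathcal{O}(n^2)$ either way.

```python
# A minimal sketch (not the answerer's code): one-time factorization vs
# one-time inversion, each reused for many right-hand sides.
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
n, m = 300, 1000                      # n x n system, m right-hand sides
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))

# One-time O(n^3) work: ~2n^3/3 flops for LU, ~2n^3 for the explicit inverse.
lu, piv = lu_factor(A)                # LU with partial pivoting
Ainv = np.linalg.inv(A)

# Per right-hand side, both routes cost O(n^2):
X_lu = lu_solve((lu, piv), B)         # two triangular solves per column
X_inv = Ainv @ B                      # one matrix-vector product per column

print(np.allclose(X_lu, X_inv))       # same answers, up to rounding
```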

A possible response might be: computing $A^{-1}$ is more accurate. But exactly the opposite is true: solving $Ax = b$ by computing $A^{-1}b$ is often much less accurate than computing $U^{-1}L^{-1}b$. The analysis is done in Higham's excellent monograph Accuracy and Stability of Numerical Algorithms, Section 14.1, where he also provides an example where solving $Ax = b$ via the explicit inverse in double precision produces a $\sim 10^6$ increase in the backward error versus an $LU$ factorization (with partial pivoting).
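
To see the accuracy gap numerically, here's a small sketch of my own (not Higham's example; the Hilbert matrix is just a convenient ill-conditioned test case): we compare the normwise relative backward error of the two routes.

```python
# A small accuracy sketch (my own illustration): on an ill-conditioned
# matrix, x = inv(A) @ b typically has a much larger backward error than
# an LU-based solve with partial pivoting.
import numpy as np
from scipy.linalg import hilbert, lu_factor, lu_solve

n = 12
A = hilbert(n)                         # famously ill-conditioned
b = A @ np.ones(n)

x_inv = np.linalg.inv(A) @ b           # solve via the explicit inverse
x_lu = lu_solve(lu_factor(A), b)       # solve via LU with partial pivoting

def backward_error(x):
    # Normwise relative residual, a standard backward-error measure.
    return np.linalg.norm(b - A @ x) / (np.linalg.norm(A) * np.linalg.norm(x))

print("inverse:", backward_error(x_inv))
print("LU     :", backward_error(x_lu))
```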

There are some rare instances when computing $A^{-1}$ may be valuable, but for solving $Ax = b$, you're taking three times as long to produce an answer with a million times the error.

eepperly16
  • 7,712
  • Ok, so it has more to do with floating point operations than with time complexity. Thank you! – Erik Jan 31 '20 at 18:14
  • I would say it's both. To be slightly glib, it's more time-consuming (but same big-O) and less accurate, so there's not a good reason to use it. – eepperly16 Jan 31 '20 at 18:37
  • If one is fine with only obtaining a factorization of the inverse (of elementary operations), then we can lower the computational cost to $4n^3/3$ (by simply doing two $LU$-factorizations), and having the same $2n^2$ cost for solving $Ax=b$. Accuracy would probably still be worse than $LU$-factorization though. – Sudix Feb 01 '20 at 03:43
  • @Erik you can also find an example related to the accuracy here. – Algebraic Pavel Feb 03 '20 at 18:10
4

How would we compute $A^{-1}$? We would need to solve $Ax = e_i$ for each standard basis vector $e_i$. And how would we do that? We wouldn't want to repeat the work of Gaussian elimination each time. So we would instead compute the LU factorization of $A$ for a one-time $O(n^3)$ cost, and use it to solve each system $Ax= e_i$ (for $i = 1, \ldots, n$). So we are going to compute the LU factorization of $A$ anyway. But once we have the LU factorization of $A$, there is no need for the further work of computing $A^{-1}$.
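
Here is a short sketch of that point (my own illustration, assuming NumPy/SciPy): each column of $A^{-1}$ is one triangular solve against the LU factors, so the factorization is the real work and the inverse is just $n$ extra solves stacked side by side.

```python
# A sketch (my own code): building A^{-1} column by column already
# requires the LU factorization, so the inverse is extra work on top
# of something you'd compute anyway.
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n))

lu, piv = lu_factor(A)                 # the one-time O(n^3) step
I = np.eye(n)

# Column i of A^{-1} is the solution of A x = e_i:
Ainv = np.column_stack([lu_solve((lu, piv), I[:, i]) for i in range(n)])

print(np.allclose(Ainv, np.linalg.inv(A)))
```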

littleO
  • 54,048
  • Correct me if I'm wrong, but if we are always using the same $A$, then $A^{-1}$ should also always be the same? So there would be no need to re-calculate it for different $e_i$. All you would have to do is write $x = A^{-1}e_i$. And since matrix-vector multiplication has time complexity $O(n^2)$, there shouldn't be any difference from $LU$-factorization. – Erik Jan 31 '20 at 16:06
  • 1
    @Erik The $i$th column of $A^{-1}$ is the solution to $Ax = e_i$. You can't construct $A^{-1}$ without first solving $Ax = e_i$ for $i = 1, \ldots, n$. So how are we going to solve $Ax = e_i$? – littleO Jan 31 '20 at 20:50
  • If I solve $AX=I$ (where $I$ is the identity matrix), I receive an $A^{-1}$ that can be used with any $e_i$ to find $x$ with just a simple multiplication. – Erik Feb 01 '20 at 17:00
  • @Erik One comment is that once you have $A^{-1}$, there is no need for any additional work to solve $Ax=e_i$. We would just look at the $i$th column of $A^{-1}$; that column is the solution to $Ax = e_i$. No need for even a simple multiplication. But the main issue is, how would we solve $AX = I$? Usually people would do that by first computing the LU factorization of $A$. – littleO Feb 01 '20 at 20:20
0

$L$ and $U$ are triangular matrices. The cost of inverting triangular matrices is much less than the cost of inverting a general matrix $A$.

Specifically, you can "invert" $L$ and $U$ using forward and backward substitution, which costs $\mathcal O(n^2)$ per right-hand side, whereas the cost of inverting $A$ is $\mathcal O(n^4)$ (correct me if I'm wrong).
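
Here's a minimal sketch of those substitutions (my own code; the factors come from SciPy's `lu`, which returns $A = PLU$): each solve is a single $\mathcal{O}(n^2)$ sweep through a triangle.

```python
# A sketch of the O(n^2) substitution step (my own illustration).
import numpy as np
from scipy.linalg import lu

def forward_substitution(L, b):
    # Solve L y = b for lower-triangular L, one row at a time.
    n = len(b)
    y = np.zeros(n)
    for i in range(n):
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

def backward_substitution(U, y):
    # Solve U x = y for upper-triangular U, from the last row up.
    n = len(y)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
b = rng.standard_normal(4)

P, L, U = lu(A)                        # A = P L U, so L U x = P^T b
x = backward_substitution(U, forward_substitution(L, P.T @ b))
print(np.allclose(A @ x, b))
```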

  • 3
    I think that the cost of finding $A^{-1}$ is $O(n^3)$, which is the same as finding the $LU$ matrices. However you only need to do this one time. Once you have the $LU$ or the $A^{-1}$, the cost of using either to find $x$ is $O(n^2)$, right? – Erik Jan 30 '20 at 22:03