16

I have an assignment for my Numerical Methods class to write a function that finds the PA=LU decomposition for a given matrix A and returns P, L, and U.

Never mind the coding problems for a moment; there is a major mathematical problem I'm having that my professor seems incapable of addressing, and after searching for hours (perhaps inefficiently) I could find no accounting of it. How do we extricate the permutation matrices from the row elimination matrices?

Essentially the idea, if I understand it correctly, is that we perform a series of transformations on a matrix $A$ by applying successive lower triangular matrices that eliminate single elements, thus $L_nL_{n-1}...L_2L_1A = U$. In my understanding, this is computationally useful because lower triangular atomic matrices can be inverted by changing the sign of the off-diagonal element, so $A = L_1^{-1}L_2^{-1}...U$.
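To make that concrete, here is a minimal numpy sketch of the no-pivoting version as I understand it (my own illustration, assuming every pivot happens to be nonzero; `atomic_L` is just a name I made up):

```python
import numpy as np

def atomic_L(U, k):
    """Atomic lower triangular matrix that eliminates column k of U below row k."""
    n = U.shape[0]
    L = np.eye(n)
    L[k+1:, k] = -U[k+1:, k] / U[k, k]  # negated multipliers
    return L

A = np.array([[2., 1., 1.],
              [4., 3., 3.],
              [8., 7., 9.]])
U = A.copy()
factors = []
for k in range(A.shape[0] - 1):
    Lk = atomic_L(U, k)
    factors.append(Lk)
    U = Lk @ U                    # builds L_n ... L_1 A = U

# A = L_1^{-1} L_2^{-1} ... U; inverting an atomic factor is just a sign
# flip of its off-diagonal column: L_k^{-1} = 2I - L_k.
L = np.eye(A.shape[0])
for Lk in factors:
    L = L @ (2 * np.eye(A.shape[0]) - Lk)
assert np.allclose(L @ U, A)
```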

That's all fine (assuming I'm correct), but the introduction of pivot matrices between each $L_j$ seems to make the problem intractable. In every accounting I've seen, some sorcery occurs that looks like this: $$L_nP_nL_{n-1}...P_3L_2P_2L_1P_1A = U \Rightarrow P_nP_{n-1}...P_2P_1L_nL_{n-1}...L_2L_1A = U$$

And no one bothers to explain how this happens or in fact even states it explicitly.

If possible I would like to know
a) Is this operationally acceptable?

b) What properties of these respective classes of matrices make this kind of willy-nilly commutation legal?

c) Is my understanding of the method and its advantages accurate?

BenL
  • 1,031
  • A well asked question, +1. – Calle Dec 04 '16 at 14:28
  • 2
    You might draw some confidence in such a manipulation from the observation that if all the rows were sorted in advance so that the (partial) pivoting "finds" the location of largest column entries in descending order, then the factorization would appear much as you describe. – hardmath Dec 04 '16 at 14:31
  • Usually, one tries to find the factorization as $A=P·L·U$. In your formula, it looks like $A=L^{-1}·P^{-1}·U$, where you might find that the factors are not commutative in that reordering. – Lutz Lehmann Dec 04 '16 at 14:34
  • 1
    I was specifically asked to find PA=LU. And not to be difficult, but I would like rigorous arguments along with appeals to intuition when possible :) – BenL Dec 04 '16 at 14:50
  • It would be worth adding to the opening of your Question that you were asked to find $PA = LU$. You start using those symbols without giving them a concrete definition. – hardmath Dec 04 '16 at 15:47
  • 1
    Olver and Shakiban, in Example 1.12 of Chapter 1 of their Applied Linear Algebra, show how to find a $PA=LU$ decomposition of a matrix without messing around with long products of elementary matrices. This is far from a rigorously written proof, but with some mathematical experience you should be able to extract one from their example. I would be happy to know a text that gives a rigorous argument; this is one of many blind spots in the coverage of linear algebra (applied texts don't care enough about rigor; pure texts don't care enough about $PA=LU$). – darij grinberg Dec 04 '16 at 23:25
  • I don't have the book, can't find an up to date PDF version, and will not have access to an academic library until after this is due. Thanks though – BenL Dec 04 '16 at 23:40
  • 2
    Okay. Are you aware of http://gen.lib.rus.ec/ ? :) Otherwise, I've just realized there seems to be a rigorous source for such algorithms online: the book "Algorithmic Linear Algebra" by Herbert Möller ( https://wwwmath.uni-muenster.de/u/mollerh/pages/linalgebra.html ). The best available version is in German, but the $PA=LU$ decomposition is contained in the part that was translated into English; see Theorem 1.5.18 in https://wwwmath.uni-muenster.de/u/mollerh/data/ALAEng.pdf . Caveat lector: I have not read that book. – darij grinberg Dec 05 '16 at 00:04
  • If you end up with a readable rigorous writeup of the method used by Olver/Shakiban, please share it with the world! What Möller does is different (more in line with Calle's answer). – darij grinberg Dec 05 '16 at 00:08
  • The link you provided won't help me get a working algorithm running in time (I have an operational one built off of pseudocode my professor created that has no obvious connection to the mathematics) but it is a much clearer explanation of the math than I had seen before. Thanks! – BenL Dec 05 '16 at 13:55
  • So it seems that Herbert Möller's website is no longer available. Well, Library Genesis is: http://gen.lib.rus.ec/search.php?req=herbert+m%C3%B6ller&lg_topic=libgen&open=0&view=simple&res=25&phrase=0&column=def – darij grinberg Dec 17 '17 at 04:45

3 Answers

7

No, in general permutation matrices do not commute.

It seems like you are using the Doolittle algorithm, so I am going to assume that indeed you are.

Let $U_i$ be the $i$-th step in your LU decomposition of $A$, so $$\begin{align} U_0 &= A \\ U_1 &= L_1P_1U_0 \\ \vdots \\ U_n &= L_nP_nU_{n-1} \end{align}$$

This corresponds well to how one would do LU decomposition by hand: bring the row with the largest element in the current column into the leading position (i.e. multiply with $P_k$), then eliminate that column in the rows below (i.e. multiply with $L_k$).
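In code, one such step could look like the following numpy sketch (my own illustration; the helper name `pivot_and_eliminate` is hypothetical, and the step index is 0-based):

```python
import numpy as np

def pivot_and_eliminate(U, k):
    """One step of the recurrence U_{k+1} = L_k P_k U_k:
    P_k swaps row k with the row holding the largest entry in column k,
    L_k then eliminates that column below the diagonal."""
    n = U.shape[0]
    P = np.eye(n)
    p = k + np.argmax(np.abs(U[k:, k]))  # partial pivoting
    P[[k, p]] = P[[p, k]]
    PU = P @ U
    L = np.eye(n)
    L[k+1:, k] = -PU[k+1:, k] / PU[k, k]  # negated multipliers
    return L, P, L @ PU
```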

As you remark, the $L_i$ will be atomic lower triangular matrices, the non-zero elements all being in column $i$. The inverse of $L_i$ can be constructed by negating all off-diagonal elements, see Wikipedia.

The matrix $P_j$ will be a permutation matrix switching row $j$ with some row $k \geq j$, if multiplied on the left of the matrix you want to transform. The inverse of $P_j$ is $P_j$ itself (since $P_j$ switches row $j$ with row $k$, you can undo this by doing the same thing).

Consider the product $P_jL_i$ for $i < j$. $P_j$ will switch two rows of $L_i$: row $j$ and row $k$, where $k \geq j > i$. This switches the following elements of $L_i$: $$\begin{align} L_{j,i} &\leftrightarrow L_{k,i} \\ L_{j,j} &\leftrightarrow L_{k,j} \\ L_{j,k} &\leftrightarrow L_{k,k} \end{align}$$ Let $L_i'$ be the matrix produced by switching just the off-diagonal elements (the first switch above). Note that this is still an atomic lower triangular matrix. We can then produce $P_jL_i$ by switching column $j$ with column $k$ in $L_i'$ (this performs the last two switches), which is multiplying by $P_j$ on the right. Here it is important that $i < j, k$, so column $i$ in $L_i'$ is not changed! In other words: $$P_j L_i = L_i' P_j$$

Thus, from your equation $$L_nP_nL_{n-1}...P_3L_2P_2L_1P_1A = U$$ you can produce $$L_n^S L_{n-1}^S \cdots L_1^S P_n P_{n-1} \cdots P_1A = U,$$ which can be transformed (note that each $L_i^S$ is still atomic lower triangular) into $$PA = LU$$ by taking $$P = P_nP_{n-1} \cdots P_1$$ and $$L = (L_n^S L_{n-1}^S \cdots L_1^S)^{-1}.$$

Here, $L_i^S$ is the matrix made by taking $L_i$ and applying all $P_j$ (on the left) for $j > i$ to the off-diagonal elements.

You do this as follows. Starting with $$L_nP_nL_{n-1}...P_3L_2P_2L_1P_1A = U,$$ move the matrix $P_2$ to the right using $P_2L_1 = L_1' P_2$, producing $$L_nP_nL_{n-1}...P_3L_2L_1'P_2P_1A = U.$$ Then do the same for the matrix $P_3$, except this one has to be moved to the right twice, using $P_3L_2 = L_2'P_3$ and $P_3L_1' = L_1''P_3$: $$L_nP_nL_{n-1}...L_2'L_1''P_3P_2P_1A = U,$$ and so on for every $P_j$.
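In an implementation you never need to form these elementary matrices explicitly: pushing the $P_j$ to the right amounts to applying each new row swap to the off-diagonal part of the $L$ built so far. A sketch in numpy (my own illustration, with the hypothetical name `plu`):

```python
import numpy as np

def plu(A):
    """PA = LU with partial pivoting. Each row swap is applied to U, to P,
    and to the off-diagonal part of the L built so far (the L_i^S trick:
    later permutations act only on the multipliers of earlier columns)."""
    n = A.shape[0]
    U = A.astype(float)
    L = np.eye(n)
    P = np.eye(n)
    for k in range(n - 1):
        p = k + np.argmax(np.abs(U[k:, k]))   # partial pivoting
        if p != k:
            U[[k, p]] = U[[p, k]]
            P[[k, p]] = P[[p, k]]
            L[[k, p], :k] = L[[p, k], :k]     # swap only the off-diagonal part
        L[k+1:, k] = U[k+1:, k] / U[k, k]     # store multipliers
        U[k+1:, k:] -= np.outer(L[k+1:, k], U[k, k:])
    return P, L, U

A = np.array([[2., 1., 1.], [4., 3., 3.], [8., 7., 9.]])
P, L, U = plu(A)
assert np.allclose(P @ A, L @ U)
```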

As an example of $P_j L_i = L_i' P_j$ consider $$P = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$$ $$L = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 3 & 0 & 1 \end{pmatrix}$$ $$L' = \begin{pmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 2 & 0 & 1 \end{pmatrix}$$

$$PL = \begin{pmatrix} 1 & 0 & 0 \\ 3 & 0 & 1 \\ 2 & 1 & 0 \end{pmatrix}$$ $$L'P = \begin{pmatrix}1 & 0 & 0 \\ 3 & 0 & 1 \\ 2 & 1 & 0 \end{pmatrix}$$

So, to summarize: these special matrices almost commute; only small changes are needed to swap them, and all the important properties of the matrices are preserved when swapping.

Calle
  • 7,857
  • Thanks for this; I still don't quite follow though; I see that $PL=L'P$, but is that helpful? $PL$ isn't lower triangular and so can't be easily inverted. I also don't quite see how you get to the expression involving $L''$, what those matrices look like, or whether they're lower triangular. – BenL Dec 04 '16 at 16:23
  • @BenL, I'll try to explain it better then. – Calle Dec 04 '16 at 17:26
  • @BenL, I have expanded my answer slightly. I also changed some syntax: the matrices I previously called $L_i''$ are now $L_i^S$. The matrices $PL$ or $L'P$ do not need to be triangular, but you need to collect the lower triangular matrices together, since a product of lower triangular matrices will be lower triangular. – Calle Dec 04 '16 at 17:36
  • @Calle That's hugely helpful. In terms of algorithm construction (which I know is something of a separate question) I built a function that permutes the rows and another that eliminates the columns by building the appropriate matrices, applying them to $U_i$ and storing those matrices individually. Can I simply use those matrices, inverted and with the order switched, as my L and P? Or do I have to change the Ls somehow? – BenL Dec 04 '16 at 18:22
  • You have to make sure to apply the permutations to $L$ as well as $U$; and only on the non-diagonal elements of $L$. – Calle Dec 04 '16 at 18:30
  • And can those permutations be applied as I go? As I construct each L_j just switch the elements necessary before I store it? – BenL Dec 04 '16 at 18:38
  • Yes, I think so. But remember that the permutations constructed later also need to be applied to the earlier $L_i$'s. – Calle Dec 04 '16 at 18:40
  • @Calle. The inverse of $P_j$ is not $P_j$. It is $P_j^T$ – Herman Jaramillo Jan 26 '19 at 20:17
  • 2
    @HermanJaramillo, for general permutation matrices $P$ we have $P^{-1} = P^T$. But $P_j$ is a special permutation matrix that only switches a pair of rows, so in this case the inverse, the transpose and the original matrix will all be the same ($P_j$ will have a one in position $j, k$ and a one in position $k, j$ - the other ones will be on the diagonal - transposing it will not change it). – Calle Jan 26 '19 at 20:43
  • @Calle : Thanks for the clarification. That is important. If $P_j$ is a simple permutation (a single interchange, not a composition of them) it is equal to its transpose. – Herman Jaramillo Jan 26 '19 at 22:38
  • @HermanJaramillo, yes, a permutation that is a 2-cycle (a 2-cycle is also called a transposition, which might be confusing in this context), i.e. it only switches two elements, leaving the others as they are, is its own inverse. As explained in the article on Doolittle's algorithm I linked, $P_j$ just interchanges 2 rows, and therefore will be a 2-cycle. – Calle Jan 26 '19 at 23:26
2

I found an answer based on matrix algebra rather than components here. Here's my attempt to reproduce it with some more detail.

Starting from $$L_3 P_3 L_2 P_2 L_1 P_1 A = U$$ Define $$\Lambda_1 = P_3 P_2 L_1 P_2^{-1} P_3^{-1}$$ $$\Lambda_2 = P_3 L_2 P_3^{-1}$$ $$\Lambda_3 = L_3$$

Calculating $$\Lambda_3 \Lambda_2 \Lambda_1 P_3 P_2 P_1 = (L_3)(P_3 L_2 P_3^{-1})(P_3 P_2 L_1 P_2^{-1} P_3^{-1}) P_3 P_2 P_1 = L_3 P_3 L_2 P_2 L_1 P_1$$ We can swap this into our starting formula to get $$\Lambda_3 \Lambda_2 \Lambda_1 P_3 P_2 P_1 A = U$$ And letting $\Lambda = \Lambda_1^{-1} \Lambda_2^{-1} \Lambda_3^{-1}$ and $P = P_3 P_2 P_1$ we have $P A = \Lambda U$.
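Before going on, a quick numerical sanity check of that telescoping product may be reassuring. This is my own numpy sketch (the helpers `transposition` and `atomic` are made-up names, and indices are 0-based):

```python
import numpy as np
rng = np.random.default_rng(0)
n = 4

def transposition(i):
    """Permutation swapping row i with some row below it (or itself)."""
    P = np.eye(n)
    j = rng.integers(i, n)
    P[[i, j]] = P[[j, i]]
    return P

def atomic(i):
    """Atomic lower triangular matrix with multipliers in column i."""
    L = np.eye(n)
    L[i+1:, i] = rng.standard_normal(n - 1 - i)
    return L

P1, P2, P3 = transposition(0), transposition(1), transposition(2)
L1, L2, L3 = atomic(0), atomic(1), atomic(2)

Lam1 = P3 @ P2 @ L1 @ np.linalg.inv(P2) @ np.linalg.inv(P3)
Lam2 = P3 @ L2 @ np.linalg.inv(P3)
Lam3 = L3

lhs = Lam3 @ Lam2 @ Lam1 @ P3 @ P2 @ P1
rhs = L3 @ P3 @ L2 @ P2 @ L1 @ P1
assert np.allclose(lhs, rhs)                # the telescoping product
assert np.allclose(Lam1, np.tril(Lam1))     # Λ1 is indeed lower triangular
```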

Now it remains to show that the $\Lambda_i$ are lower triangular and to answer your question about commuting head on. Notice that the $P_i$ only affect rows $i$ and below because the elimination algorithm has completed its work with the rows above $i$. The inverses of the $P_i$ will therefore also only need to affect those rows to unscramble things. Examining $\Lambda_1 = P_3 P_2 L_1 P_2^{-1} P_3^{-1}$ we see that $\Lambda_1$ swaps around rows 2 and below, applies $L_1$ to add multiples of the first row to those lower rows, then unswaps the rows. This is equivalent to adding appropriate multiples of $L_1$ to the original, unswapped rows, which we know is an operation described by a lower triangular matrix. Therefore the matrix $\Lambda_1$ which represents the aggregate operation $P_3 P_2 L_1 P_2^{-1} P_3^{-1}$ must be lower triangular.

In summary, the matrices don't commute as you had asked in your original question, but we can change an $L_i$ into a related matrix $\Lambda_i$ by left- and right-multiplying by some permutation matrices. Because the commuting fails, our formula for $\Lambda$ isn't particularly pretty: $$\Lambda = P_3 P_2 L_1^{-1} P_2^{-1} L_2^{-1} P_3^{-1} L_3^{-1}$$ However, we can get a neater conclusion if we return to $$L_3 P_3 L_2 P_2 L_1 P_1 A = U$$ and let $$M = L_3 P_3 L_2 P_2 L_1 P_1$$ so that $M A = U$. Then because $P A = \Lambda U$ we must have $P = \Lambda M$ or $\Lambda = P M^{-1}$. So we can calculate $\Lambda$ from our original $M$ just by applying the composition of our permutations to $M^{-1}$, which isn't quite as good as commuting but isn't half bad either.
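Continuing the numerical sketch above, this closing observation is easy to check as well:

```python
# Λ = P M^{-1}, reusing P1..P3, L1..L3, Lam1..Lam3 from the check above.
M = L3 @ P3 @ L2 @ P2 @ L1 @ P1                  # M A = U
P = P3 @ P2 @ P1
Lam = P @ np.linalg.inv(M)
assert np.allclose(Lam, np.tril(Lam))            # lower triangular
assert np.allclose(np.diag(Lam), 1.0)            # ones on the diagonal
assert np.allclose(Lam, np.linalg.inv(Lam1) @ np.linalg.inv(Lam2) @ np.linalg.inv(Lam3))
```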

kuzzooroo
  • 611
2

$\require{color}$

I found @Calle's explanation interesting but had a hard time following the arguments. @kuzzooroo's explanation fits my way of thinking better, but the arguments are too dense. I decided to start a new answer since my arguments will not fit in a comment slot.

Following @Calle: note that, since each $P_i$ is a single interchange (a transposition), $P_i^{-1}=P_i$, because doing the interchange twice brings the matrix back to its original form. We now have to prove that $\Lambda$ is a lower triangular matrix with ones on the diagonal.

$\Lambda_3$ is, of course, lower triangular with ones on the diagonal. To prove that $\Lambda_2$ is lower triangular with ones on the diagonal we take into account the fact that $L_2$ is atomic. That is,

\begin{eqnarray*} L_2 = \begin{pmatrix} 1 & 0 & \cdots & 0 & \cdots & \cdots &\cdots & \cdots & 0 \\ 0 & 1 & \cdots & 0 & \cdots & \cdots &\cdots & \cdots & 0 \\ \vdots & l_{32} & \ddots & \vdots & \vdots &\vdots & \vdots & \vdots & \vdots\\ \vdots & \vdots & \cdots & 1 & 0 & \cdots &\cdots& \cdots & \vdots \\ \vdots & l_{i2} & \cdots & 0 & 1 & 0 &\cdots & \cdots & \vdots \\ \vdots & \vdots & \cdots & \vdots & \ddots & \ddots & \ddots & \cdots & \vdots \\ \vdots & l_{k2} & \vdots & \vdots & \vdots & 0 & 1 & 0 & \vdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots &\ddots & \vdots & 0 \\ 0 & 0 & \cdots & 0 & \cdots & \cdots &\cdots & 0 & 1 \end{pmatrix} \nonumber. \end{eqnarray*}

Let us assume that $P_3$ interchanges row $i$ with row $k$ (note that both $i$ and $k$ are larger than 2). We have that

\begin{eqnarray*} P_3 L_2 = \begin{pmatrix} 1 & 0 & \cdots & 0 & \cdots & \cdots &\cdots & \cdots & 0 \\ 0 & 1 & \cdots & 0 & \cdots & \cdots &\cdots & \cdots & 0 \\ \vdots & l_{32} & \ddots & \vdots & \vdots &\vdots & \vdots & \vdots & \vdots\\ \vdots & \vdots & \cdots & 1 & 0 & \cdots &\cdots& \cdots & \vdots \\ \vdots & \boxed{l_{k2}} & \vdots & \vdots & \textcolor{green}{0} & 0 & \textcolor{blue}{1} & 0 & \vdots \\ \vdots & \vdots & \cdots & \vdots & \ddots & \ddots & \ddots & \cdots & \vdots \\ \vdots & \boxed{l_{i2}} & \cdots & 0 & \textcolor{green}{1} & 0 & \textcolor{blue}{0} & \cdots & \vdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots &\ddots & \vdots & 0 \\ 0 & 0 & \cdots & 0 & \cdots & \cdots &\cdots & 0 & 1 \end{pmatrix} \nonumber. \end{eqnarray*}

Finally, $P_3 L_2 P_3^{-1}= P_3 L_2 P_3$ interchanges column $i$ (green) with column $k$ (blue) of $P_3 L_2$. The $1$ in column $k$ (blue) takes the place of the $0$ in column $i$ (green). In the same way, the green $1$ in row $k$ takes the position of the blue $0$ in the same row.

That is

\begin{eqnarray*} P_3 L_2 P_3^{-1} = \begin{pmatrix} 1 & 0 & \cdots & 0 & \cdots & \cdots &\cdots & \cdots & 0 \\ 0 & 1 & \cdots & 0 & \cdots & \cdots &\cdots & \cdots & 0 \\ \vdots & l_{32} & \ddots & \vdots & \vdots &\vdots & \vdots & \vdots & \vdots\\ \vdots & \vdots & \cdots & 1 & 0 & \cdots &\cdots& \cdots & \vdots \\ \vdots & \boxed{l_{k2}} & \vdots & \vdots & \textcolor{green}{1} & 0 & \textcolor{blue}{0} & 0 & \vdots \\ \vdots & \vdots & \cdots & \vdots & \ddots & \ddots & \ddots & \cdots & \vdots \\ \vdots & \boxed{l_{i2}} & \cdots & 0 & \textcolor{green}{0} & 0 & \textcolor{blue}{1} & \cdots & \vdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots &\ddots & \vdots & 0 \\ 0 & 0 & \cdots & 0 & \cdots & \cdots &\cdots & 0 & 1 \end{pmatrix} \nonumber. \end{eqnarray*}

This is, again, an atomic matrix with the entries $l_{i2}$ and $l_{k2}$ interchanged. In the same way, $P_2 L_1 P_2^{-1}$ is atomic, and so is $\Lambda_1 = P_3 P_2 L_1 P_2^{-1} P_3^{-1}$.

In general for

\begin{eqnarray*} U= L_{n-1} P_{n-1} L_{n-2} P_{n-2} \cdots P_2 L_1 P_1 A, \end{eqnarray*} we define

\begin{eqnarray*} \Lambda_i = P_{n-1} \cdots P_{i+2} P_{i+1} L_i P_{i+1}^{-1} P_{i+2}^{-1} \cdots P_{n-1}^{-1}, \end{eqnarray*} where an empty product is understood as the identity (so $\Lambda_{n-1} = L_{n-1}$). The same procedure shown above guarantees that each $\Lambda_i$ is atomic, and so $\Lambda=\prod_i \Lambda_i$ is lower triangular with ones on the diagonal.

Hence this completes what is needed to verify that $PA=LU$.
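As a final sanity check of the general formula, here is my own small numpy sketch (0-based indices): conjugating an atomic $L_i$ by transpositions that only touch rows below row $i$ merely permutes the multipliers within column $i$, so each $\Lambda_i$ stays atomic.

```python
import numpy as np
rng = np.random.default_rng(1)
n = 6

# Random transpositions P_i (row i with a row below it) and atomic L_i.
Ps, Ls = [], []
for i in range(n - 1):
    P = np.eye(n); j = rng.integers(i, n); P[[i, j]] = P[[j, i]]
    L = np.eye(n); L[i+1:, i] = rng.standard_normal(n - 1 - i)
    Ps.append(P); Ls.append(L)

for i in range(n - 1):
    Lam = Ls[i]
    for P in Ps[i+1:]:        # apply the later P's as P (.) P^{-1}
        Lam = P @ Lam @ P.T   # P^{-1} = P^T (= P) for a transposition
    E = Lam - np.eye(n)
    assert np.allclose(E, np.tril(E, -1))             # strictly lower triangular
    assert np.allclose(np.delete(E, i, axis=1), 0.0)  # only column i is nonzero
```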

egreg
  • 244,946
  • 2
    The standard MathJax syntax for coloring an object is \color{green}{x} (yes, that's it!). However if you add \require{color} in one of your math formulas, you can use the more proper \textcolor{green}{x} syntax. – egreg Jan 28 '19 at 10:05
  • Thanks @egreg. I knew I had an error and did not know how to fix it. That is helpful. – Herman Jaramillo Jan 28 '19 at 19:09