
I am designing an algorithm that solves a linear system using the QR factorization, and the matrices I am dealing with are sparse and very large ($6000 \times 6000$). In order to improve the efficiency of the algorithm, I am trying to exploit the sparsity of the matrix by finding its bandwidth, but I have to run through the matrix a lot of times to find it, and it is taking too long.

The main idea I am using to find the bandwidth is:

  • for each row, find start(row) and end(row): these delimit the interval that contains the row's elements different from $0$;

  • to find start(row): iterate from the beginning of the row until the element is not $0$;

  • to find end(row): iterate backwards from the end of the row until the element is not $0$.

The problem is that I am running through many unnecessary $0$'s, but I cannot figure out how to avoid this and still guarantee a correct result. Thanks.

1 Answer


A sparse matrix is mostly made of zeros, so storing every element in a 2-dimensional array is an inefficient way to represent it: most of the array holds zeros, and scanning past them is the reason finding the bandwidth takes so long in your case.

In your case, if you have a matrix $M$ of size $n \times m$, you are using a 2-dimensional array with $n \cdot m$ entries. However, it is not necessary to do so.

A more efficient way is to store only the nonzero elements, using a 2-dimensional array with just three columns, e.g.:

 R  |  C  |  V
----+-----+-----
 0  |  0  |  1
 0  |  3  |  4
 1  |  5  |  11
 .  |  .  |  .
 .  |  .  |  .
 .  |  .  |  .

Here R is the row of the nonzero element, C is its column, and V is its value in the given matrix.
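As a rough illustration of the three-column layout, here is a minimal Python sketch that collects the (R, C, V) triplets from a dense matrix. The function name `to_triplets` and the small example matrix are my own assumptions for illustration. Note that this conversion still reads every entry once, so ideally you would build the triplets directly while assembling the matrix rather than from a dense array.

```python
def to_triplets(dense):
    """Return a list of (row, col, value) for every nonzero entry of a dense matrix."""
    return [
        (r, c, v)
        for r, row in enumerate(dense)
        for c, v in enumerate(row)
        if v != 0
    ]


# Hypothetical 2 x 6 example whose nonzeros match the first rows of the table above.
dense = [
    [1, 0, 0, 4, 0, 0],
    [0, 0, 0, 0, 0, 11],
]
triplets = to_triplets(dense)
print(triplets)
```

Only three triplets are stored here instead of twelve dense entries, and the saving grows with the sparsity of the matrix.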

In this way, if the matrix has $t$ nonzero elements, you only need to access $3t$ elements of the array above, instead of every entry of the full matrix.

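Now, in a single pass over the stored entries, you can find the minimum and maximum values of C for each row R; these are exactly your start(row) and end(row).

The bandwidth follows from the same pass by comparing each C with its R: the largest value of $C - R$ is the upper bandwidth and the largest value of $R - C$ is the lower bandwidth. I hope it makes sense.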
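A minimal sketch of this scan in Python, assuming the nonzeros are given as a list of (R, C, V) tuples. The function name `band_info` is hypothetical. Besides the per-row minimum and maximum column (start and end of each row), it also tracks the lower and upper bandwidths under the usual definition, i.e. the largest $R - C$ and the largest $C - R$ over all nonzeros; that definition is my assumption about what is needed for a banded QR.

```python
def band_info(triplets):
    """Scan (row, col, value) triplets once and return
    (row_start, row_end, lower_bandwidth, upper_bandwidth)."""
    row_start = {}  # smallest column index holding a nonzero, per row
    row_end = {}    # largest column index holding a nonzero, per row
    lower = 0       # max(row - col) over nonzeros (below the diagonal)
    upper = 0       # max(col - row) over nonzeros (above the diagonal)
    for r, c, v in triplets:
        if v == 0:
            continue  # ignore explicitly stored zeros
        row_start[r] = min(row_start.get(r, c), c)
        row_end[r] = max(row_end.get(r, c), c)
        lower = max(lower, r - c)
        upper = max(upper, c - r)
    return row_start, row_end, lower, upper


# Hypothetical example data.
triplets = [(0, 0, 1), (0, 3, 4), (1, 5, 11), (2, 1, 7)]
row_start, row_end, lower, upper = band_info(triplets)
print(row_start, row_end, lower, upper)
```

The cost is proportional to the number of stored nonzeros $t$, not to the full size of the matrix.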