Sparse matrices

Matrices with a small number of nonzero entries are called sparse matrices. Sparse matrices occur, for example, in scientific computing when describing discrete differential operators in the context of numerically solving partial differential equations.

Sparse matrices often have large dimensions, sometimes so large that the entire matrix (with zero entries) would not even fit in the available memory. This is one motivation for a special type for sparse matrices. Another motivation is better performance of operations where zero matrix entries can be avoided.

There are only a very limited number of algorithms for general, unstructured sparse matrices in linear algebra. Most of them are iterative in nature and based on efficient implementations of matrix-vector multiplication for sparse matrices.

Examples for sparse matrices are diagonal or banded matrices. The simple pattern of these matrices allows straightforward storing strategies; the principal diagonal and the sub- and super-diagonals are stored in 1D arrays. Conversion from a sparse representation to the classical array type and vice-versa can be done by the command diag.

In general, there is not such a simple structure and the description of sparse matrices requires special techniques and standards. Here we present a row and a column oriented type for sparse matrices, both available through the module scipy.sparse .

Sparse matrices

Figure 5.2: A stiffness matrix from a finite element model of an elastic plate. The pixels denote nonzero entries in the 1250 × 1250 matrix

Sparse matrix formats

The scipy.sparse module provides many different storing formats from sparse matrices. We describe here only the most important ones: CSR, CSC, and LIL. The LIL format should be used for generating and altering sparse matrices; CSR and CSC are efficient formats for matrix-matrix and matrix-vector operations.

Compressed sparse row

The compressed sparse row format (CSR) uses three arrays: data, indptr, and indices:

  • The 1D array data stores all the nonzero values in order. It has as many elements as there are nonzero elements, often denoted by the variable nnz.
  • The 1D array indptr contains integers such that indptr[i] is the index of the element in data, which is the first nonzero element of row i. If the entire row i is zero, then indptr[i]==indptr[i+1]. If the original matrix has m rows, then len(indptr)==m+1.
  • The 1D array indices contains the column index information in such a way that indices[indptr[i]:indptr[i+1]] is an integer array with the column indexes of the nonzero elements in row i. Obviously, len(indices)==len(data)==nnz.

Let's see an example: The CSR format of the matrix:

Compressed sparse row

is given by the three arrays:

data = (1. 2. 3. 4.)
indptr = (0 2 2 3 5)
indices = (0 2 0 0 3)

The module scipy.sparse provides a type, csr_matrix, with a constructor, which can be used in the following ways:

  • With a 2D array as argument
  • With a matrix in one of the other sparse formats in scipy.sparse
  • With a shape argument, (m,n), to generate a zero matrix in CSR format
  • By a 1D array for the data and an integer array ij with the shape (2,len(data)) such that ij[0,k] is the row index and ij[1,k] is the column index of data[k] of the matrix
  • The three arguments, dataindptr, and indices, can be given to the constructor directly

The first two options are there for conversion purposes while the last two directly define the sparse matrix.

Consider the above example in python look like:

import scipy.sparse as sp
A = array([[1,0,2,0],[0,0,0,0],[3.,0.,0.,0.],[1.,0.,0.,4.]])
AS = sp.csr_matrix(A)

Among others, the following attributes are provided:

AS.data      # returns array([ 1., 2., 3., 1., 4.]) 
AS.indptr    # returns array([0, 2, 2, 3, 5])
AS.indices   # returns array([0, 2, 0, 0, 3])
AS.nnz       # returns 5

Compressed Sparse Column

The CSR format has a column oriented twin - the compressed sparse column (CSC) format. The only difference in it compared to the CSR format is the definition of the indptr and indices arrays, which are now column-related. The type for the CSC format is csc_matrix and its use corresponds to csr_matrix, explained previously in this section.

Continuing the same example in CSC format:

import scipy.sparse as sp
A = array([[1,0,2,0],[0,0,0,0],[3.,0.,0.,0.],[1.,0.,0.,4.]])
AS = sp.csc_matrix(A)
AS.data         # returns array([ 1., 3., 1., 2., 4.]) 
AS.indptr       # returns array([0, 3, 3, 4, 5])
AS.indices      # returns array([0, 2, 3, 0, 3])
AS.nnz          # returns 5

Row-based linked list format

The linked list sparse format stores the nonzero matrix entries rowwise in a list data such that data[k] is a list of the nonzero entries in row k. If all entries in that row are 0, it contains an empty list.

A second list, rows, contains at position k a list of column indexes of the nonzero elements in row k.  Here is an example in Row-Based linked List Format (LIL) format:

import scipy.sparse as sp
A = array([[1,0,2,0],[0,0,0,0], [3.,0.,0.,0.], [1.,0.,0.,4.]]) 
AS = sp.lil_matrix(A)
AS.data     # returns array([[1.0, 2.0], [], [3.0], [1.0, 4.0]], dtype=object)
AS.rows     # returns array([[0, 2], [], [0], [0, 3]], dtype=object)
AS.nnz      # returns 5

Altering and slicing matrices in LIL format

The LIL format is the one best suited for slicing, that is, extracting submatrices in LIL format, and for changing the sparsity pattern by inserting nonzero elements. Slicing is demonstrated by the next example:

BS = AS[1:3,0:2]
BS.data     # returns array([[], [3.0]], dtype=object)
BS.rows     # returns array([[], [0]], dtype=object)

Insertion of a new nonzero element automatically updates the attributes:

AS[0,1] = 17 
AS.data # returns array([[1.0, 17.0, 2.0], [], [3.0], [1.0, 4.0]])
AS.rows              # returns array([[0, 1, 2], [], [0], [0, 3]])
AS.nnz               # returns 6

These operations are discouraged in the other sparse matrix formats as they are extremely inefficient.

Generating sparse matrices

The NumPy commands eye, identity, diag, and rand have their sparse counterparts. They take an additional argument; it specifies the sparse matrix format of the resulting matrix.

The following commands generate the identity matrix but in different sparse matrix formats:

import scipy.sparse as sp
sp.eye(20,20,format = 'lil') 
sp.spdiags(ones((20,)),0,20,20, format = 'csr') 
sp.identity(20,format ='csc')

The sp.rand command takes an additional argument describing the density of the generated random matrix. A dense matrix has density 1 while a zero matrix has density 0:

import scipy.sparse as sp 
AS=sp.rand(20,200,density=0.1,format=’csr’)
AS.nnz # returns 400

There is no direct correspondence to the NumPy command zeroes. Matrices completely filled with zeros are generated by instantiating the corresponding type with the shape parameters as constructor parameters:

import scipy.sparse as sp
Z=sp.csr_matrix((20,200))
Z.nnz    # returns 0

Sparse matrix methods

There are methods to convert one sparse type into another or into an array:

AS.toarray # converts sparse formats to a numpy array 
AS.tocsr
AS.tocsc
AS.tolil

The type of a sparse matrix can be inspected by the methods issparseisspmatrix_lil, isspmatrix_csr, and isspmatrix_csc.

Elementwise operations +, *, /, and ** on sparse matrices are defined as for NumPy arrays. Regardless of the sparse matrix format of the operands, the result is always a csr_matrix. Applying elementwise operating functions to sparse matrices requires first transforming them to either CSR or CSC format and applying the functions to their data attribute, as demonstrated by the next example.

The elementwise sine of a sparse matrix can be defined by an operation on its data attribute:

import scipy.sparse as sp
def sparse_sin(A):
    if not (sp.isspmatrix_csr(A) or sp.isspmatrix_csc(A)):
        A = A.tocsr()
A.data = sin(A.data)
return A

For matrix-matrix or matrix-vector multiplications, there is a sparse matrix method, dot. It returns either a csr_matrix or a 1D NumPy array:

import scipy.sparse as sp
A = array([[1,0,2,0],[0,0,0,0],[3.,0.,0.,0.],[1.,0.,0.,4.]])
AS = sp.csr_matrix(A)
b = array([1,2,3,4])
c = AS.dot(b)      # returns array([ 7., 0., 3., 17.]) 
C = AS.dot(AS)     # returns  csr_matrix
d = dot(AS,b)      # does not return the expected result! 

Tip

Avoid using NumPy's command dot on sparse matrices, as this might lead to unexpected results. Use the command dot from scipy.sparse instead.

Other linear algebra operations such as system solving, least squares, eigenvalues, and singular values are provided by the scipy.sparse.linalg module.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.141.19.185