Sparse arrays (scipy.sparse)#

SciPy 2-D sparse array package for numeric data.

Note

This package is switching to an array interface, compatible with NumPy arrays, from the older matrix interface. We recommend that you use the array objects (bsr_array, coo_array, etc.) for all new work.

When using the array interface, please note that:

  • x * y no longer performs matrix multiplication, but element-wise multiplication (just like with NumPy arrays). To make code work with both arrays and matrices, use x @ y for matrix multiplication.

  • Operations such as sum, that used to produce dense matrices, now produce arrays, whose multiplication behavior differs similarly.

  • Sparse arrays use array style slicing operations, returning scalars, 1D, or 2D sparse arrays. If you need 2D results, use an appropriate index. E.g. A[:, i, None] or A[:, [i]].

The construction utilities (eye, kron, random, diags, etc.) have appropriate replacements (see Building sparse arrays).

For more information see Migration from spmatrix to sparray.

Submodules#

csgraph

Compressed sparse graph routines (scipy.sparse.csgraph)

linalg

Sparse linear algebra (scipy.sparse.linalg)

Sparse array classes#

bsr_array(arg1[, shape, dtype, copy, ...])

Block Sparse Row format sparse array.

coo_array(arg1[, shape, dtype, copy, maxprint])

A sparse array in COOrdinate format.

csc_array(arg1[, shape, dtype, copy, maxprint])

Compressed Sparse Column array.

csr_array(arg1[, shape, dtype, copy, maxprint])

Compressed Sparse Row array.

dia_array(arg1[, shape, dtype, copy, maxprint])

Sparse array with DIAgonal storage.

dok_array(arg1[, shape, dtype, copy, maxprint])

Dictionary Of Keys based sparse array.

lil_array(arg1[, shape, dtype, copy, maxprint])

Row-based LIst of Lists sparse array.

sparray()

This class provides a base class for all sparse arrays.

Building sparse arrays#

diags_array(diagonals, /, *[, offsets, ...])

Construct a sparse array from diagonals.

eye_array(m[, n, k, dtype, format])

Identity matrix in sparse array format

random_array(shape, *[, density, format, ...])

Return a sparse array of uniformly random numbers in [0, 1)

block_array(blocks, *[, format, dtype])

Build a sparse array from sparse sub-blocks

Combining arrays#

kron(A, B[, format])

kronecker product of sparse matrices A and B

kronsum(A, B[, format])

kronecker sum of square sparse matrices A and B

block_diag(mats[, format, dtype])

Build a block diagonal sparse matrix or array from provided matrices.

tril(A[, k, format])

Return the lower triangular portion of a sparse array or matrix

triu(A[, k, format])

Return the upper triangular portion of a sparse array or matrix

hstack(blocks[, format, dtype])

Stack sparse matrices horizontally (column wise)

vstack(blocks[, format, dtype])

Stack sparse arrays vertically (row wise)

Sparse tools#

save_npz(file, matrix[, compressed])

Save a sparse matrix or array to a file using .npz format.

load_npz(file)

Load a sparse array/matrix from a file using .npz format.

find(A)

Return the indices and values of the nonzero elements of a matrix

get_index_dtype([arrays, maxval, check_contents])

Based on input (integer) arrays a, determine a suitable index data type that can hold the data in the arrays.

safely_cast_index_arrays(A[, idx_dtype, msg])

Safely cast sparse array indices to idx_dtype.

Identifying sparse arrays#

issparse(x)

Is x of a sparse array or sparse matrix type?

Sparse matrix classes#

bsr_matrix(arg1[, shape, dtype, copy, ...])

Block Sparse Row format sparse matrix.

coo_matrix(arg1[, shape, dtype, copy, maxprint])

A sparse matrix in COOrdinate format.

csc_matrix(arg1[, shape, dtype, copy, maxprint])

Compressed Sparse Column matrix.

csr_matrix(arg1[, shape, dtype, copy, maxprint])

Compressed Sparse Row matrix.

dia_matrix(arg1[, shape, dtype, copy, maxprint])

Sparse matrix with DIAgonal storage.

dok_matrix(arg1[, shape, dtype, copy, maxprint])

Dictionary Of Keys based sparse matrix.

lil_matrix(arg1[, shape, dtype, copy, maxprint])

Row-based LIst of Lists sparse matrix.

spmatrix()

This class provides a base class for all sparse matrix classes.

Building sparse matrices#

eye(m[, n, k, dtype, format])

Sparse matrix with ones on diagonal

identity(n[, dtype, format])

Identity matrix in sparse format

diags(diagonals[, offsets, shape, format, dtype])

Construct a sparse matrix from diagonals.

spdiags(data, diags[, m, n, format])

Return a sparse matrix from diagonals.

bmat(blocks[, format, dtype])

Build a sparse array or matrix from sparse sub-blocks

random(m, n[, density, format, dtype, rng, ...])

Generate a sparse matrix of the given shape and density with randomly distributed values.

rand(m, n[, density, format, dtype, rng])

Generate a sparse matrix of the given shape and density with uniformly distributed values.

Combining matrices use the same functions as for Combining arrays.

Identifying sparse matrices#

issparse(x)

Is x of a sparse array or sparse matrix type?

isspmatrix(x)

Is x of a sparse matrix type?

isspmatrix_csc(x)

Is x of csc_matrix type?

isspmatrix_csr(x)

Is x of csr_matrix type?

isspmatrix_bsr(x)

Is x of a bsr_matrix type?

isspmatrix_lil(x)

Is x of lil_matrix type?

isspmatrix_dok(x)

Is x of dok_array type?

isspmatrix_coo(x)

Is x of coo_matrix type?

isspmatrix_dia(x)

Is x of dia_matrix type?

Warnings#

Usage information#

There are seven available sparse array types:

  1. csc_array: Compressed Sparse Column format

  2. csr_array: Compressed Sparse Row format

  3. bsr_array: Block Sparse Row format

  4. lil_array: List of Lists format

  5. dok_array: Dictionary of Keys format

  6. coo_array: COOrdinate format (aka IJV, triplet format)

  7. dia_array: DIAgonal format

To construct an array efficiently, use any of coo_array, dok_array or lil_array. dok_array and lil_array support basic slicing and fancy indexing with a similar syntax to NumPy arrays. The COO format does not support indexing (yet) but can also be used to efficiently construct arrays using coord and value info.

Despite their similarity to NumPy arrays, it is strongly discouraged to use NumPy functions directly on these arrays because NumPy typically treats them as generic Python objects rather than arrays, leading to unexpected (and incorrect) results. If you do want to apply a NumPy function to these arrays, first check if SciPy has its own implementation for the given sparse array class, or convert the sparse array to a NumPy array (e.g., using the toarray method of the class) before applying the method.

All conversions among the CSR, CSC, and COO formats are efficient, linear-time operations.

To perform manipulations such as multiplication or inversion, first convert the array to either CSC or CSR format. The lil_array format is row-based, so conversion to CSR is efficient, whereas conversion to CSC is less so.

Matrix vector product#

To do a vector product between a 2D sparse array and a vector use the matmul operator (i.e., @) which performs a dot product (like the dot method):

>>> import numpy as np
>>> from scipy.sparse import csr_array
>>> A = csr_array([[1, 2, 0], [0, 0, 3], [4, 0, 5]])
>>> v = np.array([1, 0, -1])
>>> A @ v
array([ 1, -3, -1], dtype=int64)

The CSR format is especially suitable for fast matrix vector products.

Example 1#

Construct a 1000x1000 lil_array and add some values to it:

>>> from scipy.sparse import lil_array
>>> from scipy.sparse.linalg import spsolve
>>> from numpy.linalg import solve, norm
>>> from numpy.random import rand
>>> A = lil_array((1000, 1000))
>>> A[0, :100] = rand(100)
>>> A.setdiag(rand(1000))

Now convert it to CSR format and solve A x = b for x:

>>> A = A.tocsr()
>>> b = rand(1000)
>>> x = spsolve(A, b)

Convert it to a dense array and solve, and check that the result is the same:

>>> x_ = solve(A.toarray(), b)

Now we can compute norm of the error with:

>>> err = norm(x-x_)
>>> err < 1e-10
True

It should be small :)

Example 2#

Construct an array in COO format:

>>> from scipy import sparse
>>> from numpy import array
>>> I = array([0,3,1,0])
>>> J = array([0,3,1,2])
>>> V = array([4,5,7,9])
>>> A = sparse.coo_array((V,(I,J)),shape=(4,4))

Notice that the indices do not need to be sorted.

Duplicate (i,j) entries are summed when converting to CSR or CSC.

>>> I = array([0,0,1,3,1,0,0])
>>> J = array([0,2,1,3,1,0,0])
>>> V = array([1,1,1,1,1,1,1])
>>> B = sparse.coo_array((V,(I,J)),shape=(4,4)).tocsr()

This is useful for constructing finite-element stiffness and mass matrices.

Further details#

CSR column indices are not necessarily sorted. Likewise for CSC row indices. Use the .sorted_indices() and .sort_indices() methods when sorted indices are required (e.g., when passing data to other libraries).