csr_matrix#
- class scipy.sparse.csr_matrix(arg1, shape=None, dtype=None, copy=False, *, maxprint=None)[source]#
Compressed Sparse Row matrix.
Warning
SciPy sparse is shifting from a sparse matrix interface to a sparse array interface. In the next few releases we expect to deprecate the sparse matrix interface. For documentation of the matrix interface, see the spmatrix interface docs. For guidance on converting existing code to sparse arrays, see Migration from spmatrix to sparray.
- This can be instantiated in several ways:
- csr_matrix(D)
where D is a 2-D ndarray
- csr_matrix(S)
with another sparse array or matrix S (equivalent to S.tocsr())
- csr_matrix((M, N), [dtype])
to construct an empty matrix with shape (M, N) dtype is optional, defaulting to dtype=’d’.
- csr_matrix((data, (row_ind, col_ind)), [shape=(M, N)])
where
data,row_indandcol_indsatisfy the relationshipa[row_ind[k], col_ind[k]] = data[k].- csr_matrix((data, indices, indptr), [shape=(M, N)])
is the standard CSR representation where the column indices for row i are stored in
indices[indptr[i]:indptr[i+1]]and their corresponding values are stored indata[indptr[i]:indptr[i+1]]. If the shape parameter is not supplied, the matrix dimensions are inferred from the index arrays.
- Attributes:
- datandarray
CSR format data array of the matrix
- indicesndarray
CSR format index array of the matrix
- indptrndarray
CSR format index pointer array of the matrix
has_sorted_indicesboolWhether the indices are sorted
has_canonical_formatboolWhether the array/matrix has sorted indices and no duplicates
- dtypedtype
Data type of the matrix
shape2-tupleShape of the matrix
- ndimint
Number of dimensions (this is always 2)
formatstrFormat string for matrix.
nnzintNumber of stored values, including explicit zeros.
sizeintNumber of stored values.
Tcsr_matrixTranspose.
mTcsr_matrixMatrix transpose.
Methods
__len__()__mul__(other)__pow__(power)arcsin()Element-wise arcsin.
arcsinh()Element-wise arcsinh.
arctan()Element-wise arctan.
arctanh()Element-wise arctanh.
argmax([axis, out, explicit])Return indices of maximum elements along an axis.
argmin([axis, out, explicit])Return indices of minimum elements along an axis.
asformat(format[, copy])Return this array/matrix in the passed format.
asfptype()Upcast matrix to a floating point format (if necessary)
astype(dtype[, casting, copy])Cast the array/matrix elements to a specified type.
ceil()Element-wise ceil.
check_format([full_check])Check whether the array/matrix respects the CSR or CSC format.
conj([copy])Element-wise complex conjugation.
conjugate([copy])Element-wise complex conjugation.
copy()Returns a copy of this array/matrix.
count_nonzero([axis])Number of non-zero entries.
deg2rad()Element-wise deg2rad.
diagonal([k])Returns the kth diagonal of the array/matrix.
dot(other)Ordinary dot product.
Remove zero entries from the array/matrix.
expm1()Element-wise expm1.
floor()Element-wise floor.
getH()Return the Hermitian transpose of this matrix.
Get the shape of the matrix
getcol(j)Returns a copy of column j of the matrix, as an (m x 1) sparse matrix (column vector).
Matrix storage format
Maximum number of elements to display when printed.
getnnz([axis])Number of stored values, including explicit zeros.
getrow(i)Returns a copy of row i of the matrix, as a (1 x n) sparse matrix (row vector).
log1p()Element-wise log1p.
max([axis, out, explicit])Return the maximum of the array/matrix or maximum along an axis.
maximum(other)Element-wise maximum between this and another array/matrix.
mean([axis, dtype, out])Compute the arithmetic mean along the specified axis.
min([axis, out, explicit])Return the minimum of the array/matrix or maximum along an axis.
minimum(other)Element-wise minimum between this and another array/matrix.
multiply(other)Element-wise multiplication by another array/matrix.
nanmax([axis, out, explicit])Return the maximum, ignoring any Nans, along an axis.
nanmin([axis, out, explicit])Return the minimum, ignoring any Nans, along an axis.
nonzero()Nonzero indices of the array/matrix.
power(n[, dtype])This function performs element-wise power.
prune()Remove empty space after all non-zero elements.
rad2deg()Element-wise rad2deg.
reshape(self, shape[, order, copy])Gives a new shape to a sparse array/matrix without changing its data.
resize(*shape)Resize the array/matrix in-place to dimensions given by
shape.rint()Element-wise rint.
set_shape(shape)Set the shape of the matrix in-place
setdiag(values[, k])Set diagonal or off-diagonal elements of the array/matrix.
sign()Element-wise sign.
sin()Element-wise sin.
sinh()Element-wise sinh.
Sort the indices of this array/matrix in place.
Return a copy of this array/matrix with sorted indices.
sqrt()Element-wise sqrt.
sum([axis, dtype, out])Sum the array/matrix elements over a given axis.
Eliminate duplicate entries by adding them together.
tan()Element-wise tan.
tanh()Element-wise tanh.
toarray([order, out])Return a dense ndarray representation of this sparse array/matrix.
tobsr([blocksize, copy])Convert this array/matrix to Block Sparse Row format.
tocoo([copy])Convert this array/matrix to COOrdinate format.
tocsc([copy])Convert this array/matrix to Compressed Sparse Column format.
tocsr([copy])Convert this array/matrix to Compressed Sparse Row format.
todense([order, out])Return a dense representation of this sparse matrix.
todia([copy])Convert this array/matrix to sparse DIAgonal format.
todok([copy])Convert this array/matrix to Dictionary Of Keys format.
tolil([copy])Convert this array/matrix to List of Lists format.
trace([offset])Returns the sum along diagonals of the sparse array/matrix.
transpose([axes, copy])Reverses the dimensions of the sparse array/matrix.
trunc()Element-wise trunc.
__add__
__getitem__
__matmul__
__rmatmul__
__rmul__
__truediv__
Notes
Sparse matrices can be used in arithmetic operations: they support addition, subtraction, multiplication, division, and matrix power.
- Advantages of the CSR format
efficient arithmetic operations CSR + CSR, CSR * CSR, etc.
efficient row slicing
fast matrix vector products
- Disadvantages of the CSR format
slow column slicing operations (consider CSC)
changes to the sparsity structure are expensive (consider LIL or DOK)
- Canonical Format
Within each row, indices are sorted by column.
There are no duplicate entries.
Examples
>>> import numpy as np >>> from scipy.sparse import csr_matrix >>> csr_matrix((3, 4), dtype=np.int8).toarray() array([[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]], dtype=int8)
>>> row = np.array([0, 0, 1, 2, 2, 2]) >>> col = np.array([0, 2, 2, 0, 1, 2]) >>> data = np.array([1, 2, 3, 4, 5, 6]) >>> csr_matrix((data, (row, col)), shape=(3, 3)).toarray() array([[1, 0, 2], [0, 0, 3], [4, 5, 6]])
>>> indptr = np.array([0, 2, 3, 6]) >>> indices = np.array([0, 2, 2, 0, 1, 2]) >>> data = np.array([1, 2, 3, 4, 5, 6]) >>> csr_matrix((data, indices, indptr), shape=(3, 3)).toarray() array([[1, 0, 2], [0, 0, 3], [4, 5, 6]])
Duplicate entries are summed together:
>>> row = np.array([0, 1, 2, 0]) >>> col = np.array([0, 1, 1, 0]) >>> data = np.array([1, 2, 4, 8]) >>> csr_matrix((data, (row, col)), shape=(3, 3)).toarray() array([[9, 0, 0], [0, 2, 0], [0, 4, 0]])
As an example of how to construct a CSR matrix incrementally, the following snippet builds a term-document matrix from texts:
>>> docs = [["hello", "world", "hello"], ["goodbye", "cruel", "world"]] >>> indptr = [0] >>> indices = [] >>> data = [] >>> vocabulary = {} >>> for d in docs: ... for term in d: ... index = vocabulary.setdefault(term, len(vocabulary)) ... indices.append(index) ... data.append(1) ... indptr.append(len(indices)) ... >>> csr_matrix((data, indices, indptr), dtype=int).toarray() array([[2, 1, 0, 0], [0, 1, 1, 1]])