scipy.sparse.random_array

scipy.sparse.random_array(shape, *, density=0.01, format='coo', dtype=None, rng=None, data_sampler=None)

Return a sparse array of uniformly random numbers in [0, 1)

Returns a sparse array with the given shape and density where values are generated uniformly randomly in the range [0, 1).

Parameters:
shape : int or tuple of ints

shape of the array

density : real, optional (default: 0.01)

Density of the generated array: a density of 1 means a full array, and a density of 0 means an array with no nonzero entries.

format : str, optional (default: 'coo')

Sparse format of the returned array.

dtype : dtype, optional (default: np.float64)

Data type of the returned array's values.

rng : {None, int, numpy.random.Generator}, optional

If rng is passed by keyword, types other than numpy.random.Generator are passed to numpy.random.default_rng to instantiate a Generator. If rng is already a Generator instance, then the provided instance is used. Specify rng for repeatable function behavior.

If this argument is passed by position or random_state is passed by keyword, legacy behavior for the argument random_state applies:

  • If random_state is None (or numpy.random), the numpy.random.RandomState singleton is used.

  • If random_state is an int, a new RandomState instance is used, seeded with random_state.

  • If random_state is already a Generator or RandomState instance then that instance is used.

Changed in version 1.15.0: As part of the SPEC-007 transition from use of numpy.random.RandomState to numpy.random.Generator, this keyword was changed from random_state to rng. For an interim period, both keywords will continue to work, although only one may be specified at a time. After the interim period, function calls using the random_state keyword will emit warnings. The behavior of both random_state and rng is outlined above, but only the rng keyword should be used in new code.

This random state will be used for sampling indices (the sparsity structure), and by default for the data values too (see data_sampler).

data_sampler : callable, optional (default depends on dtype)

Sampler of random data values. This function should accept a single keyword argument, size, specifying the length of the ndarray it returns. It is used to generate the nonzero values in the array after the locations of those values are chosen. By default, uniform [0, 1) random values are used unless dtype is an integer type (default: uniform integers from that dtype) or complex (default: uniform over the unit square in the complex plane). For these defaults, rng is used, e.g. rng.uniform(size=size).

Returns:
res : sparse array

Examples

Passing a np.random.Generator instance for better performance:

>>> import numpy as np
>>> import scipy as sp
>>> rng = np.random.default_rng()

Default sampling uniformly from [0, 1):

>>> S = sp.sparse.random_array((3, 4), density=0.25, rng=rng)

Providing a sampler for the values:

>>> rvs = sp.stats.poisson(25, loc=10).rvs
>>> S = sp.sparse.random_array((3, 4), density=0.25,
...                            rng=rng, data_sampler=rvs)
>>> S.toarray()
array([[ 36.,   0.,  33.,   0.],   # random
       [  0.,   0.,   0.,   0.],
       [  0.,   0.,  36.,   0.]])

Providing a sampler for uint values:

>>> def random_uint32_to_100(size=None):
...     return rng.integers(100, size=size, dtype=np.uint32)
>>> S = sp.sparse.random_array((3, 4), density=0.25, rng=rng,
...                            data_sampler=random_uint32_to_100)

Building a custom distribution. This example builds a squared normal from np.random:

>>> def np_normal_squared(size=None, rng=rng):
...     return rng.standard_normal(size) ** 2
>>> S = sp.sparse.random_array((3, 4), density=0.25, rng=rng,
...                            data_sampler=np_normal_squared)

Or we can build it from sp.stats style rvs functions:

>>> def sp_stats_normal_squared(size=None, rng=rng):
...     std_normal = sp.stats.distributions.norm_gen().rvs
...     return std_normal(size=size, random_state=rng) ** 2
>>> S = sp.sparse.random_array((3, 4), density=0.25, rng=rng,
...                            data_sampler=sp_stats_normal_squared)

Or we can subclass sp.stats rv_continuous or rv_discrete:

>>> class NormalSquared(sp.stats.rv_continuous):
...     def _rvs(self, size=None, random_state=rng):
...         return rng.standard_normal(size) ** 2
>>> X = NormalSquared()
>>> Y = X().rvs
>>> S = sp.sparse.random_array((3, 4), density=0.25,
...                            rng=rng, data_sampler=Y)