Sobol#
- class scipy.stats.qmc.Sobol(d, *, scramble=True, bits=None, seed=None, optimization=None)[source]#
Engine for generating (scrambled) Sobol’ sequences.
Sobol’ sequences are low-discrepancy, quasi-random numbers. Points can be drawn using two methods:
random_base2
: safely draw \(n=2^m\) points. This method guarantees the balance properties of the sequence.random
: draw an arbitrary number of points from the sequence. See warning below.
- Parameters:
- dint
Dimensionality of the sequence. Max dimensionality is 21201.
- scramblebool, optional
If True, use LMS+shift scrambling. Otherwise, no scrambling is done. Default is True.
- bitsint, optional
Number of bits of the generator. Control the maximum number of points that can be generated, which is
2**bits
. Maximal value is 64. It does not correspond to the return type, which is alwaysnp.float64
to prevent points from repeating themselves. Default is None, which for backward compatibility, corresponds to 30.Added in version 1.9.0.
- optimization{None, “random-cd”, “lloyd”}, optional
Whether to use an optimization scheme to improve the quality after sampling. Note that this is a post-processing step that does not guarantee that all properties of the sample will be conserved. Default is None.
random-cd
: random permutations of coordinates to lower the centered discrepancy. The best sample based on the centered discrepancy is constantly updated. Centered discrepancy-based sampling shows better space-filling robustness toward 2D and 3D subprojections compared to using other discrepancy measures.lloyd
: Perturb samples using a modified Lloyd-Max algorithm. The process converges to equally spaced samples.
Added in version 1.10.0.
- seed{None, int,
numpy.random.Generator
}, optional If seed is an int or None, a new
numpy.random.Generator
is created usingnp.random.default_rng(seed)
. If seed is already aGenerator
instance, then the provided instance is used.
Notes
Sobol’ sequences [1] provide \(n=2^m\) low discrepancy points in \([0,1)^{d}\). Scrambling them [3] makes them suitable for singular integrands, provides a means of error estimation, and can improve their rate of convergence. The scrambling strategy which is implemented is a (left) linear matrix scramble (LMS) followed by a digital random shift (LMS+shift) [2].
There are many versions of Sobol’ sequences depending on their ‘direction numbers’. This code uses direction numbers from [4]. Hence, the maximum number of dimension is 21201. The direction numbers have been precomputed with search criterion 6 and can be retrieved at https://web.maths.unsw.edu.au/~fkuo/sobol/.
Warning
Sobol’ sequences are a quadrature rule and they lose their balance properties if one uses a sample size that is not a power of 2, or skips the first point, or thins the sequence [5].
If \(n=2^m\) points are not enough then one should take \(2^M\) points for \(M>m\). When scrambling, the number R of independent replicates does not have to be a power of 2.
Sobol’ sequences are generated to some number \(B\) of bits. After \(2^B\) points have been generated, the sequence would repeat. Hence, an error is raised. The number of bits can be controlled with the parameter bits.
References
[1]I. M. Sobol’, “The distribution of points in a cube and the accurate evaluation of integrals.” Zh. Vychisl. Mat. i Mat. Phys., 7:784-802, 1967.
[2]J. Matousek, “On the L2-discrepancy for anchored boxes.” J. of Complexity 14, 527-556, 1998.
[3]Art B. Owen, “Scrambling Sobol and Niederreiter-Xing points.” Journal of Complexity, 14(4):466-489, December 1998.
[4]S. Joe and F. Y. Kuo, “Constructing sobol sequences with better two-dimensional projections.” SIAM Journal on Scientific Computing, 30(5):2635-2654, 2008.
[5]Art B. Owen, “On dropping the first Sobol’ point.” arXiv:2008.08051, 2020.
Examples
Generate samples from a low discrepancy sequence of Sobol’.
>>> from scipy.stats import qmc >>> sampler = qmc.Sobol(d=2, scramble=False) >>> sample = sampler.random_base2(m=3) >>> sample array([[0. , 0. ], [0.5 , 0.5 ], [0.75 , 0.25 ], [0.25 , 0.75 ], [0.375, 0.375], [0.875, 0.875], [0.625, 0.125], [0.125, 0.625]])
Compute the quality of the sample using the discrepancy criterion.
>>> qmc.discrepancy(sample) 0.013882107204860938
To continue an existing design, extra points can be obtained by calling again
random_base2
. Alternatively, you can skip some points like:>>> _ = sampler.reset() >>> _ = sampler.fast_forward(4) >>> sample_continued = sampler.random_base2(m=2) >>> sample_continued array([[0.375, 0.375], [0.875, 0.875], [0.625, 0.125], [0.125, 0.625]])
Finally, samples can be scaled to bounds.
>>> l_bounds = [0, 2] >>> u_bounds = [10, 5] >>> qmc.scale(sample_continued, l_bounds, u_bounds) array([[3.75 , 3.125], [8.75 , 4.625], [6.25 , 2.375], [1.25 , 3.875]])
Methods
fast_forward
(n)Fast-forward the sequence by n positions.
integers
(l_bounds, *[, u_bounds, n, ...])Draw n integers from l_bounds (inclusive) to u_bounds (exclusive), or if endpoint=True, l_bounds (inclusive) to u_bounds (inclusive).
random
([n, workers])Draw n in the half-open interval
[0, 1)
.random_base2
(m)Draw point(s) from the Sobol' sequence.
reset
()Reset the engine to base state.