# scipy.stats.qmc.LatinHypercube

class scipy.stats.qmc.LatinHypercube(d, *, centered=False, strength=1, optimization=None, seed=None)

Latin hypercube sampling (LHS).

A Latin hypercube sample [1] generates $$n$$ points in $$[0,1)^{d}$$. Each univariate marginal distribution is stratified, placing exactly one point in $$[j/n, (j+1)/n)$$ for $$j=0,1,\dots,n-1$$. Latin hypercube samples remain applicable when $$n \ll d$$.
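The stratification property can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the seed and the sizes `n` and `d` are arbitrary illustrative choices, and `strata` and `one_per_stratum` are names introduced here.

```python
import numpy as np
from scipy.stats import qmc

n, d = 8, 3
sampler = qmc.LatinHypercube(d=d, seed=12345)  # seed value is arbitrary
sample = sampler.random(n=n)

# Index j of the stratum [j/n, (j+1)/n) that each coordinate falls into.
strata = np.floor(sample * n).astype(int)

# In every dimension, the n points should occupy the n strata exactly once,
# i.e. each column of `strata` is a permutation of 0, 1, ..., n-1.
one_per_stratum = all(
    np.array_equal(np.sort(strata[:, k]), np.arange(n)) for k in range(d)
)
print(one_per_stratum)
```

The check looks at one dimension at a time; a plain LHS makes no promise about how points are spread over pairs of dimensions.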

Parameters
d : int

Dimension of the parameter space.

centered : bool, optional

Center the point within the multi-dimensional grid. Default is False.

optimization : {None, “random-cd”}, optional

Whether to use an optimization scheme to construct a LHS. Default is None.

• random-cd: random permutations of coordinates to lower the centered discrepancy [5]. The best design found so far, as measured by the centered discrepancy, is kept and updated at each improvement. Designs based on the centered discrepancy show better space-filling robustness toward 2D and 3D subprojections than designs based on other discrepancy measures [6].

New in version 1.8.0.

strength : {1, 2}, optional

Strength of the LHS. strength=1 produces a plain LHS while strength=2 produces an orthogonal array based LHS of strength 2 [7], [8]. In that case, only n=p**2 points can be sampled, with p a prime number. It also constrains d <= p + 1. Default is 1.

New in version 1.8.0.
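The size constraints for strength=2 can be probed directly. This is a small sketch that assumes SciPy signals an invalid request with a `ValueError` at sampling time, which matches the releases this was written against; `try_sample` is a helper introduced here for illustration.

```python
from scipy.stats import qmc

def try_sample(d, n):
    """Return None on success, or the error message if (d, n) is rejected."""
    sampler = qmc.LatinHypercube(d=d, strength=2, seed=0)
    try:
        sampler.random(n=n)
    except ValueError as exc:  # assumed exception type, see lead-in
        return str(exc)
    return None

ok = try_sample(d=2, n=9)           # 9 = 3**2, p = 3 is prime, d <= p + 1
not_square = try_sample(d=2, n=10)  # 10 is not the square of a prime
too_wide = try_sample(d=5, n=9)     # d = 5 exceeds p + 1 = 4

for label, result in [("n=9, d=2", ok), ("n=10, d=2", not_square),
                      ("n=9, d=5", too_wide)]:
    print(label, "->", "accepted" if result is None else result)
```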

seed : {None, int, numpy.random.Generator}, optional

If seed is None, the numpy.random.Generator singleton is used. If seed is an int, a new Generator instance is used, seeded with seed. If seed is already a Generator instance, then that instance is used.
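As a brief illustration of the seeding options, the sketch below builds two samplers from the same integer seed and one from an existing Generator. The seed value 42 is arbitrary.

```python
import numpy as np
from scipy.stats import qmc

# Two engines created from the same int seed draw identical samples.
a = qmc.LatinHypercube(d=2, seed=42).random(n=5)
b = qmc.LatinHypercube(d=2, seed=42).random(n=5)
same_seed_matches = np.array_equal(a, b)
print(same_seed_matches)

# An existing Generator can be passed instead; the engine then draws from it,
# so consecutive engines sharing `rng` produce different samples.
rng = np.random.default_rng(42)
c = qmc.LatinHypercube(d=2, seed=rng).random(n=5)
e = qmc.LatinHypercube(d=2, seed=rng).random(n=5)
shared_rng_differs = not np.array_equal(c, e)
print(shared_rng_differs)
```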

Notes

When LHS is used to integrate a function $$f$$ with $$n$$ points, it is extremely effective on integrands that are nearly additive [2]. With a LHS of $$n$$ points, the variance of the integral estimate is never higher than that of plain MC on $$n-1$$ points [3]. There is a central limit theorem for LHS on the mean and variance of the integral [4], but not necessarily for optimized LHS due to the randomization.
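The effect on an additive integrand can be observed empirically. The following is a rough sketch, not a benchmark: the integrand `f`, the sizes, and the seed are illustrative choices introduced here, and the variances are estimated from repeated runs.

```python
import numpy as np
from scipy.stats import qmc

def f(x):
    # Additive integrand on [0, 1)^d; its exact integral is d / 2.
    return x.sum(axis=1)

n, d, repeats = 64, 4, 200
rng = np.random.default_rng(2024)  # arbitrary seed for repeatability

# Plain Monte Carlo: n independent uniform points per estimate.
mc_estimates = np.array(
    [f(rng.random((n, d))).mean() for _ in range(repeats)]
)

# LHS: a fresh Latin hypercube of n points per estimate.
lhs_estimates = np.array([
    f(qmc.LatinHypercube(d=d, seed=rng).random(n=n)).mean()
    for _ in range(repeats)
])

var_mc = mc_estimates.var(ddof=1)
var_lhs = lhs_estimates.var(ddof=1)
print(f"MC variance:  {var_mc:.2e}")
print(f"LHS variance: {var_lhs:.2e}")
```

For a purely additive function, each one-dimensional term is integrated over a stratified marginal, which is why the LHS estimates scatter far less than the MC ones.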

$$A$$ is called an orthogonal array of strength $$t$$ if, in each n-row-by-t-column submatrix of $$A$$, all $$p^t$$ possible distinct rows occur the same number of times. The elements of $$A$$ are in the set $$\{0, 1, \dots, p-1\}$$, also called symbols. The constraint that $$p$$ must be a prime number is to allow modular arithmetic.
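To make the definition concrete, here is a sketch of one textbook way to build a strength 2 orthogonal array with $$p^2$$ rows and $$p+1$$ columns using arithmetic modulo $$p$$. It is offered as an illustration under the assumption that $$p$$ is prime and is not claimed to be SciPy's internal implementation; `bose_oa` and `has_strength_2` are names introduced here.

```python
import itertools
import numpy as np

def bose_oa(p):
    """Array with p**2 rows and p + 1 columns over symbols 0..p-1."""
    i, j = np.divmod(np.arange(p**2), p)
    # Columns: i, j, and (i + m*j) mod p for m = 1..p-1.
    cols = [i, j] + [(i + m * j) % p for m in range(1, p)]
    return np.column_stack(cols)

def has_strength_2(A, p):
    """True if every pair of columns shows all p**2 symbol pairs equally often."""
    for a, b in itertools.combinations(range(A.shape[1]), 2):
        counts = np.bincount(A[:, a] * p + A[:, b], minlength=p**2)
        if not np.all(counts == counts[0]):
            return False
    return True

print(has_strength_2(bose_oa(3), 3))  # p prime
print(has_strength_2(bose_oa(4), 4))  # p not prime
```

With a prime $$p$$, every nonzero multiplier is invertible modulo $$p$$, so any two columns determine the row uniquely. With $$p=4$$ the multiplier 2 has no inverse, and the pair of columns $$i$$ and $$(i+2j) \bmod 4$$ repeats some symbol pairs while missing others.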

Strength 1 (plain LHS) brings an advantage over strength 0 (MC), and strength 2 is a useful increment over strength 1. Going to strength 3 is a smaller increment, and scrambled QMC methods such as Sobol’ and Halton are more performant [7].

To create a LHS of strength 2, the orthogonal array $$A$$ is randomized by applying a random, bijective map of the set of symbols onto itself. For example, in column 0, all 0s might become 2; in column 1, all 0s might become 1, etc. Then, for each column $$i$$ and symbol $$j$$, we add a plain, one-dimensional LHS of size $$p$$ to the subarray where $$A^i = j$$. The resulting matrix is finally divided by $$p$$.
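Both layers of this construction should be visible in the output: at the coarse resolution of $$p$$ bins per axis the sample behaves like the orthogonal array, and at the fine resolution of $$n=p^2$$ strata per axis it is still a Latin hypercube. The sketch below checks this on one sample; `p`, `d`, and the seed are arbitrary illustrative choices.

```python
import itertools
import numpy as np
from scipy.stats import qmc

p, d = 5, 3
n = p**2
sample = qmc.LatinHypercube(d=d, strength=2, seed=7).random(n=n)

# Coarse view: which of the p bins per axis each point falls into.
coarse = np.floor(sample * p).astype(int)
# Fine view: which of the n strata per axis each point falls into.
fine = np.floor(sample * n).astype(int)

# Strength 2: every pair of axes shows each of the p*p bin pairs exactly once.
pairs_covered_once = all(
    np.unique(coarse[:, a] * p + coarse[:, b]).size == n
    for a, b in itertools.combinations(range(d), 2)
)

# Latin hypercube: each axis has exactly one point per fine stratum.
one_per_stratum = all(
    np.array_equal(np.sort(fine[:, k]), np.arange(n)) for k in range(d)
)
print(pairs_covered_once, one_per_stratum)
```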

References

[1] McKay et al., “A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output from a Computer Code.” Technometrics, 1979.

[2] M. Stein, “Large sample properties of simulations using Latin hypercube sampling.” Technometrics 29, no. 2: 143-151, 1987.

[3] A. B. Owen, “Monte Carlo variance of scrambled net quadrature.” SIAM Journal on Numerical Analysis 34, no. 5: 1884-1910, 1997.

[4] W.-L. Loh, “On Latin hypercube sampling.” The Annals of Statistics 24, no. 5: 2058-2080, 1996.

[5] Fang et al., “Design and modeling for computer experiments.” Computer Science and Data Analysis Series, 2006.

[6] Damblin et al., “Numerical studies of space filling designs: optimization of Latin Hypercube Samples and subprojection properties.” Journal of Simulation, 2013.

[7] A. B. Owen, “Orthogonal arrays for computer experiments, integration and visualization.” Statistica Sinica, 1992.

[8] B. Tang, “Orthogonal Array-Based Latin Hypercubes.” Journal of the American Statistical Association, 1993.

Examples

Generate samples from a Latin hypercube generator.

>>> from scipy.stats import qmc
>>> sampler = qmc.LatinHypercube(d=2)
>>> sample = sampler.random(n=5)
>>> sample
array([[0.1545328 , 0.53664833],  # random
       [0.84052691, 0.06474907],
       [0.52177809, 0.93343721],
       [0.68033825, 0.36265316],
       [0.26544879, 0.61163943]])


Compute the quality of the sample using the discrepancy criterion.

>>> qmc.discrepancy(sample)
0.0196...  # random


Samples can be scaled to bounds.

>>> l_bounds = [0, 2]
>>> u_bounds = [10, 5]
>>> qmc.scale(sample, l_bounds, u_bounds)
array([[1.54532796, 3.609945  ],  # random
       [8.40526909, 2.1942472 ],
       [5.2177809 , 4.80031164],
       [6.80338249, 3.08795949],
       [2.65448791, 3.83491828]])


Use the optimization keyword argument to produce a LHS with lower discrepancy at higher computational cost.

>>> sampler = qmc.LatinHypercube(d=2, optimization="random-cd")
>>> sample = sampler.random(n=5)
>>> qmc.discrepancy(sample)
0.0176...  # random


Use the strength keyword argument to produce an orthogonal array based LHS of strength 2. In this case, the number of sample points must be the square of a prime number.

>>> sampler = qmc.LatinHypercube(d=2, strength=2)
>>> sample = sampler.random(n=9)
>>> qmc.discrepancy(sample)
0.00526...  # random


Options can be combined to produce an optimized, centered, orthogonal array based LHS. After optimization, the result is not guaranteed to be of strength 2.

Methods

fast_forward(n): Fast-forward the sequence by n positions.

integers(l_bounds, *[, u_bounds, n, ...]): Draw n integers from l_bounds (inclusive) to u_bounds (exclusive), or if endpoint=True, l_bounds (inclusive) to u_bounds (inclusive).

random([n]): Draw n in the half-open interval [0, 1).

reset(): Reset the engine to base state.