Transformed Density Rejection (TDR)#

  • Required: T-concave PDF, dPDF

  • Optional: mode, center

  • Speed:

    • Set-up: slow

    • Sampling: fast

TDR is an acceptance/rejection method that uses the concavity of a transformed density to construct hat function and squeezes automatically. Such PDFs are called T-concave. Currently the following transformations are implemented:

\[\begin{split}c = 0.: T(x) &= \log(x)\\ c = -0.5: T(x) &= \frac{1}{\sqrt{x}} \text{ (Default)}\end{split}\]

In addition to the PDF, it also requires the derivative of the PDF w.r.t x (i.e. the variate). These functions must be present as methods of a python object which can then be passed to the generators to instantiate their object. The variant that is implemented uses squeezes proportional to hat function ([1]).

An example of using this method is shown below:

>>> from scipy.stats.sampling import TransformedDensityRejection
>>> from scipy.stats import norm
>>>
>>> class StandardNormal:
...     def pdf(self, x):
...         # note that the normalization constant is not required
...         return np.exp(-0.5 * x*x)
...     def dpdf(self, x):
...         return -x * np.exp(-0.5 * x*x)
...
>>> dist = StandardNormal()
>>>
>>> urng = np.random.default_rng()
>>> rng = TransformedDensityRejection(dist, random_state=urng)
>>> rng.rvs()
-1.526829048388144

In the above example, we have used the TDR method to sample from the standard normal distribution. Note that we can drop the normalization constant while computing the PDF. This usually helps speed up the sampling stage. Also, note that the PDF doesn’t need to be vectorized. It should accept and return a scalar.

It is also possible to evaluate the inverse of the CDF of the hat distribution directly using the ppf_hat method.

>>> rng.ppf_hat(0.5)
-0.00018050266342362759
>>> norm.ppf(0.5)
0.0
>>> u = np.linspace(0, 1, num=10)
>>> rng.ppf_hat(u)
array([       -inf, -1.22227372, -0.7656556 , -0.43135731, -0.14002921,
        0.13966423,  0.43096141,  0.76517113,  1.22185606,         inf])
>>> norm.ppf(u)
array([       -inf, -1.22064035, -0.76470967, -0.4307273 , -0.1397103 ,
        0.1397103 ,  0.4307273 ,  0.76470967,  1.22064035,         inf])

Apart from the PPF method, other attributes can be accessed to see how well the generator fits the given distribution. These are:

  • ‘squeeze_hat_ratio’: (area below squeeze) / (area below hat) for the generator. It is a number between 0 and 1. Closer to 1 means that the hat and the squeeze functions tightly envelop the distribution and fewer PDF evaluations are required to generate samples. The expected number of evaluations of the density is bounded by (1/squeeze_hat_ratio) - 1 per sample. By default, it is kept above 0.99 but that can be changed by passing a max_squeeze_hat_ratio parameter.

  • ‘hat_area’: area below the hat for the generator.

  • ‘squeeze_area’: area below the squeeze for the generator.

    >>> rng.squeeze_hat_ratio
    0.9947024204884917
    >>> rng.hat_area
    2.510253139791547
    >>> rng.squeeze_area
    2.4969548741894876
    >>> rng.squeeze_hat_ratio == rng.squeeze_area / rng.hat_area
    True
    

The distribution can be truncated by passing a domain parameter:

>>> urng = np.random.default_rng()
>>> rng = TransformedDensityRejection(dist, domain=[0, 1], random_state=urng)
>>> rng.rvs(10)
array([0.05452512, 0.97251362, 0.49955877, 0.82789729, 0.33048885,
       0.55558548, 0.23168323, 0.13423275, 0.73176575, 0.35739799])

If the domain is not specified, the support method of the dist object is used to determine the domain:

>>> class StandardNormal:
...     def pdf(self, x):
...         return np.exp(-0.5 * x*x)
...     def dpdf(self, x):
...         return -x * np.exp(-0.5 * x*x)
...     def support(self):
...         return -np.inf, np.inf
...
>>> dist = StandardNormal()
>>>
>>> urng = np.random.default_rng()
>>> rng = TransformedDensityRejection(dist, random_state=urng)
>>> rng.rvs(10)
array([-1.52682905,  2.06206883,  0.15205036,  1.11587367, -0.30775562,
       0.29879802, -0.61858268, -1.01049115,  0.78853694, -0.23060766])

If the dist object does not provide a support method, the domain is assumed to be (-np.inf, np.inf).

To increase squeeze_hat_ratio, pass max_squeeze_hat_ratio:

>>> dist = StandardNormal()
>>> rng = TransformedDensityRejection(dist, max_squeeze_hat_ratio=0.999,
...                                   random_state=urng)
>>> rng.squeeze_hat_ratio
0.999364900465214

Let’s see how this affects the callbacks to the PDF method of the distribution:

>>> from copy import copy
>>> class StandardNormal:
...     def __init__(self):
...         self.callbacks = 0
...     def pdf(self, x):
...         self.callbacks += 1
...         return np.exp(-0.5 * x*x)
...     def dpdf(self, x):
...         return -x * np.exp(-0.5 * x*x)
...
>>> dist1 = StandardNormal()
>>> urng1 = np.random.default_rng()
>>> urng2 = copy(urng1)
>>> rng1 = TransformedDensityRejection(dist1, random_state=urng1)
>>> dist1.callbacks  # evaluations during setup
139
>>> dist1.callbacks = 0  # don't consider evaluations during setup
>>> rvs = rng1.rvs(100000)
>>> dist1.callbacks  # evaluations during sampling
527
>>> dist2 = StandardNormal()
>>> # use the same stream of uniform random numbers
>>> rng2 = TransformedDensityRejection(dist2, max_squeeze_hat_ratio=0.999,
...                                    random_state=urng2)
>>> dist2.callbacks  # evaluations during setup
467
>>> dist2.callbacks = 0  # don't consider evaluations during setup
>>> rvs = rng2.rvs(100000)
>>> dist2.callbacks  # evaluations during sampling
84

As we can see, far fewer PDF evaluations are required during sampling when we increase the squeeze_hat_ratio. The PPF-hat function is also more accurate:

>>> abs(norm.ppf(0.975) - rng1.ppf_hat(0.975))
0.0027054565421578136
>>> abs(norm.ppf(0.975) - rng2.ppf_hat(0.975))
0.00047824084476300044

Though, notice that this comes at the cost of increased PDF evaluations during setup.

For densities with modes not close to 0, it is suggested to set either the mode or the center of the distribution by passing mode or center parameters. The latter is the approximate location of the mode or the mean of the distribution. This location provides some information about the main part of the PDF and is used to avoid numerical problems.

>>> # mode = 0 for our distribution
>>> # if exact mode is not available, pass 'center' parameter instead
>>> rng = TransformedDensityRejection(dist, mode=0.)

By default, the method uses 30 construction points to construct the hat. This can be changed by passing a construction_points parameter which can either be an array of construction points or an integer representing the number of construction points to use.

>>> rng = TransformedDensityRejection(dist,
...                                   construction_points=[-5, 0, 5])

This method accepts many other set-up parameters. See the documentation for an exclusive list. More information of the parameters and the method can be found in Section 5.3.16 of the UNU.RAN user manual.

Please see [1] and [2] for more details on this method.

References#