scipy.special.kolmogorov(y, out=None) = <ufunc 'kolmogorov'>

Complementary cumulative distribution function (survival function) of the Kolmogorov distribution.

Returns the complementary cumulative distribution function of Kolmogorov’s limiting distribution (the distribution of sqrt(n) * D_n as n goes to infinity) for a two-sided test of equality between an empirical and a theoretical distribution. It equals the limit, as n goes to infinity, of the probability that sqrt(n) * (maximum absolute deviation) > y.
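The limiting survival function has a well-known alternating series representation, 2 * sum over k >= 1 of (-1)**(k-1) * exp(-2 * k**2 * y**2). The sketch below sums a truncated version of that series and compares it with kolmogorov. The helper name kolmogorov_series and the choice of 100 terms are illustrative assumptions, not part of SciPy, and the series converges slowly as y approaches 0, so only moderate values of y are checked.

```python
import numpy as np
from scipy.special import kolmogorov


def kolmogorov_series(y, terms=100):
    """Truncated alternating series for the Kolmogorov survival function.

    Hypothetical helper for illustration only; not a SciPy function.
    """
    y = np.asarray(y, dtype=float)
    k = np.arange(1, terms + 1)
    signs = (-1.0) ** (k - 1)
    # Broadcast y against k so the last axis indexes the series terms.
    summands = signs * np.exp(-2.0 * k**2 * y[..., np.newaxis] ** 2)
    return 2.0 * summands.sum(axis=-1)


y = np.array([0.5, 1.0, 1.5, 2.0])
series = kolmogorov_series(y)
# True if the truncated series agrees with the library value at these points.
print(np.allclose(series, kolmogorov(y)))
```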

Parameters

y : float array_like

Absolute deviation between the Empirical CDF (ECDF) and the target CDF, multiplied by sqrt(n).

out : ndarray, optional

Optional output array for the function results

Returns

scalar or ndarray

The value(s) of kolmogorov(y)
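Because kolmogorov is a ufunc, it follows the usual NumPy ufunc conventions for array input and for the out argument. A minimal sketch, with arbitrarily chosen input values:

```python
import numpy as np
from scipy.special import kolmogorov

# Array input: the result has the same shape as y.
y = np.array([[0.0, 0.5], [1.0, 2.0]])

# Preallocated output buffer; the ufunc writes into it and returns it.
buf = np.empty_like(y)
result = kolmogorov(y, out=buf)
print(result is buf)

# Scalar input gives a scalar result.
scalar = kolmogorov(1.0)
print(np.ndim(scalar))
```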

See also


kolmogi

The Inverse Survival Function for the distribution

scipy.stats.kstwobign

Provides the functionality as a continuous distribution

smirnov, smirnovi

Functions for the one-sided distribution
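As a rough illustration of how the inverse survival function relates to kolmogorov, the sketch below round-trips a few values through scipy.special.kolmogi and looks up the asymptotic critical value for a 5% significance level. The input values are arbitrary choices made for the example.

```python
import numpy as np
from scipy.special import kolmogorov, kolmogi

y = np.array([0.5, 1.0, 1.5])
p = kolmogorov(y)      # survival probabilities
y_back = kolmogi(p)    # inverse survival function should recover y
print(np.allclose(y_back, y))

# Asymptotic critical value of sqrt(n)*Dn at the 5% level
# (commonly tabulated as roughly 1.36).
crit = kolmogi(0.05)
print(crit)
```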


Notes

kolmogorov is used by stats.kstest in the application of the Kolmogorov-Smirnov Goodness of Fit test. For historical reasons this function is exposed in scipy.special, but the recommended way to achieve the most accurate CDF/SF/PDF/PPF/ISF computations is to use the stats.kstwobign distribution.
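To make the relationship with stats.kstwobign concrete, the following sketch checks that the distribution's survival function agrees with kolmogorov on a small grid, and that its CDF is the complement. The grid of y values is an arbitrary choice for illustration.

```python
import numpy as np
from scipy.special import kolmogorov
from scipy.stats import kstwobign

y = np.linspace(0.3, 2.5, 12)

sf_special = kolmogorov(y)     # survival function from scipy.special
sf_stats = kstwobign.sf(y)     # same quantity via the distribution object
cdf_stats = kstwobign.cdf(y)   # complementary probability

agree = np.allclose(sf_special, sf_stats)
complement = np.allclose(sf_stats + cdf_stats, 1.0)
print(agree, complement)
```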


Examples

Show the probability of a gap at least as big as 0, 0.5 and 1.0.

>>> import numpy as np
>>> from scipy.special import kolmogorov
>>> from scipy.stats import kstwobign
>>> kolmogorov([0, 0.5, 1.0])
array([ 1.        ,  0.96394524,  0.26999967])

Compare a sample of size 1000 drawn from a Laplace(0, 1) distribution against the target distribution, a Normal(0, 1) distribution.

>>> from scipy.stats import norm, laplace
>>> rng = np.random.default_rng()
>>> n = 1000
>>> lap01 = laplace(0, 1)
>>> x = np.sort(lap01.rvs(n, random_state=rng))
>>> np.mean(x), np.std(x)
(-0.05841730131499543, 1.3968109101997568)

Construct the Empirical CDF and the K-S statistic Dn.

>>> target = norm(0,1)  # Normal mean 0, stddev 1
>>> cdfs = target.cdf(x)
>>> ecdfs = np.arange(n+1, dtype=float)/n
>>> gaps = np.column_stack([cdfs - ecdfs[:n], ecdfs[1:] - cdfs])
>>> Dn = np.max(gaps)
>>> Kn = np.sqrt(n) * Dn
>>> print('Dn=%f, sqrt(n)*Dn=%f' % (Dn, Kn))
Dn=0.043363, sqrt(n)*Dn=1.371265
>>> print(chr(10).join(['For a sample of size n drawn from a N(0, 1) distribution:',
...   ' the approximate Kolmogorov probability that sqrt(n)*Dn>=%f is %f' %  (Kn, kolmogorov(Kn)),
...   ' the approximate Kolmogorov probability that sqrt(n)*Dn<=%f is %f' %  (Kn, kstwobign.cdf(Kn))]))
For a sample of size n drawn from a N(0, 1) distribution:
 the approximate Kolmogorov probability that sqrt(n)*Dn>=1.371265 is 0.046533
 the approximate Kolmogorov probability that sqrt(n)*Dn<=1.371265 is 0.953467
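The same computation can be cross-checked against scipy.stats.kstest. The sketch below is self-contained and assumes a SciPy version in which kstest accepts the method keyword; with method='asymp' the reported two-sided p-value is expected to match kolmogorov(sqrt(n)*Dn). The fixed seed is an assumption added only to make the run reproducible.

```python
import numpy as np
from scipy.special import kolmogorov
from scipy.stats import kstest, laplace, norm

rng = np.random.default_rng(12345)  # seed chosen arbitrarily for reproducibility
n = 1000
x = np.sort(laplace(0, 1).rvs(n, random_state=rng))

# Manual two-sided K-S statistic against N(0, 1), as in the steps above.
cdfs = norm(0, 1).cdf(x)
ecdfs = np.arange(n + 1, dtype=float) / n
Dn = max(np.max(cdfs - ecdfs[:n]), np.max(ecdfs[1:] - cdfs))

# Library computation using the asymptotic (Kolmogorov) distribution.
res = kstest(x, 'norm', method='asymp')

print(np.isclose(res.statistic, Dn))
print(np.isclose(res.pvalue, kolmogorov(np.sqrt(n) * Dn)))
```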

Plot the Empirical CDF against the target N(0, 1) CDF.

>>> import matplotlib.pyplot as plt
>>> plt.step(np.concatenate([[-3], x]), ecdfs, where='post', label='Empirical CDF')
>>> x3 = np.linspace(-3, 3, 100)
>>> plt.plot(x3, target.cdf(x3), label='CDF for N(0, 1)')
>>> plt.ylim([0, 1]); plt.grid(True); plt.legend();
>>> # Add vertical lines marking Dn+ and Dn-
>>> iminus, iplus = np.argmax(gaps, axis=0)
>>> plt.vlines([x[iminus]], ecdfs[iminus], cdfs[iminus], color='r', linestyle='dashed', lw=4)
>>> plt.vlines([x[iplus]], cdfs[iplus], ecdfs[iplus+1], color='r', linestyle='dashed', lw=4)
>>> plt.show()