# Statistical functions (scipy.stats)

This module contains a large number of probability distributions as well as a growing library of statistical functions.

Each univariate distribution is an instance of a subclass of rv_continuous (rv_discrete for discrete distributions):

| Class | Description |
| --- | --- |
| `rv_continuous([momtype, a, b, xtol, …])` | A generic continuous random variable class meant for subclassing. |
| `rv_discrete([a, b, name, badvalue, …])` | A generic discrete random variable class meant for subclassing. |
| `rv_histogram(histogram, *args, **kwargs)` | Generates a distribution given by a histogram. |
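As a minimal sketch of how these classes relate to the named distributions, the snippet below checks the class of two built-in distributions and builds an `rv_histogram` from a small made-up data set. The data values and variable names are illustrative assumptions, not part of the reference.

```python
import numpy as np
from scipy import stats

# Named distributions are ready-made instances of the generic base classes.
is_continuous = isinstance(stats.norm, stats.rv_continuous)
is_discrete = isinstance(stats.poisson, stats.rv_discrete)

# rv_histogram builds a distribution from the output of numpy.histogram.
# The data values are made up for illustration.
data = np.array([1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 4.0, 4.0, 5.0])
hist_dist = stats.rv_histogram(np.histogram(data, bins=4))

# The CDF reaches 1 at the upper edge of the last bin.
cdf_at_max = hist_dist.cdf(data.max())
```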

## Continuous distributions

| Distribution | Description |
| --- | --- |
| `alpha(*args, **kwds)` | An alpha continuous random variable. |
| `anglit(*args, **kwds)` | An anglit continuous random variable. |
| `arcsine(*args, **kwds)` | An arcsine continuous random variable. |
| `argus(*args, **kwds)` | Argus distribution. |
| `beta(*args, **kwds)` | A beta continuous random variable. |
| `betaprime(*args, **kwds)` | A beta prime continuous random variable. |
| `bradford(*args, **kwds)` | A Bradford continuous random variable. |
| `burr(*args, **kwds)` | A Burr (Type III) continuous random variable. |
| `burr12(*args, **kwds)` | A Burr (Type XII) continuous random variable. |
| `cauchy(*args, **kwds)` | A Cauchy continuous random variable. |
| `chi(*args, **kwds)` | A chi continuous random variable. |
| `chi2(*args, **kwds)` | A chi-squared continuous random variable. |
| `cosine(*args, **kwds)` | A cosine continuous random variable. |
| `crystalball(*args, **kwds)` | Crystalball distribution. |
| `dgamma(*args, **kwds)` | A double gamma continuous random variable. |
| `dweibull(*args, **kwds)` | A double Weibull continuous random variable. |
| `erlang(*args, **kwds)` | An Erlang continuous random variable. |
| `expon(*args, **kwds)` | An exponential continuous random variable. |
| `exponnorm(*args, **kwds)` | An exponentially modified Normal continuous random variable. |
| `exponweib(*args, **kwds)` | An exponentiated Weibull continuous random variable. |
| `exponpow(*args, **kwds)` | An exponential power continuous random variable. |
| `f(*args, **kwds)` | An F continuous random variable. |
| `fatiguelife(*args, **kwds)` | A fatigue-life (Birnbaum-Saunders) continuous random variable. |
| `fisk(*args, **kwds)` | A Fisk continuous random variable. |
| `foldcauchy(*args, **kwds)` | A folded Cauchy continuous random variable. |
| `foldnorm(*args, **kwds)` | A folded normal continuous random variable. |
| `frechet_r(*args, **kwds)` | A Frechet right (or Weibull minimum) continuous random variable. |
| `frechet_l(*args, **kwds)` | A Frechet left (or Weibull maximum) continuous random variable. |
| `genlogistic(*args, **kwds)` | A generalized logistic continuous random variable. |
| `gennorm(*args, **kwds)` | A generalized normal continuous random variable. |
| `genpareto(*args, **kwds)` | A generalized Pareto continuous random variable. |
| `genexpon(*args, **kwds)` | A generalized exponential continuous random variable. |
| `genextreme(*args, **kwds)` | A generalized extreme value continuous random variable. |
| `gausshyper(*args, **kwds)` | A Gauss hypergeometric continuous random variable. |
| `gamma(*args, **kwds)` | A gamma continuous random variable. |
| `gengamma(*args, **kwds)` | A generalized gamma continuous random variable. |
| `genhalflogistic(*args, **kwds)` | A generalized half-logistic continuous random variable. |
| `geninvgauss(*args, **kwds)` | A Generalized Inverse Gaussian continuous random variable. |
| `gilbrat(*args, **kwds)` | A Gilbrat continuous random variable. |
| `gompertz(*args, **kwds)` | A Gompertz (or truncated Gumbel) continuous random variable. |
| `gumbel_r(*args, **kwds)` | A right-skewed Gumbel continuous random variable. |
| `gumbel_l(*args, **kwds)` | A left-skewed Gumbel continuous random variable. |
| `halfcauchy(*args, **kwds)` | A Half-Cauchy continuous random variable. |
| `halflogistic(*args, **kwds)` | A half-logistic continuous random variable. |
| `halfnorm(*args, **kwds)` | A half-normal continuous random variable. |
| `halfgennorm(*args, **kwds)` | The upper half of a generalized normal continuous random variable. |
| `hypsecant(*args, **kwds)` | A hyperbolic secant continuous random variable. |
| `invgamma(*args, **kwds)` | An inverted gamma continuous random variable. |
| `invgauss(*args, **kwds)` | An inverse Gaussian continuous random variable. |
| `invweibull(*args, **kwds)` | An inverted Weibull continuous random variable. |
| `johnsonsb(*args, **kwds)` | A Johnson SB continuous random variable. |
| `johnsonsu(*args, **kwds)` | A Johnson SU continuous random variable. |
| `kappa4(*args, **kwds)` | Kappa 4 parameter distribution. |
| `kappa3(*args, **kwds)` | Kappa 3 parameter distribution. |
| `ksone(*args, **kwds)` | General Kolmogorov-Smirnov one-sided test. |
| `kstwobign(*args, **kwds)` | Kolmogorov-Smirnov two-sided test for large N. |
| `laplace(*args, **kwds)` | A Laplace continuous random variable. |
| `levy(*args, **kwds)` | A Levy continuous random variable. |
| `levy_l(*args, **kwds)` | A left-skewed Levy continuous random variable. |
| `levy_stable(*args, **kwds)` | A Levy-stable continuous random variable. |
| `logistic(*args, **kwds)` | A logistic (or Sech-squared) continuous random variable. |
| `loggamma(*args, **kwds)` | A log gamma continuous random variable. |
| `loglaplace(*args, **kwds)` | A log-Laplace continuous random variable. |
| `lognorm(*args, **kwds)` | A lognormal continuous random variable. |
| `loguniform(*args, **kwds)` | A loguniform or reciprocal continuous random variable. |
| `lomax(*args, **kwds)` | A Lomax (Pareto of the second kind) continuous random variable. |
| `maxwell(*args, **kwds)` | A Maxwell continuous random variable. |
| `mielke(*args, **kwds)` | A Mielke Beta-Kappa / Dagum continuous random variable. |
| `moyal(*args, **kwds)` | A Moyal continuous random variable. |
| `nakagami(*args, **kwds)` | A Nakagami continuous random variable. |
| `ncx2(*args, **kwds)` | A non-central chi-squared continuous random variable. |
| `ncf(*args, **kwds)` | A non-central F distribution continuous random variable. |
| `nct(*args, **kwds)` | A non-central Student’s t continuous random variable. |
| `norm(*args, **kwds)` | A normal continuous random variable. |
| `norminvgauss(*args, **kwds)` | A Normal Inverse Gaussian continuous random variable. |
| `pareto(*args, **kwds)` | A Pareto continuous random variable. |
| `pearson3(*args, **kwds)` | A Pearson type III continuous random variable. |
| `powerlaw(*args, **kwds)` | A power-function continuous random variable. |
| `powerlognorm(*args, **kwds)` | A power log-normal continuous random variable. |
| `powernorm(*args, **kwds)` | A power normal continuous random variable. |
| `rdist(*args, **kwds)` | An R-distributed (symmetric beta) continuous random variable. |
| `rayleigh(*args, **kwds)` | A Rayleigh continuous random variable. |
| `rice(*args, **kwds)` | A Rice continuous random variable. |
| `recipinvgauss(*args, **kwds)` | A reciprocal inverse Gaussian continuous random variable. |
| `semicircular(*args, **kwds)` | A semicircular continuous random variable. |
| `skewnorm(*args, **kwds)` | A skew-normal random variable. |
| `t(*args, **kwds)` | A Student’s t continuous random variable. |
| `trapz(*args, **kwds)` | A trapezoidal continuous random variable. |
| `triang(*args, **kwds)` | A triangular continuous random variable. |
| `truncexpon(*args, **kwds)` | A truncated exponential continuous random variable. |
| `truncnorm(*args, **kwds)` | A truncated normal continuous random variable. |
| `tukeylambda(*args, **kwds)` | A Tukey-Lambda continuous random variable. |
| `uniform(*args, **kwds)` | A uniform continuous random variable. |
| `vonmises(*args, **kwds)` | A Von Mises continuous random variable. |
| `vonmises_line(*args, **kwds)` | A Von Mises continuous random variable. |
| `wald(*args, **kwds)` | A Wald continuous random variable. |
| `weibull_min(*args, **kwds)` | Weibull minimum continuous random variable. |
| `weibull_max(*args, **kwds)` | Weibull maximum continuous random variable. |
| `wrapcauchy(*args, **kwds)` | A wrapped Cauchy continuous random variable. |
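The continuous distributions listed above share a common set of methods. The following is a minimal sketch, with arbitrarily chosen parameters, of evaluating moments, densities and cumulative probabilities, drawing a reproducible sample, and fitting parameters back from it.

```python
from scipy import stats

# Shape parameters come first; loc and scale shift and stretch the distribution.
gamma_dist = stats.gamma(a=2.0, scale=3.0)   # "frozen" gamma distribution
gamma_mean = gamma_dist.mean()               # a * scale
gamma_var = gamma_dist.var()                 # a * scale**2

# Density and cumulative probability of the standard normal at 0.
pdf_at_zero = stats.norm.pdf(0.0)
cdf_at_zero = stats.norm.cdf(0.0)

# Draw a reproducible sample, then estimate loc and scale by maximum likelihood.
sample = stats.norm.rvs(loc=5.0, scale=2.0, size=1000, random_state=0)
loc_hat, scale_hat = stats.norm.fit(sample)
```

The fitted `loc_hat` and `scale_hat` should land near the values used to generate the sample, though the exact figures depend on the random draw.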

## Multivariate distributions

| Distribution | Description |
| --- | --- |
| `multivariate_normal([mean, cov, …])` | A multivariate normal random variable. |
| `matrix_normal([mean, rowcov, colcov, seed])` | A matrix normal random variable. |
| `dirichlet(alpha[, seed])` | A Dirichlet random variable. |
| `wishart([df, scale, seed])` | A Wishart random variable. |
| `invwishart([df, scale, seed])` | An inverse Wishart random variable. |
| `multinomial(n, p[, seed])` | A multinomial random variable. |
| `special_ortho_group([dim, seed])` | A matrix-valued SO(N) random variable. |
| `ortho_group` | A matrix-valued O(N) random variable. |
| `unitary_group` | A matrix-valued U(N) random variable. |
| `random_correlation` | A random correlation matrix. |
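A short sketch of two of the multivariate distributions above, assuming a two-dimensional standard normal and an arbitrary Dirichlet concentration vector chosen only for illustration:

```python
import numpy as np
from scipy import stats

# Two-dimensional standard normal: zero mean, identity covariance.
mvn = stats.multivariate_normal(mean=[0.0, 0.0], cov=np.eye(2))
density_at_mean = mvn.pdf([0.0, 0.0])      # equals 1 / (2 * pi) here
draws = mvn.rvs(size=5, random_state=0)    # array with one row per draw

# Each Dirichlet draw is a vector of non-negative entries that sum to 1.
simplex_draw = stats.dirichlet.rvs(alpha=[1.0, 2.0, 3.0], size=1, random_state=0)
```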

## Discrete distributions

| Distribution | Description |
| --- | --- |
| `bernoulli(*args, **kwds)` | A Bernoulli discrete random variable. |
| `betabinom(*args, **kwds)` | A beta-binomial discrete random variable. |
| `binom(*args, **kwds)` | A binomial discrete random variable. |
| `boltzmann(*args, **kwds)` | A Boltzmann (Truncated Discrete Exponential) random variable. |
| `dlaplace(*args, **kwds)` | A Laplacian discrete random variable. |
| `geom(*args, **kwds)` | A geometric discrete random variable. |
| `hypergeom(*args, **kwds)` | A hypergeometric discrete random variable. |
| `logser(*args, **kwds)` | A Logarithmic (Log-Series, Series) discrete random variable. |
| `nbinom(*args, **kwds)` | A negative binomial discrete random variable. |
| `planck(*args, **kwds)` | A Planck discrete exponential random variable. |
| `poisson(*args, **kwds)` | A Poisson discrete random variable. |
| `randint(*args, **kwds)` | A uniform discrete random variable. |
| `skellam(*args, **kwds)` | A Skellam discrete random variable. |
| `zipf(*args, **kwds)` | A Zipf discrete random variable. |
| `yulesimon(*args, **kwds)` | A Yule-Simon discrete random variable. |
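Discrete distributions expose `pmf` in place of `pdf`. As a small sketch with arbitrarily chosen parameters:

```python
from scipy import stats

# Probability of exactly 3 successes in 10 fair trials: C(10, 3) / 2**10.
pmf_3_of_10 = stats.binom.pmf(3, 10, 0.5)
binom_mean = stats.binom.mean(10, 0.5)     # n * p

# Probability that a Poisson variable with rate 2 equals 0: exp(-2).
poisson_dist = stats.poisson(2.0)          # frozen distribution
p_zero = poisson_dist.cdf(0)
p_positive = poisson_dist.sf(0)            # survival function, 1 - cdf
```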

An overview of statistical functions is given below. Several of these functions have a similar version in scipy.stats.mstats that works on masked arrays.

## Summary statistics

| Function | Description |
| --- | --- |
| `describe(a[, axis, ddof, bias, nan_policy])` | Compute several descriptive statistics of the passed array. |
| `gmean(a[, axis, dtype])` | Compute the geometric mean along the specified axis. |
| `hmean(a[, axis, dtype])` | Calculate the harmonic mean along the specified axis. |
| `kurtosis(a[, axis, fisher, bias, nan_policy])` | Compute the kurtosis (Fisher or Pearson) of a dataset. |
| `mode(a[, axis, nan_policy])` | Return an array of the modal (most common) value in the passed array. |
| `moment(a[, moment, axis, nan_policy])` | Calculate the nth moment about the mean for a sample. |
| `skew(a[, axis, bias, nan_policy])` | Compute the sample skewness of a data set. |
| `kstat(data[, n])` | Return the nth k-statistic (1<=n<=4 so far). |
| `kstatvar(data[, n])` | Return an unbiased estimator of the variance of the k-statistic. |
| `tmean(a[, limits, inclusive, axis])` | Compute the trimmed mean. |
| `tvar(a[, limits, inclusive, axis, ddof])` | Compute the trimmed variance. |
| `tmin(a[, lowerlimit, axis, inclusive, …])` | Compute the trimmed minimum. |
| `tmax(a[, upperlimit, axis, inclusive, …])` | Compute the trimmed maximum. |
| `tstd(a[, limits, inclusive, axis, ddof])` | Compute the trimmed sample standard deviation. |
| `tsem(a[, limits, inclusive, axis, ddof])` | Compute the trimmed standard error of the mean. |
| `variation(a[, axis, nan_policy])` | Compute the coefficient of variation. |
| `find_repeats(arr)` | Find repeats and repeat counts. |
| `trim_mean(a, proportiontocut[, axis])` | Return mean of array after trimming distribution from both tails. |
| `gstd(a[, axis, ddof])` | Calculate the geometric standard deviation of an array. |
| `iqr(x[, axis, rng, scale, nan_policy, …])` | Compute the interquartile range of the data along the specified axis. |
| `sem(a[, axis, ddof, nan_policy])` | Compute standard error of the mean. |
| `bayes_mvs(data[, alpha])` | Bayesian confidence intervals for the mean, var, and std. |
| `mvsdist(data)` | ‘Frozen’ distributions for mean, variance, and standard deviation of data. |
| `entropy(pk[, qk, base, axis])` | Calculate the entropy of a distribution for given probability values. |
| `median_absolute_deviation(x[, axis, center, …])` | Compute the median absolute deviation of the data along the given axis. |
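A brief sketch of a few summary statistics on small made-up arrays (the input values are illustrative assumptions):

```python
import numpy as np
from scipy import stats

data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# describe bundles several statistics into one result object.
summary = stats.describe(data)
n_obs = summary.nobs
lo, hi = summary.minmax
sample_mean = summary.mean
sample_var = summary.variance          # unbiased (ddof=1) by default

# Geometric mean of 1, 2, 4 is the cube root of 8.
geo_mean = stats.gmean([1.0, 2.0, 4.0])

# Standard error of the mean: sample std (ddof=1) divided by sqrt(n).
std_err = stats.sem(data)
```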

## Frequency statistics

| Function | Description |
| --- | --- |
| `cumfreq(a[, numbins, defaultreallimits, weights])` | Return a cumulative frequency histogram, using the histogram function. |
| `itemfreq(*args, **kwds)` | itemfreq is deprecated and will be removed in a future version. |
| `percentileofscore(a, score[, kind])` | Compute the percentile rank of a score relative to a list of scores. |
| `scoreatpercentile(a, per[, limit, …])` | Calculate the score at a given percentile of the input sequence. |
| `relfreq(a[, numbins, defaultreallimits, weights])` | Return a relative frequency histogram, using the histogram function. |

| Function | Description |
| --- | --- |
| `binned_statistic(x, values[, statistic, …])` | Compute a binned statistic for one or more sets of data. |
| `binned_statistic_2d(x, y, values[, …])` | Compute a bidimensional binned statistic for one or more sets of data. |
| `binned_statistic_dd(sample, values[, …])` | Compute a multidimensional binned statistic for a set of data. |
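The following sketch, using made-up scores and values, shows how percentile ranks and binned statistics might be computed:

```python
import numpy as np
from scipy import stats

# Percentile rank of the score 3 among four scores (default kind="rank").
rank_of_3 = stats.percentileofscore([1, 2, 3, 4], 3)

# Score at the 50th percentile, i.e. the median.
median_score = stats.scoreatpercentile([1, 2, 3, 4, 5], 50)

# Mean of `values` within two equal-width bins spanning the range of x.
x = np.array([0.0, 1.0, 2.0, 3.0])
values = np.array([10.0, 20.0, 30.0, 40.0])
binned = stats.binned_statistic(x, values, statistic="mean", bins=2)
bin_means = binned.statistic
bin_edges = binned.bin_edges
```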

## Correlation functions

| Function | Description |
| --- | --- |
| `f_oneway(*args)` | Perform one-way ANOVA. |
| `pearsonr(x, y)` | Pearson correlation coefficient and p-value for testing non-correlation. |
| `spearmanr(a[, b, axis, nan_policy])` | Calculate a Spearman correlation coefficient with associated p-value. |
| `pointbiserialr(x, y)` | Calculate a point biserial correlation coefficient and its p-value. |
| `kendalltau(x, y[, initial_lexsort, …])` | Calculate Kendall’s tau, a correlation measure for ordinal data. |
| `weightedtau(x, y[, rank, weigher, additive])` | Compute a weighted version of Kendall’s $$\tau$$. |
| `linregress(x[, y])` | Calculate a linear least-squares regression for two sets of measurements. |
| `siegelslopes(y[, x, method])` | Computes the Siegel estimator for a set of points (x, y). |
| `theilslopes(y[, x, alpha])` | Computes the Theil-Sen estimator for a set of points (x, y). |
| `multiscale_graphcorr(x, y[, …])` | Computes the Multiscale Graph Correlation (MGC) test statistic. |
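As an illustrative sketch on synthetic, exactly linear data (an assumption made so the expected values are obvious), three of the functions above can be used as follows:

```python
import numpy as np
from scipy import stats

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0                      # exact linear relationship

# Pearson correlation and its p-value.
r, r_pvalue = stats.pearsonr(x, y)

# Least-squares line through the points.
fit = stats.linregress(x, y)
slope, intercept = fit.slope, fit.intercept

# Spearman correlation only depends on ranks, so any increasing map gives 1.
rho, rho_pvalue = stats.spearmanr(x, x ** 3)
```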

## Statistical tests

| Function | Description |
| --- | --- |
| `ttest_1samp(a, popmean[, axis, nan_policy])` | Calculate the T-test for the mean of ONE group of scores. |
| `ttest_ind(a, b[, axis, equal_var, nan_policy])` | Calculate the T-test for the means of two independent samples of scores. |
| `ttest_ind_from_stats(mean1, std1, nobs1, …)` | T-test for means of two independent samples from descriptive statistics. |
| `ttest_rel(a, b[, axis, nan_policy])` | Calculate the t-test on TWO RELATED samples of scores, a and b. |
| `kstest(rvs, cdf[, args, N, alternative, mode])` | Perform the Kolmogorov-Smirnov test for goodness of fit. |
| `chisquare(f_obs[, f_exp, ddof, axis])` | Calculate a one-way chi-square test. |
| `power_divergence(f_obs[, f_exp, ddof, axis, …])` | Cressie-Read power divergence statistic and goodness of fit test. |
| `ks_2samp(data1, data2[, alternative, mode])` | Compute the Kolmogorov-Smirnov statistic on 2 samples. |
| `epps_singleton_2samp(x, y[, t])` | Compute the Epps-Singleton (ES) test statistic. |
| `mannwhitneyu(x, y[, use_continuity, alternative])` | Compute the Mann-Whitney rank test on samples x and y. |
| `tiecorrect(rankvals)` | Tie correction factor for Mann-Whitney U and Kruskal-Wallis H tests. |
| `rankdata(a[, method])` | Assign ranks to data, dealing with ties appropriately. |
| `ranksums(x, y)` | Compute the Wilcoxon rank-sum statistic for two samples. |
| `wilcoxon(x[, y, zero_method, correction, …])` | Calculate the Wilcoxon signed-rank test. |
| `kruskal(*args, **kwargs)` | Compute the Kruskal-Wallis H-test for independent samples. |
| `friedmanchisquare(*args)` | Compute the Friedman test for repeated measurements. |
| `brunnermunzel(x, y[, alternative, …])` | Compute the Brunner-Munzel test on samples x and y. |
| `combine_pvalues(pvalues[, method, weights])` | Combine p-values from independent tests bearing upon the same hypothesis. |
| `jarque_bera(x)` | Perform the Jarque-Bera goodness of fit test on sample data. |

| Function | Description |
| --- | --- |
| `ansari(x, y)` | Perform the Ansari-Bradley test for equal scale parameters. |
| `bartlett(*args)` | Perform Bartlett’s test for equal variances. |
| `levene(*args, **kwds)` | Perform Levene test for equal variances. |
| `shapiro(x)` | Perform the Shapiro-Wilk test for normality. |
| `anderson(x[, dist])` | Anderson-Darling test for data coming from a particular distribution. |
| `anderson_ksamp(samples[, midrank])` | The Anderson-Darling test for k-samples. |
| `binom_test(x[, n, p, alternative])` | Perform a test that the probability of success is p. |
| `fligner(*args, **kwds)` | Perform Fligner-Killeen test for equality of variance. |
| `median_test(*args, **kwds)` | Perform a Mood’s median test. |
| `mood(x, y[, axis])` | Perform Mood’s test for equal scale parameters. |
| `skewtest(a[, axis, nan_policy])` | Test whether the skew is different from the normal distribution. |
| `kurtosistest(a[, axis, nan_policy])` | Test whether a dataset has normal kurtosis. |
| `normaltest(a[, axis, nan_policy])` | Test whether a sample differs from a normal distribution. |
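Most of the tests above return a statistic together with a p-value. The sketch below runs two of them on small made-up samples; the numbers are chosen so the statistics can be checked by hand.

```python
from scipy import stats

# Two independent samples with equal variance and means 3 and 4.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 3.0, 4.0, 5.0, 6.0]

# Pooled variance is 2.5, so the standard error is 1 and t = (3 - 4) / 1.
t_stat, t_pvalue = stats.ttest_ind(a, b)

# The two-sided p-value follows from Student's t with 5 + 5 - 2 = 8 d.o.f.
p_by_hand = 2.0 * stats.t.sf(abs(t_stat), df=8)

# One-way chi-square test against equal expected frequencies.
chi2_stat, chi2_pvalue = stats.chisquare([16, 18, 16, 14, 12, 12])
```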

## Transformations

| Function | Description |
| --- | --- |
| `boxcox(x[, lmbda, alpha])` | Return a dataset transformed by a Box-Cox power transformation. |
| `boxcox_normmax(x[, brack, method])` | Compute optimal Box-Cox transform parameter for input data. |
| `boxcox_llf(lmb, data)` | The boxcox log-likelihood function. |
| `yeojohnson(x[, lmbda])` | Return a dataset transformed by a Yeo-Johnson power transformation. |
| `yeojohnson_normmax(x[, brack])` | Compute optimal Yeo-Johnson transform parameter. |
| `yeojohnson_llf(lmb, data)` | The yeojohnson log-likelihood function. |
| `obrientransform(*args)` | Compute the O’Brien transform on input data (any number of arrays). |
| `sigmaclip(a[, low, high])` | Perform iterative sigma-clipping of array elements. |
| `trimboth(a, proportiontocut[, axis])` | Slice off a proportion of items from both ends of an array. |
| `trim1(a, proportiontocut[, tail, axis])` | Slice off a proportion from ONE end of the passed array distribution. |
| `zmap(scores, compare[, axis, ddof])` | Calculate the relative z-scores. |
| `zscore(a[, axis, ddof, nan_policy])` | Compute the z score. |
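A minimal sketch of two transformations on a made-up positive array. It relies on the Box-Cox transform with `lmbda=0` reducing to the natural logarithm:

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# z-scores use the population standard deviation (ddof=0) by default.
z = stats.zscore(x)

# With lmbda supplied, boxcox returns only the transformed array;
# lmbda=0 corresponds to log(x).
log_like = stats.boxcox(x, lmbda=0.0)
matches_log = bool(np.allclose(log_like, np.log(x)))
```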

## Statistical distances

| Function | Description |
| --- | --- |
| `wasserstein_distance(u_values, v_values[, …])` | Compute the first Wasserstein distance between two 1D distributions. |
| `energy_distance(u_values, v_values[, …])` | Compute the energy distance between two 1D distributions. |
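Both distances take the observed values of two empirical distributions. As a small sketch with made-up samples, shifting every value by 5 yields a Wasserstein distance of 5, and any sample is at distance 0 from itself:

```python
from scipy import stats

u = [0.0, 1.0, 3.0]
v = [5.0, 6.0, 8.0]        # u shifted right by 5

shift_distance = stats.wasserstein_distance(u, v)
self_distance = stats.energy_distance(u, u)
```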

## Random variate generation

| Function | Description |
| --- | --- |
| `rvs_ratio_uniforms(pdf, umax, vmin, vmax[, …])` | Generate random samples from a probability density function using the ratio-of-uniforms method. |

## Circular statistical functions

| Function | Description |
| --- | --- |
| `circmean(samples[, high, low, axis, nan_policy])` | Compute the circular mean for samples in a range. |
| `circvar(samples[, high, low, axis, nan_policy])` | Compute the circular variance for samples assumed to be in a range. |
| `circstd(samples[, high, low, axis, nan_policy])` | Compute the circular standard deviation for samples assumed to be in the range [low to high]. |
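Circular statistics matter when values wrap around, as angles do. In this sketch, two made-up compass bearings of 355 and 15 degrees lie 10 degrees either side of 5 degrees, which the circular mean recovers while the arithmetic mean does not:

```python
import numpy as np
from scipy import stats

bearings = np.array([355.0, 15.0])     # degrees, wrapping at 360

circular = stats.circmean(bearings, high=360.0, low=0.0)
arithmetic = float(np.mean(bearings))  # 185.0, pointing the opposite way
```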

## Contingency table functions

| Function | Description |
| --- | --- |
| `chi2_contingency(observed[, correction, lambda_])` | Chi-square test of independence of variables in a contingency table. |
| `contingency.expected_freq(observed)` | Compute the expected frequencies from a contingency table. |
| `contingency.margins(a)` | Return a list of the marginal sums of the array a. |
| `fisher_exact(table[, alternative])` | Perform a Fisher exact test on a 2x2 contingency table. |
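The sketch below applies these functions to two made-up 2x2 tables. The expected frequencies are the outer product of the row and column totals divided by the grand total, which is easy to verify by hand for the first table.

```python
import numpy as np
from scipy import stats

observed = np.array([[10, 20],
                     [30, 40]])

# Row totals 30 and 70, column totals 40 and 60, grand total 100.
expected = stats.contingency.expected_freq(observed)

# Test of independence; returns statistic, p-value, d.o.f. and expected table.
chi2, p_value, dof, expected_from_test = stats.chi2_contingency(observed)

# Fisher's exact test; the sample odds ratio is (8 * 5) / (2 * 1).
odds_ratio, fisher_p = stats.fisher_exact([[8, 2], [1, 5]])
```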

## Plot-tests

| Function | Description |
| --- | --- |
| `ppcc_max(x[, brack, dist])` | Calculate the shape parameter that maximizes the PPCC. |
| `ppcc_plot(x, a, b[, dist, plot, N])` | Calculate and optionally plot probability plot correlation coefficient. |
| `probplot(x[, sparams, dist, fit, plot, rvalue])` | Calculate quantiles for a probability plot, and optionally show the plot. |
| `boxcox_normplot(x, la, lb[, plot, N])` | Compute parameters for a Box-Cox normality plot, optionally show it. |
| `yeojohnson_normplot(x, la, lb[, plot, N])` | Compute parameters for a Yeo-Johnson normality plot, optionally show it. |
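These functions can be used without plotting. As a rough sketch, `probplot` called without a `plot` argument returns the theoretical and ordered sample quantiles plus a least-squares fit; the sample here is synthetic and seeded only for reproducibility.

```python
from scipy import stats

# Synthetic sample drawn from a standard normal distribution.
sample = stats.norm.rvs(size=200, random_state=0)

# osm: theoretical quantiles, osr: ordered sample values;
# slope, intercept and r describe the fitted straight line.
(osm, osr), (slope, intercept, r) = stats.probplot(sample, dist="norm")
```

For normally distributed data, `r` should be close to 1; passing a matplotlib axes or module as `plot` draws the figure instead of only returning the numbers.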

## Univariate and multivariate kernel density estimation

| Class | Description |
| --- | --- |
| `gaussian_kde(dataset[, bw_method, weights])` | Representation of a kernel-density estimate using Gaussian kernels. |
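A minimal sketch of a one-dimensional kernel density estimate on made-up data, using the default bandwidth:

```python
import numpy as np
from scipy import stats

data = np.array([-1.0, 0.0, 0.5, 1.0, 2.5])

# Build the estimate, then evaluate the density at chosen points.
kde = stats.gaussian_kde(data)
density = kde(np.array([0.0, 1.0]))

# The estimated density integrates to 1 over the whole real line.
total_mass = kde.integrate_box_1d(-np.inf, np.inf)
```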

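## Warnings used in scipy.stats

| Warning | Description |
| --- | --- |
| `PearsonRConstantInputWarning` | Warning generated by pearsonr when an input is constant. |
| `PearsonRNearConstantInputWarning` | Warning generated by pearsonr when an input is nearly constant. |
| `SpearmanRConstantInputWarning` | Warning generated by spearmanr when an input is constant. |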

For many more statistics-related functions, install the R software and the rpy interface package.