scipy.stats.kruskal

scipy.stats.kruskal(*args, nan_policy='propagate')
Compute the Kruskal-Wallis H-test for independent samples.
The Kruskal-Wallis H-test tests the null hypothesis that the population medians of all of the groups are equal. It is a non-parametric version of ANOVA. The test works on 2 or more independent samples, which may have different sizes. Note that rejecting the null hypothesis does not indicate which of the groups differ. Post hoc comparisons between groups are required to determine which groups are different.
 Parameters
 sample1, sample2, … : array_like
Two or more arrays with the sample measurements can be given as arguments.
 nan_policy : {‘propagate’, ‘raise’, ‘omit’}, optional
Defines how to handle input that contains nan. The following options are available (default is ‘propagate’):
‘propagate’: returns nan
‘raise’: throws an error
‘omit’: performs the calculations ignoring nan values
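The three `nan_policy` options can be compared side by side. The following is a minimal sketch on made-up data (the values in `x` and `y` are illustrative assumptions, not taken from this page); it assumes that ‘raise’ signals the problem with a `ValueError`.

```python
import numpy as np
from scipy import stats

# Illustrative data: y contains one missing value
x = [1.0, 3.0, 5.0, 7.0, 9.0]
y = [2.0, 4.0, np.nan, 8.0, 10.0]

# 'propagate' (the default): the nan carries through to both outputs
res_propagate = stats.kruskal(x, y)

# 'omit': the nan is ignored, which should match dropping it by hand
res_omit = stats.kruskal(x, y, nan_policy='omit')
res_manual = stats.kruskal(x, [2.0, 4.0, 8.0, 10.0])

# 'raise': an error is raised instead of returning a result
try:
    stats.kruskal(x, y, nan_policy='raise')
    raised = False
except ValueError:
    raised = True
```

With ‘omit’, the group that contained the nan simply becomes one measurement shorter, so the groups may end up with different sizes.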
 Returns
 statistic : float
The Kruskal-Wallis H statistic, corrected for ties.
 pvalue : float
The p-value for the test using the assumption that H has a chi-square distribution. The p-value returned is the survival function of the chi-square distribution evaluated at H.
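To make the two return values concrete, the sketch below recomputes them by hand. It is not the library's implementation: it uses the textbook rank-sum formula for H, divides by the tie-correction factor from `scipy.stats.tiecorrect`, and assumes the chi-square distribution has k - 1 degrees of freedom for k groups.

```python
import numpy as np
from scipy import stats

x = [1, 3, 5, 7, 9]
y = [2, 4, 6, 8, 10]
samples = [np.asarray(s, dtype=float) for s in (x, y)]

# Rank all observations jointly; tied values receive their average rank
pooled = np.concatenate(samples)
ranks = stats.rankdata(pooled)
n_total = pooled.size

# Sum of ranks within each group
split_points = np.cumsum([s.size for s in samples])[:-1]
rank_sums = [r.sum() for r in np.split(ranks, split_points)]

# Textbook H statistic before tie correction
h = 12.0 / (n_total * (n_total + 1)) * sum(
    rs**2 / s.size for rs, s in zip(rank_sums, samples)
) - 3 * (n_total + 1)

# Tie correction (the factor is 1.0 when there are no ties)
h /= stats.tiecorrect(ranks)

# p-value: chi-square survival function at H, assuming k - 1 degrees of freedom
df = len(samples) - 1
p = stats.chi2.sf(h, df)

res = stats.kruskal(x, y)
```

Here `h` and `p` should agree with `res.statistic` and `res.pvalue` up to floating-point rounding.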
See also
f_oneway
1-way ANOVA.
mannwhitneyu
Mann-Whitney rank test on two samples.
friedmanchisquare
Friedman test for repeated measurements.
Notes
Due to the assumption that H has a chi-square distribution, the number of measurements in each group must not be too small. A typical rule is that each sample must have at least 5 measurements.
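The rule of thumb above is easy to check before running the test. The helper below is hypothetical (`undersized_groups` and its `min_size` parameter are not part of SciPy) and only reports which samples fall below the threshold.

```python
def undersized_groups(*samples, min_size=5):
    """Return the positions of samples with fewer than min_size measurements.

    Hypothetical helper for illustration; not a SciPy function.
    """
    return [i for i, s in enumerate(samples) if len(s) < min_size]


# Every group here has fewer than 5 measurements
small = undersized_groups([1, 1, 1], [2, 2, 2], [2, 2])  # [0, 1, 2]

# Both groups meet the rule of thumb
ok = undersized_groups([1, 3, 5, 7, 9], [2, 4, 6, 8, 10])  # []
```

A non-empty result suggests treating the chi-square p-value with caution.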
References
 [1]
W. H. Kruskal & W. W. Wallis, “Use of Ranks in One-Criterion Variance Analysis”, Journal of the American Statistical Association, Vol. 47, Issue 260, pp. 583-621, 1952.
 [2]
https://en.wikipedia.org/wiki/Kruskal-Wallis_one-way_analysis_of_variance
Examples
>>> from scipy import stats
>>> x = [1, 3, 5, 7, 9]
>>> y = [2, 4, 6, 8, 10]
>>> stats.kruskal(x, y)
KruskalResult(statistic=0.2727272727272734, pvalue=0.6015081344405895)

>>> x = [1, 1, 1]
>>> y = [2, 2, 2]
>>> z = [2, 2]
>>> stats.kruskal(x, y, z)
KruskalResult(statistic=7.0, pvalue=0.0301973834223185)
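Because a significant result does not say which groups differ, a post hoc step usually follows. The sketch below shows one possible approach, pairwise `mannwhitneyu` tests with a Bonferroni adjustment; this choice is an assumption made for illustration (other procedures, such as Dunn's test, are also common), and the group names and values are made up.

```python
from itertools import combinations

from scipy import stats

# Illustrative data for three independent groups (made-up values)
groups = {
    "a": [2.9, 3.0, 2.5, 2.6, 3.2],
    "b": [3.8, 2.7, 4.0, 2.4],
    "c": [2.8, 3.4, 3.7, 2.2, 2.0],
}

# Omnibus test across all groups
omnibus = stats.kruskal(*groups.values())

# Pairwise two-sided Mann-Whitney tests, Bonferroni-adjusted:
# multiply each p-value by the number of comparisons and cap it at 1
pairs = list(combinations(groups, 2))
adjusted = {}
for g1, g2 in pairs:
    res = stats.mannwhitneyu(groups[g1], groups[g2], alternative='two-sided')
    adjusted[(g1, g2)] = min(1.0, res.pvalue * len(pairs))
```

Each entry of `adjusted` can then be compared against the chosen significance level. In practice the pairwise step is typically run only when the omnibus test rejects the null hypothesis.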