scipy.stats.kendalltau

scipy.stats.kendalltau(x, y, initial_lexsort=None, nan_policy='propagate', method='auto', variant='b')

Calculate Kendall’s tau, a correlation measure for ordinal data.
Kendall’s tau is a measure of the correspondence between two rankings. Values close to 1 indicate strong agreement, and values close to -1 indicate strong disagreement. This implements two variants of Kendall’s tau: tau-b (the default) and tau-c (also known as Stuart’s tau-c). These differ only in how they are normalized to lie within the range -1 to 1; the hypothesis tests (their p-values) are identical. Kendall’s original tau-a is not implemented separately because both tau-b and tau-c reduce to tau-a in the absence of ties.
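As a rough illustration of the last point, the sketch below evaluates both variants on a small tie-free data set and compares them with tau-a computed directly from pair counts. The data and the variable names are made up for this example.

```python
from itertools import combinations

from scipy import stats

# Two tie-free rankings of the same five items (illustrative data).
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

tau_b, p_b = stats.kendalltau(x, y, variant='b')
tau_c, p_c = stats.kendalltau(x, y, variant='c')

# tau-a = (P - Q) / (n * (n - 1) / 2), counting concordant (P) and
# discordant (Q) pairs by brute force. No ties occur in this data.
concordant = discordant = 0
for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
    if (xi - xj) * (yi - yj) > 0:
        concordant += 1
    else:
        discordant += 1
n = len(x)
tau_a = (concordant - discordant) / (n * (n - 1) / 2)

print(tau_a, tau_b, tau_c)  # all three should agree up to rounding
print(p_b, p_c)             # the p-values of the two variants should match
```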
 Parameters
 x, y : array_like
Arrays of rankings, of the same shape. If arrays are not 1D, they will be flattened to 1D.
 initial_lexsort : bool, optional
Unused (deprecated).
 nan_policy : {‘propagate’, ‘raise’, ‘omit’}, optional
Defines how to handle input that contains nan. The following options are available (default is ‘propagate’):
‘propagate’: returns nan
‘raise’: raises an error
‘omit’: performs the calculations ignoring nan values
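The three policies can be compared on a small input containing one nan, as in the following sketch. The data is illustrative. The comparison against manually dropping the affected pair reflects how ‘omit’ is expected to behave for this input, and is not a statement about how the option is implemented.

```python
import math

from scipy import stats

# Illustrative data: the third observation of x is missing.
x = [1.0, 2.0, float('nan'), 4.0, 5.0]
y = [2.0, 1.0, 3.0, 5.0, 4.0]

# Default policy ('propagate'): the nan carries through to the result.
tau_prop, p_prop = stats.kendalltau(x, y)

# 'omit': the calculation ignores the nan value.
tau_omit, p_omit = stats.kendalltau(x, y, nan_policy='omit')

# For comparison, drop the pair containing nan by hand.
pairs = [(a, b) for a, b in zip(x, y) if not (math.isnan(a) or math.isnan(b))]
x_clean, y_clean = zip(*pairs)
tau_clean, p_clean = stats.kendalltau(x_clean, y_clean)

# 'raise': an error is raised instead of returning a result.
try:
    stats.kendalltau(x, y, nan_policy='raise')
    raised = False
except ValueError:
    raised = True

print(tau_prop, tau_omit, tau_clean, raised)
```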
 method : {‘auto’, ‘asymptotic’, ‘exact’}, optional
Defines which method is used to calculate the p-value [5]. The following options are available (default is ‘auto’):
‘auto’: selects the appropriate method based on a trade-off between speed and accuracy
‘asymptotic’: uses a normal approximation valid for large samples
‘exact’: computes the exact p-value, but is only available when no ties are present
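A minimal sketch of choosing the method explicitly, using made-up tie-free data so that ‘exact’ is available. The statistic does not depend on the method; only the p-value is computed differently, and for a sample this small the normal approximation may differ noticeably from the exact value.

```python
from scipy import stats

# Illustrative tie-free data (eight observations).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 3, 5, 4, 6, 8, 7]

tau_exact, p_exact = stats.kendalltau(x, y, method='exact')
tau_asym, p_asym = stats.kendalltau(x, y, method='asymptotic')

print(tau_exact, tau_asym)  # same statistic for both methods
print(p_exact, p_asym)      # exact p-value vs. normal approximation
```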
 variant : {‘b’, ‘c’}, optional
Defines which variant of Kendall’s tau is returned. Default is ‘b’.
 Returns
 correlation : float
The tau statistic.
 pvalue : float
The two-sided p-value for a hypothesis test whose null hypothesis is an absence of association, tau = 0.
See also
spearmanr
Calculates a Spearman rank-order correlation coefficient.
theilslopes
Computes the Theil-Sen estimator for a set of points (x, y).
weightedtau
Computes a weighted version of Kendall’s tau.
Notes
The definition of Kendall’s tau that is used is [2]:
    tau_b = (P - Q) / sqrt((P + Q + T) * (P + Q + U))
    tau_c = 2 (P - Q) / (n**2 * (m - 1) / m)
where P is the number of concordant pairs, Q the number of discordant pairs, T the number of ties only in x, and U the number of ties only in y. If a tie occurs for the same pair in both x and y, it is not added to either T or U. n is the total number of samples, and m is the number of unique values in either x or y, whichever is smaller.
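The definitions above can be checked with a brute-force sketch in pure Python. This is an O(n**2) illustration of the formulas only, not the algorithm SciPy uses, and the helper names `pair_counts`, `tau_b` and `tau_c` are hypothetical names introduced for this example.

```python
import math
from itertools import combinations


def pair_counts(x, y):
    """Return (P, Q, T, U) as defined in the Notes, by checking every pair."""
    P = Q = T = U = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        dx, dy = xi - xj, yi - yj
        if dx == 0 and dy == 0:
            continue      # tied in both x and y: counted in neither T nor U
        elif dx == 0:
            T += 1        # tie only in x
        elif dy == 0:
            U += 1        # tie only in y
        elif dx * dy > 0:
            P += 1        # concordant pair
        else:
            Q += 1        # discordant pair
    return P, Q, T, U


def tau_b(x, y):
    P, Q, T, U = pair_counts(x, y)
    return (P - Q) / math.sqrt((P + Q + T) * (P + Q + U))


def tau_c(x, y):
    P, Q, _, _ = pair_counts(x, y)
    n = len(x)
    m = min(len(set(x)), len(set(y)))  # smaller number of unique values
    return 2 * (P - Q) / (n**2 * (m - 1) / m)


x1 = [12, 2, 1, 12, 2]
x2 = [1, 4, 7, 1, 0]
print(pair_counts(x1, x2))  # (2, 6, 1, 0)
print(tau_b(x1, x2))        # about -0.4714
print(tau_c(x1, x2))        # about -0.48
```

For this data, P = 2, Q = 6, T = 1 and U = 0, so tau_b = -4 / sqrt(9 * 8), which is approximately -0.4714.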
References
[1] Maurice G. Kendall, “A New Measure of Rank Correlation”, Biometrika Vol. 30, No. 1/2, pp. 81-93, 1938.
[2] Maurice G. Kendall, “The treatment of ties in ranking problems”, Biometrika Vol. 33, No. 3, pp. 239-251, 1945.
[3] Gottfried E. Noether, “Elements of Nonparametric Statistics”, John Wiley & Sons, 1967.
[4] Peter M. Fenwick, “A new data structure for cumulative frequency tables”, Software: Practice and Experience, Vol. 24, No. 3, pp. 327-336, 1994.
[5] Maurice G. Kendall, “Rank Correlation Methods” (4th Edition), Charles Griffin & Co., 1970.
Examples
>>> from scipy import stats
>>> x1 = [12, 2, 1, 12, 2]
>>> x2 = [1, 4, 7, 1, 0]
>>> tau, p_value = stats.kendalltau(x1, x2)
>>> tau
-0.47140452079103173
>>> p_value
0.2827454599327748