Answers: difference between the Wilcoxon test in R and Python

【Question title】: difference between Wilcoxon test in R and Python
【Posted】: 2015-11-07 06:16:21
【Question】:

I am trying to run a Wilcoxon test both in R and with Python's scipy.stats package, but I get different results. Can anyone explain why?

My code in R:

> des2
 [1]  6.2151308  4.7956451  4.7473738  5.4695828  6.3181463  2.8617239
 [7] -0.8105824  3.9456856  4.6735000  4.1067193  5.7656002  2.2237666
[13]  1.0354143  4.9547707  5.3156348  4.8163154  3.4024776  4.2876854
[19]  6.1227500
> wilcox.test(des2, mu=0, conf.int = T)

    Wilcoxon signed rank test

data:  des2
V = 189, p-value = 7.629e-06
alternative hypothesis: true location is not equal to 0
95 percent confidence interval:
 3.485570 5.160925
sample estimates:
(pseudo)median 
      4.504883

My Python code:

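import numpy as np
from scipy.stats import wilcoxon
test = [6.2151308, 4.7956451,  4.7473738,  5.4695828,  6.3181463,  2.8617239, -0.8105824, 3.9456856,  4.6735000,  4.1067193, 5.7656002, 2.2237666, 1.0354143, 4.9547707, 5.3156348,  4.8163154,  3.4024776,  4.2876854,  6.1227500]
# wilcoxon returns the signed-rank statistic (not a z-score); np.log(1.0) == 0, so this tests against a location of 0
statistic, p_value = wilcoxon(np.array(test) - np.log(1.0))
print("one-sample wilcoxon-test", p_value)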


one-sample wilcoxon-test 0.000155095772796

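Both p-values are low enough to reject the null hypothesis, but they differ by more than an order of magnitude (7.629e-06 in R versus about 1.55e-04 in Python), and I don't understand why.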

【Comments】:

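scipy's documentation says: "Because the normal approximation is used for the calculations, the samples used should be large." The documentation for wilcox.test says: "By default (if exact is not specified), an exact p-value is computed if the samples contain less than 50 finite values and there are no ties. Otherwise, a normal approximation is used." I am not sure whether that is the only difference, though.
I am using the Wilcoxon test because I don't want a normal approximation... so I should use the R version, right?
R's test does not approximate the p-value here. This is crucial for small sample sizes.
@cel Your comment is probably the answer. scipy does not have an exact test (but there is some work on it in a pull request on GitHub).
@WarrenWeckesser, do you know the ticket number? It would be great to have a link here.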

Tags: python r scipy

【Solution 1】:

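scipy's implementation always uses a normal approximation to compute the p-value. This works well for large sample sizes n, but for small samples the approximate p-value can deviate noticeably from the true (exact) p-value.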

In the Notes section of scipy's docs you will find:

Because the normal approximation is used for the calculations, the samples used should be large. A typical rule is to require that n > 20.
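To make the approximation concrete, here is a minimal sketch that recomputes the approximate p-value by hand using only the standard library. It assumes the usual textbook formula without continuity correction and without tie handling (the data here has neither ties nor zeros); the function names are illustrative and are not part of scipy.

```python
import math

des2 = [6.2151308, 4.7956451, 4.7473738, 5.4695828, 6.3181463, 2.8617239,
        -0.8105824, 3.9456856, 4.6735000, 4.1067193, 5.7656002, 2.2237666,
        1.0354143, 4.9547707, 5.3156348, 4.8163154, 3.4024776, 4.2876854,
        6.1227500]


def signed_rank_sums(values):
    """Return (W+, W-, n): rank sums of the positive and negative values.

    Assumes no zeros and no tied absolute values, which holds for des2.
    """
    ordered = sorted(values, key=abs)  # rank 1 = smallest absolute value
    w_plus = sum(rank for rank, v in enumerate(ordered, start=1) if v > 0)
    w_minus = sum(rank for rank, v in enumerate(ordered, start=1) if v < 0)
    return w_plus, w_minus, len(ordered)


def normal_approx_p(values):
    """Two-sided p-value from the normal approximation (no continuity correction)."""
    w_plus, w_minus, n = signed_rank_sums(values)
    t = min(w_plus, w_minus)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (t - mean) / sd
    # 2 * Phi(-|z|), written with the complementary error function
    return math.erfc(abs(z) / math.sqrt(2))


print(signed_rank_sums(des2))
print(normal_approx_p(des2))
```

The positive rank sum is the V = 189 that R reports, and the approximate p-value is close to the 0.000155 printed by the Python code in the question.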

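R's implementation computes an exact p-value for small sample sizes and only falls back to the normal approximation for sufficiently large n.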

R's docs tell you:

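By default (if exact is not specified), an exact p-value is computed if the samples contain less than 50 finite values and there are no ties. Otherwise, a normal approximation is used.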
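The exact p-value can be reproduced with a short counting argument: under the null hypothesis each of the ranks 1..n is equally likely to carry a positive or a negative sign, so every one of the 2^n sign patterns has the same probability. The sketch below counts the patterns by dynamic programming. It is an illustration of the idea, not R's actual code, and `exact_two_sided_p` is a hypothetical helper name.

```python
def exact_two_sided_p(t, n):
    """Exact two-sided p-value for a signed-rank sum t with n untied, nonzero values."""
    max_sum = n * (n + 1) // 2
    # counts[s] = number of subsets of the ranks seen so far whose sum is s
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        # iterate downwards so each rank is used at most once per subset
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    # the null distribution is symmetric, so use the smaller tail and double it
    smaller = min(t, max_sum - t)
    lower_tail = sum(counts[: smaller + 1])
    return min(1.0, 2 * lower_tail / 2 ** n)


# V = 189 and n = 19 are taken from the R output in the question
print(exact_two_sided_p(189, 19))
```

For V = 189 and n = 19 only two sign patterns are at least as extreme in the lower tail (rank sums 0 and 1), which gives 2 * 2 / 2^19, in agreement with the 7.629e-06 that R reports.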

In short: when the two p-values differ, R's p-value should be preferred.

【Comments】: