【Posted】:2016-04-07 11:40:57
【Question】:
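There has to be an R-ly way to call wilcox.test over multiple observations in parallel using group_by. I have spent a good deal of time reading up on this, but I still can't work out a call to wilcox.test that does the job. Example data and code are below, using magrittr pipes and summarise().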
library(dplyr)
library(magrittr)
# create a data frame where x is the dependent variable, id1 is a category variable (here with five levels), and id2 is a binary category variable used for the two-sample wilcoxon test
df <- data.frame(x=abs(rnorm(50)),id1=rep(1:5,10), id2=rep(1:2,25))
# make sure piping and grouping are called correctly, with "sum" function as a well-behaving example function
df %>% group_by(id1) %>% summarise(s=sum(x))
df %>% group_by(id1,id2) %>% summarise(s=sum(x))
# make sure wilcox.test is called correctly
wilcox.test(x~id2, data=df, paired=FALSE)$p.value
# yet, cannot call wilcox.test within pipe with summarise (regardless of group_by). Expected output is five p-values (one for each level of id1)
df %>% group_by(id1) %>% summarise(w=wilcox.test(x~id2, data=., paired=FALSE)$p.value)
df %>% summarise(wilcox.test(x~id2, data=., paired=FALSE))
# even specifying formula argument by name doesn't help
df %>% group_by(id1) %>% summarise(w=wilcox.test(formula=x~id2, data=., paired=FALSE)$p.value)
The failing calls produce this error:
Error in wilcox.test.formula(c(1.09057358373486,
2.28465932554436, 0.885617572657959, : 'formula' missing or incorrect
Thanks for your help; I hope this will also be useful to others with a similar problem.
【Comments】:
- The other answers are more complete, but just to list all possible solutions:
df %>% group_by(id1) %>% summarise(w=wilcox.test(x[id2==1], x[id2==2], paired=FALSE)$p.value)
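For comparison, here is a minimal sketch (not taken from the thread) that keeps the formula interface by handing wilcox.test one data frame per level of id1. It relies only on base R's split() and sapply(), then checks the result against the vector-subsetting call above. The set.seed() call and the names p_split and p_dplyr are assumptions added for illustration. As far as I understand, `.` in a magrittr pipeline refers to the whole left-hand-side object rather than to the current group, which is why the per-group data frame is passed explicitly here.

```r
library(dplyr)  # dplyr re-exports the %>% pipe

set.seed(1)  # assumption: fixed seed so the example is reproducible
df <- data.frame(x = abs(rnorm(50)), id1 = rep(1:5, 10), id2 = rep(1:2, 25))

# Base R: split df into one data frame per id1 level, then run the
# formula interface of wilcox.test on each piece separately.
p_split <- sapply(split(df, df$id1), function(d) {
  wilcox.test(x ~ id2, data = d, paired = FALSE)$p.value
})

# dplyr: the two-vector interface from the comment above, evaluated per group.
p_dplyr <- df %>%
  group_by(id1) %>%
  summarise(w = wilcox.test(x[id2 == 1], x[id2 == 2], paired = FALSE)$p.value)

# Both approaches should give the same five p-values, one per id1 level.
all.equal(unname(p_split), p_dplyr$w)
```

p_split is a named numeric vector (names are the id1 levels), while p_dplyr is a tibble with columns id1 and w; which shape is more convenient depends on what happens to the p-values next.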
Tags: r dplyr magrittr group-summaries