Using a function (R) to improve code for Fisher's exact test: answers

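【Question title】: Using a function (R) to improve code for Fisher's exact test
【Posted】: 2021-02-07 10:24:17
【Question】:

I'm trying to make my code more elegant. I want to run Fisher's exact test comparing 2 groups on 3 different criteria (so 3 tests in total). I have a solution, but it is cumbersome. I'm wondering whether there is a way to write a function that does the same thing...

My solution: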

library(dplyr)   # %>%, filter(), select()
library(tibble)  # column_to_rownames()

df <- data.frame(group = c("A", "B", "A", "B", "A", "B"),
                 criteria = c("fever", "fever", "headache", "headache", "chills", "chills"),
                 absent = c(35, 31, 78, 163, 53, 33),
                 present = c(62, 154, 19, 22, 44, 152))

Now run Fisher's exact test to compare groups A and B on fever, headache, and chills.

#Compare A & B on fever
fever <- df %>% filter(criteria=="fever") %>% select(-criteria)
fever <- column_to_rownames(fever, var = "group")
fisher.test(fever)

#Compare A & B on headache
headache <- df %>% filter(criteria=="headache") %>% select(-criteria)
headache <- column_to_rownames(headache, var = "group")
fisher.test(headache)

#Compare A & B on chills
chills <- df %>% filter(criteria=="chills") %>% select(-criteria)
chills <- column_to_rownames(chills, var = "group")
fisher.test(chills)

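I'd like to be able to print the Fisher's tests for all the different criteria (bear in mind that in reality I have more than 3) without having to type each one out separately. I imagine this can be done with a function, but I really don't know where to start...

Many thanks for your help... please go easy on me, I'm a clinician, not an informatician!

【Comments on the question】:

Tags: r

【Solution 1】: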

You can group_by criteria and apply fisher.test to each group.

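library(dplyr)

# Note: cur_data() is deprecated as of dplyr 1.1.0; pick(everything()) is its replacement there.
df %>%
  select(-group) %>%
  group_by(criteria) %>%
  summarise(fisher_test = list(fisher.test(cur_data()))) -> result

# The list elements follow the sorted order of criteria: chills, fever, headache
result$fisher_test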

#[[1]]

#   Fisher's Exact Test for Count Data

#data:  cur_data()
#p-value = 0.0000000004
#alternative hypothesis: true odds ratio is not equal to 1
#95 percent confidence interval:
# 3.09 9.97
#sample estimates:
#odds ratio 
#      5.51 


#[[2]]

#   Fisher's Exact Test for Count Data

#data:  cur_data()
#p-value = 0.0004
#alternative hypothesis: true odds ratio is not equal to 1
#95 percent confidence interval:
# 1.53 5.14
#sample estimates:
#odds ratio 
#      2.79 


#[[3]]

#   Fisher's Exact Test for Count Data

#data:  cur_data()
#p-value = 0.1
#alternative hypothesis: true odds ratio is not equal to 1
#95 percent confidence interval:
# 0.269 1.154
#sample estimates:
#odds ratio 
#     0.555
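
The list printed above is unnamed, so you have to remember that group_by sorts the criteria alphabetically ([[1]] is chills, [[2]] is fever, [[3]] is headache). As a rough sketch, not the only way to do it, you could attach the criteria as names. This variant builds each 2x2 matrix with cbind(absent, present) instead of cur_data(), which is my own substitution so that it does not depend on the dplyr version; named_tests is just an illustrative name.

```r
library(dplyr)

df <- data.frame(
  group    = c("A", "B", "A", "B", "A", "B"),
  criteria = c("fever", "fever", "headache", "headache", "chills", "chills"),
  absent   = c(35, 31, 78, 163, 53, 33),
  present  = c(62, 154, 19, 22, 44, 152)
)

# One test per criterion; cbind() turns the two count columns of each
# group (rows A and B) into the 2x2 matrix that fisher.test expects.
result <- df %>%
  group_by(criteria) %>%
  summarise(fisher_test = list(fisher.test(cbind(absent, present))))

# Label each test with its criterion (named_tests is a hypothetical name)
named_tests <- setNames(result$fisher_test, result$criteria)

# Individual tests can now be looked up by name
named_tests$fever
```

With names attached, printing named_tests shows $chills, $fever and $headache instead of [[1]], [[2]] and [[3]].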

In base R, you can use by, or split + lapply:

by(df[3:4], df$criteria, fisher.test)
#OR
lapply(split(df[3:4], df$criteria), fisher.test)
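
Since the question asks for a function and mentions having more than 3 criteria, here is a minimal sketch that builds on the split + lapply idea and collects the key numbers into one table instead of printing every test in full. fisher_summary and summary_table are hypothetical names of my own, and the choice of which fields to keep (odds ratio, confidence interval, p-value) is an assumption about what is useful.

```r
df <- data.frame(
  group    = c("A", "B", "A", "B", "A", "B"),
  criteria = c("fever", "fever", "headache", "headache", "chills", "chills"),
  absent   = c(35, 31, 78, 163, 53, 33),
  present  = c(62, 154, 19, 22, 44, 152)
)

# Hypothetical helper: run fisher.test on the 2x2 counts of one criterion
# and return the main results as a one-row data frame.
fisher_summary <- function(counts) {
  ft <- fisher.test(counts)
  data.frame(
    odds_ratio = unname(ft$estimate),
    conf_low   = ft$conf.int[1],
    conf_high  = ft$conf.int[2],
    p_value    = ft$p.value
  )
}

# Split the count columns by criterion, test each piece, then stack the rows.
# The list names (the criteria) become the row names of the final table.
tests_by_criteria <- lapply(split(df[c("absent", "present")], df$criteria),
                            fisher_summary)
summary_table <- do.call(rbind, tests_by_criteria)
summary_table
```

Adding more criteria to df needs no change to the code, because split creates one piece per distinct value of criteria.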

【Comments】:

Thanks! The base R solution works best, because it labels each result with its criterion.