For Loop t.test, Comparing Means by Factor Class in R: Answers

【Question title】: For Loop t.test, Comparing Means by Factor Class in R
【Posted】: 2019-01-12 07:53:06
【Question】:

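I want to loop over a large number of one-sided t-tests that compare mean crop yield by pattern for a set of different crops.

My data is structured like this: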


df <- data.frame("crop" = rep(c('Beans', 'Corn', 'Potatoes'), 10),
                 "value" = rnorm(n = 30),
                 "pattern" = rep(c("mono", "inter"), 15),
                 stringsAsFactors = TRUE)

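I would like the output to give the t.test results comparing the mean yield of each crop by pattern (i.e. monocropped potato yield versus intercropped potato yield), where the alternative hypothesis is that the intercropped pattern has the greater value.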

Help!

【Question discussion】:

Tags: r for-loop t-test

【Solution 1】:

Here is an example using base R.

# Generate example data
df <- data.frame("crop" = rep(c('Beans', 'Corn', 'Potatoes'), 10),
                 "value" = rnorm(n = 30),
                 "pattern" = rep(c("inter", "mono"), 15),
                 stringsAsFactors = TRUE)

# Create a named list which will hold the output of the test for each crop.
# as.character() makes the crop names plain strings rather than a factor.
crops <- as.character(unique(df$crop))
test_output <- vector('list', length = length(crops))
names(test_output) <- crops

# For each crop, save the output of a one-sided t-test
for (crop in crops) {
  # Filter the data to include only observations for the particular crop
  crop_data <- df[df$crop == crop, ]
  # Save the results of a t-test with a one-sided alternative
  test_output[[crop]] <- t.test(formula = value ~ pattern,
                                data = crop_data,
                                alternative = 'greater')
}

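Note that when you call t.test through the formula interface (e.g. y ~ x) and your independent variable is a factor, setting alternative = 'greater' tests whether the mean of the lower factor level (for your data, "inter") is greater than the mean of the higher factor level (here, "mono"). Factor levels are sorted alphabetically by default, which is why "inter" comes before "mono".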
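To make the direction of the one-sided test concrete, here is a minimal sketch on made-up data. The toy values and the names toy, pattern_rev, res_inter_first and res_mono_first are assumptions for illustration only, not part of the original answer. It checks the level order and shows that moving "mono" to the first level with relevel reverses the hypothesis being tested.

```r
# Hypothetical toy data: "inter" yields are clearly higher than "mono" yields
toy <- data.frame(
  value   = c(5.1, 5.4, 4.9, 5.6, 5.2, 3.0, 3.3, 2.8, 3.1, 2.9),
  pattern = factor(rep(c("inter", "mono"), each = 5))
)

# Levels are sorted alphabetically by default, so "inter" is the first level
levels(toy$pattern)

# With "inter" first, 'greater' tests mean(inter) > mean(mono)
res_inter_first <- t.test(value ~ pattern, data = toy,
                          alternative = "greater")

# Making "mono" the first level flips the test to mean(mono) > mean(inter)
toy$pattern_rev <- relevel(toy$pattern, ref = "mono")
res_mono_first <- t.test(value ~ pattern_rev, data = toy,
                         alternative = "greater")

res_inter_first$p.value  # small: the data support inter > mono
res_mono_first$p.value   # close to 1: the data do not support mono > inter
```

If the hypothesis you care about runs the other way, releveling the factor (or using alternative = 'less') is usually easier than reordering the data.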

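【Discussion】:

Using by would eliminate the unique, vector, names, for, and [ lines!
That's a great suggestion. I think it would be valuable to add it as an answer to the question.

【Solution 2】:

Here is an elegant "tidyverse" approach. It uses the tidy function from broom, which lets you store the output of a t-test as a data frame.

The group_by and do functions from the dplyr package are not formally a for loop, but they accomplish the same thing as one.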

library(dplyr)
library(broom)

# Generate example data
  df <- data.frame("crop" = rep(c('Beans', 'Corn', 'Potatoes'), 10),
                   "value" = rnorm(n = 30),
                   "pattern" = rep(c("inter", "mono"), 15),
                   stringsAsFactors = TRUE)

# Group the data by crop, and run a t-test for each subset of data.
# Use the tidy function from the broom package
# to capture the t.test output as a data frame

  df %>% 
    group_by(crop) %>% 
    do(tidy(t.test(formula = value ~ pattern,
                   data = .,
                   alternative = 'greater')))

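【Discussion】:

Yes! I was close to this early on. Thanks for your help.
Of course. Good luck with your analysis!

【Solution 3】:

Consider by, the object-oriented wrapper to tapply, designed to subset a data frame by one or more factors and run an operation on each subset: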

t_test_list <- by(df, df$crop, function(sub) 
                   t.test(formula = value ~ pattern,
                          data = sub, alternative = 'greater')
                 )
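The result of by is a list-like object holding one htest result per crop, so individual pieces can be pulled out with sapply. The following is a sketch under assumptions: the seed and the name p_values are introduced here for illustration, and the data frame is rebuilt so the block runs on its own.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Same structure as the example data used in the answers
df <- data.frame(crop = rep(c("Beans", "Corn", "Potatoes"), 10),
                 value = rnorm(n = 30),
                 pattern = rep(c("inter", "mono"), 15),
                 stringsAsFactors = TRUE)

# One one-sided t-test per crop, as in the answer
t_test_list <- by(df, df$crop, function(sub)
  t.test(value ~ pattern, data = sub, alternative = "greater"))

# Extract the p-value from each htest object: one entry per crop
p_values <- sapply(t_test_list, function(res) res$p.value)
p_values

# A single test can also be inspected by crop name
t_test_list[["Potatoes"]]$estimate
```

The same pattern works for other components of an htest object, such as statistic or conf.int.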

【Discussion】: