【问题标题】:Looping TukeyHSD function through multiple columns of a dataframe通过数据帧的多列循环 TukeyHSD 函数
【发布时间】:2021-09-13 21:01:53
【问题描述】:

我正在尝试通过数据框的每一列循环 TukeyHSD 测试并比较治疗水平。这是一些模拟数据,是我拥有的数据的简化版本(我的数据有大约 350 列):

df1 <- data.frame(cmpd1 = c(500,436,1,1,1,1),
                  cmpd2 = c(1,1,1,1,1,253),
                  cmpd3 = c(1,1,300,57,150,260),
                  treatment=c("W","W","A","A","D","D"))

我已成功遵循 this post 中的建议,并创建了一个循环,对每一列运行 ANOVA,仅输出 p 值

# specific compound differences
for (i in 1:3){
  column <- names(df1[i])
  anova <- broom::tidy(aov(df1[,i] ~ treatment, data = df1))
  
  # only want aov with P < 0.07 printed
  if(anova$p.value[1] < 0.07) {
    
    print(column)
    print(anova)
  }
}

但是,我想以类似的方式在所有列上运行 TukeyHSD 测试,对于任何给定的处理比较,只输出 p 值

for (i in 1:3){
  column <- names(df1[i])
  anova <- aov(df1[,i] ~ treatment, data = df1)
  tukey <- TukeyHSD(anova)
  
  # only want tukey with P < 0.07 printed
  if(tukey[["p adj"]] < 0.07) {
    
    print(column)
    print(tukey)
  }
}

我想不出正确的方法让它只输出包含 p 值

$cmpd1
                    diff       lwr      upr     p adj
D-A          2.728484e-12 -29169.59 29169.59 1.0000000
W-A          3.637979e-12 -32278.10 32278.10 0.0001
W-D          1.484573e+04 -13620.88 43312.34 0.056

【问题讨论】:

    标签: r loops


    【解决方案1】:

    TukeyHSD 的输出是 list,从 structure 可以看出

    str(TukeyHSD(aov(df1[,1] ~ treatment, data = df1)))
    List of 1
     $ treatment: num [1:3, 1:4] -2.84e-14 4.67e+02 4.67e+02 -1.09e+02 3.58e+02 ...
      ..- attr(*, "dimnames")=List of 2
      .. ..$ : chr [1:3] "D-A" "W-A" "W-D"
      .. ..$ : chr [1:4] "diff" "lwr" "upr" "p adj"
     - attr(*, "class")= chr [1:2] "TukeyHSD" "multicomp"
     - attr(*, "orig.call")= language aov(formula = df1[, 1] ~ treatment, data = df1)
     - attr(*, "conf.level")= num 0.95
     - attr(*, "ordered")= logi FALSE
    

    我们可以提取matrix 的列表元素“治疗”,因此[[$ 将不起作用。我们可以使用[ 和列名以及, 来分隔行/列索引或名称并用any 包装,因为'p adj' 有3 个值(if 需要一个 TRUE/FALSE逻辑输入)

    for (i in 1:3){
      column <- names(df1[i])
      anova <- aov(df1[,i] ~ treatment, data = df1)
      tukey <- TukeyHSD(anova)
      
      # only want tukey with P < 0.07 printed
      if(any(tukey$treatment[, "p adj"] < 0.07)) {
        
        print(column)
         print(setNames(tukey, column))
    
      }
    }
    

    -输出

    [1] "cmpd1"
      Tukey multiple comparisons of means
        95% family-wise confidence level
    
    Fit: aov(formula = df1[, i] ~ treatment, data = df1)
    
    $cmpd1
                 diff       lwr      upr    p adj
    D-A -2.842171e-14 -109.1823 109.1823 1.000000
    W-A  4.670000e+02  357.8177 576.1823 0.000839
    W-D  4.670000e+02  357.8177 576.1823 0.000839
    

    【讨论】:

      猜你喜欢
      • 1970-01-01
      • 1970-01-01
      • 2019-04-12
      • 2020-04-03
      • 2020-07-22
      • 2019-12-20
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      相关资源
      最近更新 更多