【Question title】: How to summarise data based on median, creating fast and slow columns in R
【Posted】: 2021-09-01 20:32:42
【Question】:

I have a data frame with several IDs, each of which has three conditions and corresponding data points (ReacTime).

|ID|Condition|ReacTime|
|1 | Cong    |537     |
|1 | Incong  |541     |
|1 | Cong    |500     |
|1 | Cong    |520     |
|1 | Incong  |537     |
|1 | Cong    |599     |
|2 | Cong    |650     |
|2 | Incong  |708     |
|2 | Cont    |672     |
|2 | Cong    |676     |
|2 | Incong  |822     |
|2 | Cont    |609     |
|3 | Cong    |630     |
|3 | Incong  |725     |
|3 | Cont    |680     |
|3 | Cong    |625     |
|3 | Incong  |700     |
|3 | Cont    |620     |

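I have found the median ReacTime for each ID, and now I need a slow and a fast value for each ID: the average of all values per condition before (below) the median (slow) and the average of all values after (above) the median (fast).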

For the median I used the summarise function:

Df2<- summarise(group_by(Df1, ID),medianvalue = median(ReacTime))

For fast and slow I tried quantile:

 Df2 <- summarise(group_by(Df2, ID,Condition), 
                            Slow = quantile(ReacTime, probs = 0.5), 
                            Fast = quantile(ReacTime, probs = ?).

I don't know what to put in the probs argument for Fast.

【Comments on the question】:

    Tags: r median summarize


    【Solution 1】:

    You can compute this within the same summarise call:

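    library(dplyr)
    
    df %>%
      group_by(ID) %>%
      # medianvalue is defined first so the next two expressions can reuse it
      summarise(medianvalue = median(ReacTime), 
                # values equal to the median fall into neither group
                Slow = mean(ReacTime[ReacTime < medianvalue]), 
                Fast = mean(ReacTime[ReacTime > medianvalue]))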
    
    #     ID medianvalue  Slow  Fast
    #  <int>       <dbl> <dbl> <dbl>
    #1     1         537  510   570 
    #2     2         674  644.  735.
    #3     3         655  625   702.
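
    The question also mentions averaging "per condition". A minimal sketch of one way to read that, assuming the threshold stays the per-ID median while the means are taken within each ID and Condition (this interpretation is an assumption, not something the asker confirmed). A condition with no values on one side of the median yields NaN, because mean() of an empty vector is NaN.

    ```r
    library(dplyr)

    # Same data as in the question, rebuilt here so the block runs on its own
    df <- data.frame(
      ID = rep(1:3, each = 6),
      Condition = c("Cong", "Incong", "Cong", "Cong", "Incong", "Cong",
                    "Cong", "Incong", "Cont", "Cong", "Incong", "Cont",
                    "Cong", "Incong", "Cont", "Cong", "Incong", "Cont"),
      ReacTime = c(537, 541, 500, 520, 537, 599,
                   650, 708, 672, 676, 822, 609,
                   630, 725, 680, 625, 700, 620)
    )

    res <- df %>%
      group_by(ID) %>%
      # per-ID median, repeated on every row of that ID
      mutate(medianvalue = median(ReacTime)) %>%
      group_by(ID, Condition) %>%
      summarise(medianvalue = first(medianvalue),
                Slow = mean(ReacTime[ReacTime < medianvalue]),
                Fast = mean(ReacTime[ReacTime > medianvalue]),
                .groups = "drop")

    print(res)
    ```

    With this data, ID 1 / Cong gives Slow = 510 and Fast = 599, while ID 1 / Incong has a NaN Slow value because neither of its two observations lies below 537.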
    

    【Comments】:

      【Solution 2】:

      Using data.table:

      library(data.table)
      setDT(df)[, {
                   v1 <- median(ReacTime)
                   .(medianvalue = v1, Slow = mean(ReacTime[ReacTime < v1]),
                     Fast = mean(ReacTime[ReacTime > v1]))
                 }, .(ID)]
      

      Output:

         ID medianvalue     Slow     Fast
      1:  1         537 510.0000 570.0000
      2:  2         674 643.6667 735.3333
      3:  3         655 625.0000 701.6667
      

      Data

      df <- structure(list(ID = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
      2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L), Condition = c("Cong", "Incong", 
      "Cong", "Cong", "Incong", "Cong", "Cong", "Incong", "Cont", "Cong", 
      "Incong", "Cont", "Cong", "Incong", "Cont", "Cong", "Incong", 
      "Cont"), ReacTime = c(537L, 541L, 500L, 520L, 537L, 599L, 650L, 
      708L, 672L, 676L, 822L, 609L, 630L, 725L, 680L, 625L, 700L, 620L
      )), class = "data.frame", row.names = c(NA, -18L))
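
      For comparison, the same per-ID computation can be sketched in base R without any extra packages. The helper name slow_fast is hypothetical and only used for illustration; the data frame is rebuilt in compact form so the block runs on its own.

      ```r
      # Same ID and ReacTime values as in the data above (Condition is not needed here)
      df <- data.frame(
        ID = rep(1:3, each = 6),
        ReacTime = c(537, 541, 500, 520, 537, 599,
                     650, 708, 672, 676, 822, 609,
                     630, 725, 680, 625, 700, 620)
      )

      # Hypothetical helper: median plus the means strictly below and above it
      slow_fast <- function(x) {
        m <- median(x)
        c(medianvalue = m, Slow = mean(x[x < m]), Fast = mean(x[x > m]))
      }

      # One row per ID; row names are the ID values
      res <- do.call(rbind, lapply(split(df$ReacTime, df$ID), slow_fast))
      print(res)
      ```

      The result is a numeric matrix rather than a data frame; wrapping it in as.data.frame() would give a table comparable to the data.table output above.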
      

      【Comments】:
