【Question title】: How to summarize historical data by ID in R
【Posted】: 2018-02-22 15:56:12
【Question】:

I have this data:

id | result
---|-------
1  | a
1  | b
1  | c
2  | e
2  | f
2  | g

The data frame I actually want looks like this:

id | result | history
---|--------|--------
1  | a      |
1  | b      | a
1  | c      | a,b
2  | e      |
2  | f      | e
2  | g      | e,f

I tried using lag in R, but it does not work for this case. Can anyone help?

【Comments】:

    Tags: r string concatenation row


    【Solution 1】:

    Here is an option using data.table:

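    library(data.table)
    # df1 is the question's data frame; setDT() converts it to a data.table by reference.
    # shift() lags result by one row within each id, and Reduce(paste, ..., accumulate = TRUE)
    # builds the running, space-separated history.
    setDT(df1)[, history := Reduce(paste, shift(result, fill = ""), accumulate = TRUE), id]
    df1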
    #   id result history
    #1:  1      a        
    #2:  1      b       a
    #3:  1      c     a b
    #4:  2      e        
    #5:  2      f       e
    #6:  2      g     e f
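One detail that the printed output above does not make obvious: `paste` separates its arguments with a space by default, so pasting onto the `""` fill value leaves a leading space in every non-empty history. The following base R sketch reproduces the accumulation for a single group and strips that padding with `trimws`. It is an illustration only: `c("", head(x, -1))` is used as a stand-in for `shift(x, fill = "")`, and the variable names are made up here.

```r
x <- c("a", "b", "c")            # results of one id group
shifted <- c("", head(x, -1))    # base R stand-in for shift(x, fill = "")
h <- Reduce(paste, shifted, accumulate = TRUE)
h                                # "" " a" " a b"
trimws(h)                        # "" "a" "a b"
```

If the leading space matters downstream, wrapping the `Reduce(...)` call in `trimws()` inside the data.table expression should have the same effect.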
    

    If we need a comma as the separator:

    # result[-.N] drops the last row of each group; "" is the history of the first row
    setDT(df1)[, history := c("", Reduce(function(...) paste(..., sep = ","),
                result[-.N], accumulate = TRUE)), id]
    df1
    #   id result history
    #1:  1      a        
    #2:  1      b       a
    #3:  1      c     a,b
    #4:  2      e        
    #5:  2      f       e
    #6:  2      g     e,f
    

    【Comments】:

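      【Solution 2】:
      # Base R: build the running history per id with tapply(), then flatten with unlist().
      # This assumes the rows of df are already sorted by id.
      df$History = unlist(tapply(X = df$result, INDEX = df$id, function(a)
          c("", Reduce(function(x, y) {paste(x, y, sep = ", ")},
                       head(a, -1),
                       accumulate = TRUE))))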
      df
      #  id result History
      #1  1      a        
      #2  1      b       a
      #3  1      c    a, b
      #4  2      e        
      #5  2      f       e
      #6  2      g    e, f
      

      Data

      df = structure(list(id = c(1L, 1L, 1L, 2L, 2L, 2L), result = c("a", 
              "b", "c", "e", "f", "g")), .Names = c("id", "result"),
              class = "data.frame", row.names = c(NA, -6L))
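
If the rows are not guaranteed to be grouped by id, a variant built on base R's `ave` may be safer, since `ave` writes each group's result back to that group's original row positions. This is only a sketch and not part of the answer above: the helper name `history_of` is made up for illustration, and the data frame `df2` deliberately interleaves the two ids.

```r
# Same values as the question, but with the two ids interleaved
df2 <- data.frame(id = c(1L, 2L, 1L, 2L, 1L, 2L),
                  result = c("a", "e", "b", "f", "c", "g"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: element i gets the first i - 1 values joined by ","
history_of <- function(x) {
  vapply(seq_along(x),
         function(i) paste(x[seq_len(i - 1)], collapse = ","),
         character(1))
}

# ave() applies history_of per id and keeps the original row order
df2$history <- ave(df2$result, df2$id, FUN = history_of)
df2$history
# "" "" "a" "e" "a,b" "e,f"
```

The helper re-pastes the prefix for every row, so its cost grows quadratically with group size; for small groups like these that should not matter.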
      

      【Comments】:
