【Question title】: R: count consecutive occurrences of values in a single column and by group
【Posted】: 2019-02-07 20:05:49
【Question description】:

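I am trying to create a running count of consecutive equal values, i.e. the position of each row within its run. However, I want the count to reset whenever a new ID starts, even if the run of values continues across the ID boundary.

A sample of my data: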

dataset <- data.frame(ID = 
c("a","a","a","a","a","a","a","b","b","b","b","b","b","b"))
dataset$YesNO <- c(1,1,0,0,0,1,1,1,1,1,0,0,0,0)

So I want to create a new column that results in:

c(1,2,1,2,3,1,2,1,2,3,1,2,3,4)

I used this code, which I found on this forum:

dataset$Counter <- sequence(rle(as.character(dataset$YesNO))$lengths)

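However, this does not reset the count when a new ID starts. Instead, the sequential count continues across the ID boundary, and the resulting output is: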

c(1,2,1,2,3,1,2,3,4,5,1,2,3,4)

Which step am I missing to reset it by ID?

Thanks!

【Question discussion】:

    Tags: r count sequence


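    【Solution 1】:

    Use rleid (from the data.table package) to build a grouping variable that changes whenever ID or YesNO changes, then use ave to apply seq_along within each group of that variable: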

    library(data.table)
    transform(dataset, Counter = ave(YesNO, rleid(ID, YesNO), FUN = seq_along))
    

    giving:

       ID YesNO Counter
    1   a     1       1
    2   a     1       2
    3   a     0       1
    4   a     0       2
    5   a     0       3
    6   a     1       1
    7   a     1       2
    8   b     1       1
    9   b     1       2
    10  b     1       3
    11  b     0       1
    12  b     0       2
    13  b     0       3
    14  b     0       4
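
To see what the grouping variable looks like, the sketch below rebuilds the question's data and prints the run ids before counting within them. It assumes the data.table package is installed; the names run_id and counter are only illustrative.

```r
library(data.table)

# Sample data from the question
dataset <- data.frame(
  ID    = rep(c("a", "b"), each = 7),
  YesNO = c(1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0)
)

# rleid() starts a new id whenever ID or YesNO differs from the previous row
run_id <- rleid(dataset$ID, dataset$YesNO)
print(run_id)
# [1] 1 1 2 2 2 3 3 4 4 4 5 5 5 5

# Counting positions within each run id gives a counter that resets per ID
counter <- ave(dataset$YesNO, run_id, FUN = seq_along)
print(counter)
# [1] 1 2 1 2 3 1 2 1 2 3 1 2 3 4
```

Row 8 changes ID while YesNO stays at 1, so it opens run 4 instead of extending run 3, and that is what resets the count.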
    

    【Discussion】:

      【Solution 2】:

      A dplyr possibility as well:

      library(dplyr)

      dataset %>%
       group_by(ID, grp = with(rle(YesNO), rep(seq_along(lengths), lengths))) %>%
       mutate(Counter = seq_along(grp)) %>%
       ungroup() %>%
       select(-grp)
      
         ID    YesNO Counter
         <fct> <dbl>   <int>
       1 a        1.       1
       2 a        1.       2
       3 a        0.       1
       4 a        0.       2
       5 a        0.       3
       6 a        1.       1
       7 a        1.       2
       8 b        1.       1
       9 b        1.       2
      10 b        1.       3
      11 b        0.       1
      12 b        0.       2
      13 b        0.       3
      14 b        0.       4
      

      Or:

      dataset %>%
       group_by(ID, grp = with(rle(YesNO), rep(seq_along(lengths), lengths))) %>%
       mutate(Counter = 1:n()) %>%
       ungroup() %>%
       select(-grp)
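
With a recent dplyr, the run id can also be built without rle. This is a sketch that assumes dplyr 1.1.0 or later, where consecutive_id() is available; grp and result are illustrative names.

```r
library(dplyr)

# Sample data from the question
dataset <- data.frame(
  ID    = rep(c("a", "b"), each = 7),
  YesNO = c(1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0)
)

result <- dataset %>%
  # consecutive_id() increments whenever ID or YesNO changes from the previous row
  group_by(grp = consecutive_id(ID, YesNO)) %>%
  # row_number() counts positions within each run
  mutate(Counter = row_number()) %>%
  ungroup() %>%
  select(-grp)

print(result$Counter)
```

Passing both ID and YesNO to consecutive_id() means a single grouping column is enough, since a change in either column starts a new run.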
      

      【Discussion】:

        【Solution 3】:

        You can apply the original rle approach separately within each ID:

        dataset$Counter <- with(dataset,
                                ave(YesNO, ID, FUN = function(x) sequence(rle(as.character(x))$lengths)))
        

        Output:

           ID YesNO Counter
        1   a     1       1
        2   a     1       2
        3   a     0       1
        4   a     0       2
        5   a     0       3
        6   a     1       1
        7   a     1       2
        8   b     1       1
        9   b     1       2
        10  b     1       3
        11  b     0       1
        12  b     0       2
        13  b     0       3
        14  b     0       4
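
The same idea can be written as a small base R check against the vector the question asks for. This is only a sketch: run_position is a hypothetical helper name, and expected is copied from the question.

```r
# Hypothetical helper: position of each element within its run of equal values
run_position <- function(x) sequence(rle(as.character(x))$lengths)

# Sample data from the question
dataset <- data.frame(
  ID    = rep(c("a", "b"), each = 7),
  YesNO = c(1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0)
)

# ave() splits YesNO by ID, so the runs of each ID are encoded separately
dataset$Counter <- ave(dataset$YesNO, dataset$ID, FUN = run_position)

# Desired result as stated in the question
expected <- c(1, 2, 1, 2, 3, 1, 2, 1, 2, 3, 1, 2, 3, 4)
stopifnot(all(dataset$Counter == expected))
```

Splitting by ID first is the step missing from the original one-liner: rle only ever sees one ID's values at a time, so a run cannot continue across the boundary between IDs.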
        

        【Discussion】:
