【Question title】: How to count number of visits in data.table
【Posted】: 2020-11-12 19:33:07
【Question】:

I need to create a new column in a data table in RStudio that classifies my data by "visit number".

Here is an example data table:

library(data.table)
reprex_1 = data.table(
  `Receiver Number`=c("Receiver A", "Receiver B", "Receiver B","Receiver B","Receiver B", "Receiver B", "Receiver B","Receiver C", "Receiver C", "Receiver C"),
  Transmitter = c("Tag 1", "Tag 2" , "Tag 3" , "Tag 3",  "Tag 3" , "Tag 3" , "Tag 3","Tag 4" ,"Tag 4",  "Tag 4"),
  `Station Name` = c("Station A","Station B","Station B","Station B","Station B", "Station B","Station B","Station C","Station C","Station C"),
  TimeDiff = c( NA,NA,NA,221536,1114, 425,10728,110131,61,43)
)
Receiver Number  Transmitter Station Name TimeDiff
Receiver A       Tag 1       Station A       NA
Receiver B       Tag 2       Station B       NA
Receiver B       Tag 3       Station B       NA
Receiver B       Tag 3       Station B   221536
Receiver B       Tag 3       Station B     1114
Receiver B       Tag 3       Station B      425
Receiver B       Tag 3       Station B    10728
Receiver C       Tag 4       Station C   110131
Receiver C       Tag 4       Station C       61
Receiver C       Tag 4       Station C       43

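I need to create a new Visit column in which each visit is defined by the combination of Receiver Number, Transmitter and Station Name, and in which a TimeDiff > 1800 or NA also starts a new visit. I would like the visits numbered consecutively (1, 2, 3, ...).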

This is what I would like:

Receiver Number Transmitter Station Name TimeDiff Visit
Receiver A       Tag 1      Station A       NA     1
Receiver B       Tag 2      Station B       NA     2
Receiver B       Tag 3      Station B       NA     3
Receiver B       Tag 3      Station B   221536     4
Receiver B       Tag 3      Station B     1114     4
Receiver B       Tag 3      Station B      425     4
Receiver B       Tag 3      Station B    10728     5
Receiver C       Tag 4      Station C   110131     6
Receiver C       Tag 4      Station C       61     6
Receiver C       Tag 4      Station C       43     6

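I have looked at other examples of classifying rows based on grouped data, and I can get R to create visits from the unique combinations of the first three columns (Receiver Number, Transmitter and Station Name), but I cannot figure out how to include the TimeDiff > 1800 condition so that it starts a new visit.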

This is as far as I could get; it does not create a new visit when TimeDiff > 1800:

require(data.table)
setDT(reprex_1)[,AttemptVisit:=.GRP, by = c("Receiver Number","Station Name", "Transmitter")]

Receiver Number Transmitter Station Name TimeDiff AttemptVisit
Receiver A       Tag 1    Station A       NA            1
Receiver B       Tag 2    Station B       NA            2
Receiver B       Tag 3    Station B       NA            3
Receiver B       Tag 3    Station B   221536            3
Receiver B       Tag 3    Station B     1114            3
Receiver B       Tag 3    Station B      425            3
Receiver B       Tag 3    Station B    10728            3
Receiver C       Tag 4    Station C   110131            4
Receiver C       Tag 4    Station C       61            4
Receiver C       Tag 4    Station C       43            4 

I would appreciate any help you can offer!

【Discussion】:

    Tags: r data.table


    【Solution 1】:

    I think this should work. We use the cumsum of the NA or >1800 flags as part of the grouping:

    reprex_1[, visit := .GRP,
             by = .(`Receiver Number`, Transmitter, `Station Name`, cumsum(TimeDiff > 1800 | is.na(TimeDiff)))]
    # reprex_1
    #     Receiver Number Transmitter Station Name TimeDiff visit
    #  1:      Receiver A       Tag 1    Station A       NA     1
    #  2:      Receiver B       Tag 2    Station B       NA     2
    #  3:      Receiver B       Tag 3    Station B       NA     3
    #  4:      Receiver B       Tag 3    Station B   221536     4
    #  5:      Receiver B       Tag 3    Station B     1114     4
    #  6:      Receiver B       Tag 3    Station B      425     4
    #  7:      Receiver B       Tag 3    Station B    10728     5
    #  8:      Receiver C       Tag 4    Station C   110131     6
    #  9:      Receiver C       Tag 4    Station C       61     6
    # 10:      Receiver C       Tag 4    Station C       43     6
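
To make the grouping key easier to follow, here is a minimal sketch that splits the one-liner into intermediate steps. The helper columns new_visit and break_id are hypothetical names introduced only for illustration; they are not part of the original answer. The sketch also assumes the rows are already sorted so that detections belonging to the same receiver, transmitter and station are contiguous, as in the sample data.

```r
library(data.table)

# Same sample data as in the question
reprex_1 <- data.table(
  `Receiver Number` = c("Receiver A", rep("Receiver B", 6), rep("Receiver C", 3)),
  Transmitter       = c("Tag 1", "Tag 2", rep("Tag 3", 5), rep("Tag 4", 3)),
  `Station Name`    = c("Station A", rep("Station B", 6), rep("Station C", 3)),
  TimeDiff          = c(NA, NA, NA, 221536, 1114, 425, 10728, 110131, 61, 43)
)

# TRUE where a row starts a new visit: gap above 1800, or no TimeDiff at all.
# NA > 1800 is NA, but NA | TRUE is TRUE, so NA rows are flagged as well.
reprex_1[, new_visit := TimeDiff > 1800 | is.na(TimeDiff)]

# Running count of visit starts; it stays constant for the rows within one visit
reprex_1[, break_id := cumsum(new_visit)]

# Number the groups formed by the three identifiers plus the running count
reprex_1[, visit := .GRP,
         by = .(`Receiver Number`, Transmitter, `Station Name`, break_id)]

print(reprex_1)
```

On this sample, break_id alone already matches the desired numbering because every change of receiver, transmitter or station coincides with an NA or a large TimeDiff. Keeping the three identifiers in the key guards against data where that is not the case.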
    

    【Comments】:

    • Another option is reprex_1[, visit := rleid(`Receiver Number`, Transmitter, `Station Name`, cumsum(TimeDiff > 1800 | is.na(TimeDiff)))]
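
As a quick check, the sketch below runs both variants on the question's sample data and compares the results. The column names visit_grp and visit_rleid are hypothetical, chosen only to keep the two results side by side. Note that the column names containing spaces need backticks.

```r
library(data.table)

# Same sample data as in the question
reprex_1 <- data.table(
  `Receiver Number` = c("Receiver A", rep("Receiver B", 6), rep("Receiver C", 3)),
  Transmitter       = c("Tag 1", "Tag 2", rep("Tag 3", 5), rep("Tag 4", 3)),
  `Station Name`    = c("Station A", rep("Station B", 6), rep("Station C", 3)),
  TimeDiff          = c(NA, NA, NA, 221536, 1114, 425, 10728, 110131, 61, 43)
)

# Variant 1: group number from by = ... (the answer above)
reprex_1[, visit_grp := .GRP,
         by = .(`Receiver Number`, Transmitter, `Station Name`,
                cumsum(TimeDiff > 1800 | is.na(TimeDiff)))]

# Variant 2: run-length id over the same four vectors (the comment)
reprex_1[, visit_rleid := rleid(`Receiver Number`, Transmitter, `Station Name`,
                                cumsum(TimeDiff > 1800 | is.na(TimeDiff)))]

# Both give the same numbering on this sample
print(identical(reprex_1$visit_grp, reprex_1$visit_rleid))
```

The two can differ on unsorted data: rleid starts a new id whenever any of the vectors changes from one row to the next, while .GRP gives the same number to every row sharing a key, even if those rows are not adjacent.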