【Question title】: cumsum with date and categorical variable in R
【Posted】: 2020-05-19 05:33:02
【Question】:

I have this dataset:

df <- data.frame(Date = c("12-01-2019","12-01-2019","12-02-2019","12-02-2019","12-02-2019","12-03-2019"),
                 Country = c("France","USA","France","USA","Colombia","USA"))

I would like to apply cumsum with dplyr and get this result:

Date          Country cumsum
"12-01-2019" "France"   1
"12-01-2019" "USA"      1
"12-01-2019" "Colombia" 0
"12-02-2019" "France"   2
"12-02-2019" "USA"      2
"12-02-2019" "Colombia" 1
"12-03-2019" "France"   2
"12-03-2019" "USA"      3
"12-03-2019" "Colombia" 1

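Any suggestions?

Thank you very much for your help.

Regards!

【Comments】:

  • Where do the values you want to sum come from?
  • Why does Colombia start at 0 while the other countries start at 1?
  • Because there were no cases on December 1.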

Tags: r dplyr cumsum


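【Solution 1】:

We can count the number of rows for each Date and Country combination, then complete the missing dates for each Country, filling the count with 0. Finally, for each Country, we can take the cumsum of the counts.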

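library(dplyr)

df %>%
  # parse the month-day-year strings into Date objects
  mutate(Date = lubridate::mdy(Date)) %>%
  # number of rows per Date/Country combination
  count(Date, Country) %>%
  # add the missing Date/Country combinations with a count of 0
  tidyr::complete(Country, Date = seq(min(Date), max(Date), by = 'day'),
                  fill = list(n = 0)) %>%
  group_by(Country) %>%
  # running total within each Country
  mutate(n = cumsum(n))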


#  Country  Date           n
#  <chr>    <date>     <dbl>
#1 Colombia 2019-12-01     0
#2 Colombia 2019-12-02     1
#3 Colombia 2019-12-03     1
#4 France   2019-12-01     1
#5 France   2019-12-02     2
#6 France   2019-12-03     2
#7 USA      2019-12-01     1
#8 USA      2019-12-02     2
#9 USA      2019-12-03     3
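
For comparison, the same count, complete, cumsum idea can be sketched in base R without any packages. This is an alternative technique, not the code from the answer above: it relies on table() reporting a 0 for every factor level combination that does not occur, and the names dates, all_days, counts, running and result are made up for the illustration.

```r
# Sample data from the question
df <- data.frame(
  Date = c("12-01-2019", "12-01-2019", "12-02-2019",
           "12-02-2019", "12-02-2019", "12-03-2019"),
  Country = c("France", "USA", "France", "USA", "Colombia", "USA"),
  stringsAsFactors = FALSE
)

dates <- as.Date(df$Date, format = "%m-%d-%Y")
all_days <- seq(min(dates), max(dates), by = "day")

# Using a factor with every calendar day as a level makes table() report
# days or Date/Country pairs that never occur as a count of 0
counts <- table(
  Date = factor(as.character(dates), levels = as.character(all_days)),
  Country = df$Country
)

# Cumulative sum down each column, i.e. over dates within one country
running <- apply(counts, 2, cumsum)

# Back to long format: one row per Country and Date
result <- as.data.frame(as.table(running), stringsAsFactors = FALSE)
names(result) <- c("Date", "Country", "cumsum")
result$Date <- as.Date(result$Date)
```

On the sample data, result should hold the same nine Country/Date rows as the tibble above, with the running totals in a column named cumsum.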

【Comments】:

  • Thank you very much for your support! Regards!
【Solution 2】:

We can use a data.table approach, which should be fast.

library(data.table)
library(tidyr)  # only needed for replace_na()
setDT(df)[, Date := as.IDate(Date, "%m-%d-%Y")][,
  .N, .(Date, Country)][CJ(Date, Country, unique = TRUE),
  on = .(Date, Country)][,  N := cumsum(replace_na(N, 0)),Country][]
#         Date  Country N
#1: 2019-12-01 Colombia 0
#2: 2019-12-01   France 1
#3: 2019-12-01      USA 1
#4: 2019-12-02 Colombia 1
#5: 2019-12-02   France 2
#6: 2019-12-02      USA 2
#7: 2019-12-03 Colombia 1
#8: 2019-12-03   France 2
#9: 2019-12-03      USA 3
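
If the tidyr dependency is unwanted, the chain can also be written with data.table alone. The following is a sketch rather than the answer's own code: it assumes data.table is installed, builds a fresh table instead of converting df by reference with setDT, and the names dt, counts, grid and res are illustrative.

```r
library(data.table)

# Sample data from the question
dt <- data.table(
  Date = c("12-01-2019", "12-01-2019", "12-02-2019",
           "12-02-2019", "12-02-2019", "12-03-2019"),
  Country = c("France", "USA", "France", "USA", "Colombia", "USA")
)
dt[, Date := as.IDate(Date, format = "%m-%d-%Y")]

# Rows per Date/Country combination that actually occurs
counts <- dt[, .N, by = .(Date, Country)]

# Every combination of the observed dates and countries, sorted by CJ()
grid <- CJ(Date = dt$Date, Country = dt$Country, unique = TRUE)

# Join onto the full grid; combinations without data get N = NA
res <- counts[grid, on = .(Date, Country)]

# Replace the missing counts with 0, then accumulate within each country
res[is.na(N), N := 0L]
res[, N := cumsum(N), by = Country]
```

Like the original chain, this only covers dates that appear somewhere in the data. A day with no rows for any country would need to be added to grid explicitly.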
