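[Posted]: 2019-07-23 15:41:29
[Question]:
I am trying to count the number of positive events within a rolling 12-month window.
I can pad the data with one row for every missing day (365 rows per year) and use zoo::rollapply to sum the events over every 365 rows, but my data frame is very large and I want to do this for a bunch of variables, so it takes forever to run.
I can get the correct output with this: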
data <- data.frame(id = c("a","a","a","a","a","b","b","b","b","b"),
                   date = c("20-01-2011","20-04-2011","20-10-2011","20-02-2012",
                            "20-05-2012","20-01-2013","20-04-2013","20-10-2013",
                            "20-02-2014","20-05-2014"),
                   event = c(0,1,1,1,0,1,0,0,1,1))
library(lubridate)
library(dplyr)
library(tidyr)
library(zoo)
data %>%
  group_by(id) %>%
  # parse the dates and keep a running total of events per id
  mutate(date = dmy(date),
         cumsum = cumsum(event)) %>%
  # add one row per missing day so that 365 rows span 365 days
  complete(date = full_seq(date, period = 1), fill = list(event = 0)) %>%
  # sum the events over the trailing 365 rows (partial windows at the start)
  mutate(event12 = rollapplyr(event, width = 365, FUN = sum, partial = TRUE)) %>%
  # the padded rows have no cumsum value, so this drops them again
  drop_na(cumsum)
This gives:
id    date       event cumsum event12
<fct> <date>     <dbl>  <dbl>   <dbl>
a     2011-01-20     0      0       0
a     2011-04-20     1      1       1
a     2011-10-20     1      2       2
a     2012-02-20     1      3       3
a     2012-05-20     0      3       2
b     2013-01-20     1      1       1
b     2013-04-20     0      1       1
b     2013-10-20     0      1       1
b     2014-02-20     1      2       1
b     2014-05-20     1      3       2
But I would like to know whether there is a more efficient way, for example how to make the width in rollapply count dates rather than rows.
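One possible way to make the window count days instead of rows, without padding the data, is sketched below in base R. It is only an illustration, not a tested replacement for the pipeline above: the helper name rolling_event_count is hypothetical, and the sketch assumes the dates are sorted in ascending order within each id and that a "12-month" window means the trailing 365 days including the current date, which is what width = 365 on daily rows amounts to. It takes the running total of events and subtracts the total accumulated by the rows that have already fallen out of the window, which findInterval locates.

```r
# Hypothetical helper (not part of zoo or dplyr): sum `event` over the
# trailing `days`-day window that ends at each date.
# Assumes `date` is a Date vector sorted ascending for a single id.
rolling_event_count <- function(date, event, days = 365) {
  d  <- as.numeric(date)
  cs <- cumsum(event)
  # number of rows dated on or before (date - days), i.e. outside the window
  dropped <- findInterval(d - days, d)
  # running total minus the total accumulated by the dropped rows
  cs - c(0, cs)[dropped + 1]
}

data <- data.frame(
  id    = rep(c("a", "b"), each = 5),
  date  = as.Date(c("20-01-2011", "20-04-2011", "20-10-2011", "20-02-2012",
                    "20-05-2012", "20-01-2013", "20-04-2013", "20-10-2013",
                    "20-02-2014", "20-05-2014"), format = "%d-%m-%Y"),
  event = c(0, 1, 1, 1, 0, 1, 0, 0, 1, 1)
)

# sort so that each group's dates are ascending, then apply the helper per id
data <- data[order(data$id, data$date), ]
data$event12 <- unsplit(
  lapply(split(data, data$id),
         function(g) rolling_event_count(g$date, g$event)),
  data$id
)
data
```

Because this works on the original rows only, the cost should grow with the number of observations rather than the number of calendar days. If several rows share the same date within an id, each row only counts the events up to and including itself, so such data may need to be aggregated per day first.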
[Discussion]:
Tags: r dplyr tidyverse zoo rollapply