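【Posted】:2019-04-22 19:28:17
【Question】:
I want to create a new variable indicating whether visit_date falls within any of the date ranges listed for that id.
I have used the code below to compare row by row, but I would like to extend it so that every row for an id is compared against all of the interval rows listed for that id.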
library(dplyr)

df <- df %>%
  group_by(id) %>%
  # Row-wise check only: compares each visit_date with the start/end on the same row
  mutate(between_any = ifelse(visit_date >= start & visit_date <= end, 1, 0))
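I also tried creating an interval variable before the mutate and using crossing(visit_date, interval), but I could not get crossing to work with date objects.
Here is some sample data: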
df <- data.frame(id = c("a","a","a","a","a","b","b","b"),
                 visit_date = c("2001-08-22","2001-09-21","2001-10-30","2001-11-10","2001-12-20","2002-12-22", "2003-04-30","2003-05-10"),
                 start = c(NA,"2001-09-21",NA,"2001-11-10",NA,"2002-12-22", "2003-04-30",NA),
                 end = c(NA, "2001-11-01",NA,"2001-11-10",NA,"2002-12-22","2003-06-01",NA))
> df
id visit_date start      end
a  2001-08-22 <NA>       <NA>
a  2001-09-21 2001-09-21 2001-11-01
a  2001-10-30 <NA>       <NA>
a  2001-11-10 2001-11-10 2001-11-10
a  2001-12-20 <NA>       <NA>
b  2002-12-22 2002-12-22 2002-12-22
b  2003-04-30 2003-04-30 2003-06-01
b  2003-05-10 <NA>       <NA>
My desired output is as follows:
id visit_date start      end        between_any
a  2001-08-22 <NA>       <NA>       0
a  2001-09-21 2001-09-21 2001-11-01 1
a  2001-10-30 <NA>       <NA>       1
a  2001-11-10 2001-11-10 2001-11-10 1
a  2001-12-20 <NA>       <NA>       0
b  2002-12-22 2002-12-22 2002-12-22 1
b  2003-04-30 2003-04-30 2003-06-01 1
b  2003-05-10 <NA>       <NA>       1
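For reference, here is a minimal sketch of one way the group-wise comparison might be written. It is not a definitive solution. It assumes the three date columns can be parsed with as.Date(), and that rows with a missing start or end contribute no interval. The helper name in_any_interval and the stringsAsFactors = FALSE argument are my own additions for illustration and are not part of the original code.

```r
library(dplyr)

# Sample data from the question. stringsAsFactors = FALSE (an assumption added
# here) keeps the dates as character so that as.Date() can parse them directly.
df <- data.frame(
  id = c("a", "a", "a", "a", "a", "b", "b", "b"),
  visit_date = c("2001-08-22", "2001-09-21", "2001-10-30", "2001-11-10",
                 "2001-12-20", "2002-12-22", "2003-04-30", "2003-05-10"),
  start = c(NA, "2001-09-21", NA, "2001-11-10", NA, "2002-12-22",
            "2003-04-30", NA),
  end = c(NA, "2001-11-01", NA, "2001-11-10", NA, "2002-12-22",
          "2003-06-01", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each element of `dates`, return TRUE if it lies
# inside at least one closed interval [starts[j], ends[j]].
in_any_interval <- function(dates, starts, ends) {
  keep <- !is.na(starts) & !is.na(ends)  # rows without an interval are ignored
  starts <- starts[keep]
  ends <- ends[keep]
  vapply(
    seq_along(dates),
    function(i) any(dates[i] >= starts & dates[i] <= ends),
    logical(1)
  )
}

result <- df %>%
  # Convert the character columns to Date so the comparisons are chronological
  mutate(
    visit_date = as.Date(visit_date),
    start = as.Date(start),
    end = as.Date(end)
  ) %>%
  group_by(id) %>%
  # Within each id, every visit_date is checked against all of that id's intervals
  mutate(between_any = as.integer(in_any_interval(visit_date, start, end))) %>%
  ungroup()

print(result)
```

Because the helper receives whole columns for each group, every visit_date is compared with all intervals of the same id, not just the interval on its own row. For groups with many rows, a non-equi join might scale better, but that is beyond this sketch.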
Thanks in advance!
【Comments】: