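【Posted】: 2021-09-16 15:24:06
【Question】:
I have start and end dates of different products listed for a large number of users. The purchase intervals of different products may overlap, or there may be gaps between them: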
user_id start_date end_date product
12 31/10/2010 31/10/2011 A
12 18/12/2010 18/12/2011 A
12 31/10/2011 28/04/2014 B
12 18/12/2011 18/12/2014 A
12 27/03/2014 27/03/2015 A
12 18/12/2014 18/12/2016 B
12 27/03/2015 27/03/2016 B
12 18/12/2016 18/12/2017 D
33 01/07/1992 01/07/2016 A
33 20/08/1993 16/08/2016 B
33 28/10/1999 15/11/2012 A
33 31/01/2006 28/02/2006 B
33 26/08/2016 26/01/2017 C
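I would like to get the number of overlapping days for every possible product combination, for each patient (that is, each user_id). The desired output looks like this: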
user_id A_B A_C A_D B_C B_D C_D
12 20 days 0 days 10 days 0 days 0 days 0 days
33 10 days 0 days 0 days 0 days 20 days 20 days
Is there a quick and elegant way to code this, preferably in dplyr?
Thanks for your help!
Code:
library(lubridate)
library(Hmisc)
library(dplyr)
user_id <- c(rep(12, 8), rep(33, 5))
start_date <- dmy(Cs(31/10/2010, 18/12/2010, 31/10/2011, 18/12/2011, 27/03/2014, 18/12/2014, 27/03/2015, 18/12/2016, 01/07/1992, 20/08/1993, 28/10/1999, 31/01/2006, 26/08/2016))
end_date <- dmy(Cs(31/10/2011, 18/12/2011, 28/04/2014, 18/12/2014, 27/03/2015, 18/12/2016, 27/03/2016, 18/12/2017,
01/07/2016, 16/08/2016, 15/11/2012, 28/02/2006, 26/01/2017))
product <- c("A", "A","B","A","A","B","B","D","A","B","A","B", "C")
data <- data.frame(user_id, start_date, end_date, product )
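【Comments】:
- You have multiple A products in the first group. When you compute the overlapping days for A-B, which one do you mean?
- MarcBP: this is a different task, but thank you very much for reading. Anoushiravan R: the sum of all overlaps between A-B is still what is relevant.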
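Going by the clarification above (sum every A-B overlap, and likewise for the other pairs), one possible approach is sketched below. It is not from the original thread. It assumes that the overlap of two intervals is max(0, min(end) - max(start)) in days, that intervals which only touch at an endpoint count as 0 days, and that tidyr is available alongside dplyr. The names `pairs` and `overlap_wide` are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Same data as in the question, built from plain strings (no Hmisc::Cs needed)
data <- data.frame(
  user_id = c(rep(12, 8), rep(33, 5)),
  start_date = as.Date(c("31/10/2010", "18/12/2010", "31/10/2011", "18/12/2011",
                         "27/03/2014", "18/12/2014", "27/03/2015", "18/12/2016",
                         "01/07/1992", "20/08/1993", "28/10/1999", "31/01/2006",
                         "26/08/2016"), format = "%d/%m/%Y"),
  end_date = as.Date(c("31/10/2011", "18/12/2011", "28/04/2014", "18/12/2014",
                       "27/03/2015", "18/12/2016", "27/03/2016", "18/12/2017",
                       "01/07/2016", "16/08/2016", "15/11/2012", "28/02/2006",
                       "26/01/2017"), format = "%d/%m/%Y"),
  product = c("A", "A", "B", "A", "A", "B", "B", "D", "A", "B", "A", "B", "C"),
  stringsAsFactors = FALSE
)

# Pair every interval with every other interval of the same user
pairs <- merge(data, data, by = "user_id", suffixes = c("_1", "_2"))

overlap_wide <- pairs %>%
  # keep each unordered pair of *different* products once (A_B, never B_A or A_A)
  filter(product_1 < product_2) %>%
  mutate(
    pair = paste(product_1, product_2, sep = "_"),
    # overlap in days; negative values mean the intervals do not intersect
    days = pmax(0, as.numeric(pmin(end_date_1, end_date_2) -
                              pmax(start_date_1, start_date_2)))
  ) %>%
  group_by(user_id, pair) %>%
  summarise(days = sum(days), .groups = "drop") %>%
  # one column per product pair, 0 where a user never had that pair
  pivot_wider(names_from = pair, values_from = days,
              values_fill = list(days = 0))

overlap_wide
```

Two caveats with this sketch. Only pairs that occur for at least one user become columns, so C_D is absent here because no user holds both C and D. Also, where intervals of the same product overlap each other (for example the first two A rows of user 12), a calendar day can be counted more than once in the sum. If each day should count only once, the intervals of each product would have to be merged per user before the join.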
Tags: r