【Posted】: 2016-10-21 00:46:27
【Question】:
I have a table of person characteristics, for example:
person <- data.frame(
  group.id  = c("N", "N", "P"),
  person.id = c("A", "B", "C"),
  strt = as.Date(c("2002-07-01", "2003-08-01", "2004-06-23")),
  end  = as.Date(c("2003-08-01", "2004-09-01", "2006-07-01")),
  c = 1:3,
  d = 3:5
)
and a group table of group characteristics, for example:
# report.date, a, and b have length 3 and are recycled across the 9 rows
group <- data.frame(
  group.id    = c("N", "N", "N", "O", "O", "O", "P", "P", "P"),
  report.date = as.Date(c("2002-07-01", "2002-08-01", "2002-09-01")),
  a = 1:3,
  b = 4:6
)
I want to merge them by group.id and by the applicable date range, so that the result looks like this:
group2 <- data.frame(
  group,
  person.id = c("A", "A", "A", NA, NA, NA, NA, NA, NA),
  strt = as.Date(c("2002-07-01", "2002-07-01", "2002-07-01", NA, NA, NA, NA, NA, NA)),
  end  = as.Date(c("2003-08-01", "2003-08-01", "2003-08-01", NA, NA, NA, NA, NA, NA)),
  c = c(1, 1, 1, NA, NA, NA, NA, NA, NA),
  d = c(3, 3, 3, NA, NA, NA, NA, NA, NA)
)
  group.id report.date a b person.id       strt        end  c  d
1        N  2002-07-01 1 4         A 2002-07-01 2003-08-01  1  3
2        N  2002-08-01 2 5         A 2002-07-01 2003-08-01  1  3
3        N  2002-09-01 3 6         A 2002-07-01 2003-08-01  1  3
4        O  2002-07-01 1 4      <NA>       <NA>       <NA> NA NA
5        O  2002-08-01 2 5      <NA>       <NA>       <NA> NA NA
6        O  2002-09-01 3 6      <NA>       <NA>       <NA> NA NA
7        P  2002-07-01 1 4      <NA>       <NA>       <NA> NA NA
8        P  2002-08-01 2 5      <NA>       <NA>       <NA> NA NA
9        P  2002-09-01 3 6      <NA>       <NA>       <NA> NA NA
Does anyone have a suggestion for how to do this in R?
【Comments】:
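- You could use data.table and this post.
- Hi @Hack-R, I merge person and group on the shared variable group.id and check whether group$report.date falls between person$strt and person$end. In my example, I kept the NA values in row 4 and below because those rows have no corresponding entry in the example person table.
- @timothy.s.lau Thanks, I've just updated my answer based on that explanation. Please see whether it solves your problem.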
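
The logic described in the comments (join on group.id, then require report.date to fall between strt and end) can be sketched in base R as follows. This is only one possible approach, not necessarily what the referenced answer does. It assumes both range boundaries are inclusive, which the question does not state, and the intermediate names pairs, in.range, and result are made up for illustration.

```r
# Example data from the question.
person <- data.frame(
  group.id  = c("N", "N", "P"),
  person.id = c("A", "B", "C"),
  strt = as.Date(c("2002-07-01", "2003-08-01", "2004-06-23")),
  end  = as.Date(c("2003-08-01", "2004-09-01", "2006-07-01")),
  c = 1:3,
  d = 3:5
)
group <- data.frame(
  group.id    = rep(c("N", "O", "P"), each = 3),
  report.date = rep(as.Date(c("2002-07-01", "2002-08-01", "2002-09-01")), times = 3),
  a = rep(1:3, times = 3),
  b = rep(4:6, times = 3)
)

# Step 1: inner join on group.id, giving every person/report pair within a group.
pairs <- merge(group, person, by = "group.id")

# Step 2: keep only the pairs whose report.date lies inside [strt, end].
# Assumption: both ends of the range are inclusive.
in.range <- pairs[pairs$report.date >= pairs$strt &
                  pairs$report.date <= pairs$end, ]

# Step 3: left join back onto group, so reports without a matching person
# keep their row and get NA in the person columns.
result <- merge(group, in.range,
                by = c("group.id", "report.date", "a", "b"),
                all.x = TRUE)
result <- result[order(result$group.id, result$report.date), ]
rownames(result) <- NULL
print(result)
```

Two caveats. The intermediate join in step 1 grows with the number of person/report pairs per group, so for large tables a non-equi join in data.table may be the better fit. Also, with inclusive boundaries a report dated 2003-08-01 in group N would match both A and B and produce two rows.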