[Posted]: 2020-04-16 03:46:32
[Question]:
I have the following sample data:
library(tibble)
sample_data <- tibble(
  emp_name = c("john", "john", "john", "john", "john", "john", "john"),
  task = c("carpenter", "carpenter", "carpenter", "painter", "painter", "carpenter", "carpenter"),
  date_stamp = c("2019-01-01", "2019-01-02", "2019-01-03", "2019-01-07", "2019-01-08", "2019-01-30", "2019-02-02")
)
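I need to collapse these rows into date intervals.
The rule is: if no calendar day is missing between one date_stamp and the next date_stamp listed for the same attributes (emp_name and task), the rows should be merged into a single interval. Otherwise, the row stays on its own, and both date_stamp_from and date_stamp_to should equal its date_stamp.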
desired_result <- tibble(
  emp_name = c("john", "john", "john", "john"),
  task = c("carpenter", "painter", "carpenter", "carpenter"),
  date_stamp_from = c("2019-01-01", "2019-01-07", "2019-01-30", "2019-02-02"),
  date_stamp_to = c("2019-01-03", "2019-01-08", "2019-01-30", "2019-02-02"),
  count_dates = c(3, 2, 1, 1)
)
What is the most efficient way to do this? The original dataset has about 10,000 records.
[Comments]:
Tags: r aggregate-functions intervals
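For reference, one way to read the rule is as a "gaps and islands" problem: flag every row that starts a new run, take the cumulative sum of those flags as a run id, then summarise per run. The sketch below is only an illustration under stated assumptions. It assumes dplyr is available (the question itself only loads tibble), that date_stamp can be parsed with as.Date, and that an employee has at most one task per day. The helper columns new_run and run_id are names made up for this example.

```r
library(tibble)
library(dplyr)  # assumption: dplyr is installed; the question only loads tibble

sample_data <- tibble(
  emp_name = c("john", "john", "john", "john", "john", "john", "john"),
  task = c("carpenter", "carpenter", "carpenter", "painter", "painter", "carpenter", "carpenter"),
  date_stamp = c("2019-01-01", "2019-01-02", "2019-01-03", "2019-01-07", "2019-01-08", "2019-01-30", "2019-02-02")
)

result <- sample_data %>%
  mutate(date_stamp = as.Date(date_stamp)) %>%
  arrange(emp_name, date_stamp) %>%
  group_by(emp_name) %>%
  mutate(
    # A row starts a new run if it is the employee's first row, the task
    # changed, or the previous date is not exactly one day earlier.
    new_run = is.na(lag(date_stamp)) |
      task != lag(task) |
      as.integer(date_stamp - lag(date_stamp)) != 1L,
    # Hypothetical helper column: running count of run starts per employee.
    run_id = cumsum(new_run)
  ) %>%
  group_by(emp_name, task, run_id) %>%
  summarise(
    date_stamp_from = min(date_stamp),
    date_stamp_to = max(date_stamp),
    count_dates = n(),
    .groups = "drop"
  ) %>%
  arrange(emp_name, date_stamp_from) %>%
  select(-run_id)

print(result)
```

On the sample data this should give the same four intervals as desired_result, except that the date columns are of class Date and count_dates is an integer. Everything here is vectorised, so around 10,000 rows should not be a concern, though that has not been benchmarked.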