【Posted】: 2021-11-29 06:10:19
【Question】:
I am working with a data frame of timestamps. Here is an excerpt of the date-related variables for a sample covering January:
library(dplyr)

# one row per day of January 2021, stored as character strings
sample_dates <- data.frame(date = as.character(seq(as.Date("2021-01-01"), as.Date("2021-01-31"), by = "day")))
sample_dates <- sample_dates %>%
  mutate(date = as.POSIXct(date)) %>%
  # abbreviated weekday name; note that "%a" depends on the current locale
  mutate(day = factor(format(date, "%a")))
I would like to add a new factor variable, day_cat. In pseudocode it might look like this:
# pseudocode, not runnable R
sample_dates <- sample_dates %>%
  # the month could start on any day and this function should identify it
  # for the sample, I know January 2021 started on Friday
  mutate(day_cat = while(month is not over)
    if(day == "Fri") {"Fri1"},
    else if(day == "Sat" | day == "Sun") {"Weekend1"},
    else if(day == "Mon") {"Mon1"},
    else if(day == "Tue" | day == "Wed" | day == "Thu") {"Weekdays1"},
    # now we're onto the next Friday of the month
    else if(day == "Fri") {"Fri2"},
    else if(day == "Sat" | day == "Sun") {"Weekend2"},
    else if(day == "Mon") {"Mon2"},
    else if(day == "Tue" | day == "Wed" | day == "Thu") {"Weekdays2"},
    ...
    ...
    # reached the end of month
  ) %>%
  mutate(day_cat = factor(day_cat, levels = c("Mon", "Weekdays", "Fri", "Weekend")))
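So the categories are Mon = {Mon}; Weekdays = {Tue, Wed, Thu}; Fri = {Fri}; Weekend = {Sat, Sun}. In addition, I want to number these categories in the day_cat variable as Mon1, Weekdays1, Fri1, Weekend1, Mon2, Weekdays2, Fri2, Weekend2, Mon3, and so on (assuming the month starts on a Monday).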
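The grouping described above, before any numbering, could be sketched roughly as follows. This is only an illustration in base R: `day_group` is a hypothetical helper name, and it uses the numeric weekday from `format(x, "%u")` (1 = Monday through 7 = Sunday) instead of the abbreviated names, on the assumption that locale-independent matching is preferable.

```r
# Hypothetical helper: map each date to one of the four groups.
# "%u" gives the ISO weekday number (1 = Monday ... 7 = Sunday),
# so the result does not depend on the locale's weekday names.
day_group <- function(dates) {
  wd <- as.integer(format(dates, "%u"))
  groups <- c("Mon", "Weekdays", "Weekdays", "Weekdays", "Fri", "Weekend", "Weekend")
  factor(groups[wd], levels = c("Mon", "Weekdays", "Fri", "Weekend"))
}

# One full Monday-to-Sunday week from the January 2021 sample
one_week <- seq(as.Date("2021-01-04"), as.Date("2021-01-10"), by = "day")
print(day_group(one_week))
```

The function accepts both `Date` and `POSIXct` vectors, since `format()` handles either class.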
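The levels of the day_cat variable should follow that same order (for plotting purposes).
If a month starts on a Wednesday, day_cat would assign only that Wednesday and Thursday (the following day) to "Weekdays1". If the month ends on a Saturday, day_cat would assign only that Saturday to "Weekend4" or "Weekend5", whichever it happens to be.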
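One possible way to get the numbering and the partial-week behaviour described here is to treat each uninterrupted stretch of the same group as a "run" and count the runs per group. The following is a minimal base R sketch, not a definitive solution: `day_category` is a hypothetical helper name, and it assumes the input dates are sorted, consecutive, and all within a single calendar month.

```r
# Hypothetical helper: numbered day categories such as "Fri1", "Weekend1", "Mon1", ...
# Assumes `dates` is sorted, has no gaps, and covers (part of) one calendar month.
day_category <- function(dates) {
  # ISO weekday number (1 = Monday ... 7 = Sunday), independent of locale
  wd <- as.integer(format(dates, "%u"))
  group <- c("Mon", "Weekdays", "Weekdays", "Weekdays", "Fri", "Weekend", "Weekend")[wd]
  # A new run starts wherever the group differs from the previous day's group
  new_run <- c(TRUE, group[-1] != group[-length(group)])
  # Count runs separately within each group: 1st Fri run, 2nd Fri run, ...
  run_no <- ave(as.integer(new_run), group, FUN = cumsum)
  labels <- paste0(group, run_no)
  # Levels follow order of first appearance, i.e. chronological order
  factor(labels, levels = unique(labels))
}

jan <- day_category(seq(as.Date("2021-01-01"), as.Date("2021-01-31"), by = "day"))
sep <- day_category(seq(as.Date("2021-09-01"), as.Date("2021-09-30"), by = "day"))
jul <- day_category(seq(as.Date("2021-07-01"), as.Date("2021-07-31"), by = "day"))

print(head(sep, 3))  # September 2021 starts on a Wednesday
print(tail(jul, 1))  # July 2021 ends on a Saturday
```

Because a partial week at the start or end of the month simply forms a shorter run, no special handling is needed for months that begin mid-week. For data spanning several months, the helper would presumably have to be applied once per month (for example inside a grouped `mutate()`), which is not shown here.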
【Discussion】:
Tags: r function date dplyr timestamp