【Posted】: 2018-02-27 07:32:15
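【Question】:
The raw data looks like this. I want to sort it by visitor and time, compute the time difference between consecutive rows, and then save the result to a new file.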
  visitor v_time         payment items
1 Jack    1/2/2018 16:07      35     3
2 Jack    1/2/2018 16:09     160     1
3 David   1/2/2018 16:12      25     2
4 Kate    1/2/2018 16:16       3     3
5 David   1/2/2018 16:21      25     5
6 Jack    1/2/2018 16:32      85     5
7 Kate    1/2/2018 16:33     639     3
8 Jack    1/2/2018 16:55       6     2
Grouping and sorting work fine, but computing the time difference fails, and so does saving the file.
library(dplyr)

visitor <- c("Jack", "Jack", "David", "Kate", "David", "Jack", "Kate", "Jack")
v_time <- c("1/2/2018 16:07", "1/2/2018 16:09", "1/2/2018 16:12", "1/2/2018 16:16",
            "1/2/2018 16:21", "1/2/2018 16:32", "1/2/2018 16:33", "1/2/2018 16:55")
payment <- c(35, 160, 25, 3, 25, 85, 639, 6)
items <- c(3, 1, 2, 3, 5, 5, 3, 2)
df <- data.frame(visitor, v_time, payment, items)

df %>%
  arrange(visitor, v_time) %>%
  group_by(visitor) %>%
  mutate(diff = strptime(v_time, "%d/%m/%Y %H:%M") - lag(strptime(v_time, "%d/%m/%Y %H:%M")),
         diff_secs = as.numeric(diff, units = 'secs'))

write.csv(df, "C:/output.csv", row.names = F)
What is my mistake, and what is the correct way to do this?
# A tibble: 8 x 6
# Groups: visitor [3]
visitor v_time payment items diff diff_secs
<fct> <fct> <dbl> <dbl> <time> <dbl>
1 David 1/2/2018 16:12 25.0 2.00 NA NA
2 David 1/2/2018 16:21 25.0 5.00 NA NA
3 Jack 1/2/2018 16:07 35.0 3.00 NA NA
4 Jack 1/2/2018 16:09 160 1.00 NA NA
5 Jack 1/2/2018 16:32 85.0 5.00 NA NA
6 Jack 1/2/2018 16:55 6.00 2.00 NA NA
7 Kate 1/2/2018 16:16 3.00 3.00 NA NA
8 Kate 1/2/2018 16:33 639 3.00 NA NA
【Discussion】:
-
What result do you expect?
-
Just add default = strptime(v_time, "%d/%m/%Y %H:%M")[1] to the lag() call.
-
@Onyambu, I expect the results to appear in the diff and diff_secs columns, and I expect those 2 new columns to be in the new file that gets saved.
-
Use as.POSIXct instead of strptime for the conversion.
-
df %>% group_by(visitor) %>% mutate(diff = c(0, diff(strptime(v_time, "%d/%m/%Y %H:%M"))))
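-
Putting the comments above together, here is a minimal sketch of one way this could be done; it is not the only fix. It relies on two facts: strptime returns a list-based POSIXlt object, while as.POSIXct returns an ordinary vector that is easier to use inside mutate; and the pipeline in the question is never assigned to anything, so write.csv(df, ...) saves the original df without the new columns. The names result, out_file and diff_mins are made up for illustration, the "UTC" time zone is an assumption, and the file goes to a temporary directory instead of C:/output.csv.

```r
library(dplyr)

visitor <- c("Jack", "Jack", "David", "Kate", "David", "Jack", "Kate", "Jack")
v_time <- c("1/2/2018 16:07", "1/2/2018 16:09", "1/2/2018 16:12", "1/2/2018 16:16",
            "1/2/2018 16:21", "1/2/2018 16:32", "1/2/2018 16:33", "1/2/2018 16:55")
payment <- c(35, 160, 25, 3, 25, 85, 639, 6)
items <- c(3, 1, 2, 3, 5, 5, 3, 2)
df <- data.frame(visitor, v_time, payment, items, stringsAsFactors = FALSE)

result <- df %>%
  # Convert once to POSIXct so sorting and subtraction operate on real times.
  # tz = "UTC" is an assumption; use whatever zone the data was recorded in.
  mutate(v_time = as.POSIXct(v_time, format = "%d/%m/%Y %H:%M", tz = "UTC")) %>%
  arrange(visitor, v_time) %>%
  group_by(visitor) %>%
  mutate(
    # The first row of each visitor has no previous visit, so it stays NA.
    diff_secs = as.numeric(difftime(v_time, lag(v_time), units = "secs")),
    diff_mins = diff_secs / 60  # hypothetical extra column, for readability
  ) %>%
  ungroup()

# Save the assigned result (not df), so the new columns end up in the file.
out_file <- file.path(tempdir(), "output.csv")  # hypothetical output path
write.csv(result, out_file, row.names = FALSE)
```

The NA in the first row of each visitor is deliberate. If 0 is preferred there, as in the c(0, diff(...)) comment, the NA values can be replaced afterwards, for example with dplyr's coalesce(diff_secs, 0).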