【Posted】: 2017-11-24 22:33:03
【Question】:
My data looks like this:
library(plyr)
dates<-data.frame(datecol=as.POSIXct(c(
"2010-04-03 03:02:38 UTC",
"2010-04-03 03:03:14 UTC",
"2010-04-20 03:05:52 UTC",
"2010-04-20 03:07:42 UTC",
"2010-04-21 03:09:38 UTC",
"2010-04-21 03:10:14 UTC",
"2010-04-21 03:12:52 UTC",
"2010-04-23 03:13:42 UTC",
"2010-04-23 03:15:42 UTC",
"2010-04-23 03:16:38 UTC",
"2010-04-23 03:18:14 UTC",
"2010-04-24 03:21:52 UTC",
"2010-04-24 03:22:42 UTC",
"2010-04-24 03:24:19 UTC",
"2010-04-24 03:25:19 UTC"
), tz = "UTC"), x = cumsum(runif(15) * 10), y = cumsum(runif(15) * 20))
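I want to group my data into 5-day intervals, so that all points that are 5 days or less apart end up in the same group. I tried the suggestion from here: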
gr<-ddply(dates,.(cut(datecol,"5 day",include.lowest = TRUE)),"[")
But for some reason I end up with 3 groups instead of 2, and the points from 04/21 and 04/23 fall into different groups even though they are less than 5 days apart.
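To see where the three groups come from, here is a small sketch that tabulates the bins `cut()` produces for one timestamp per day of my data. It assumes the timestamps are in UTC, and `bin_counts` is a name introduced only for this illustration.

```r
# One timestamp per distinct day from the data above (UTC assumed)
datecol <- as.POSIXct(c("2010-04-03 03:02:38", "2010-04-20 03:05:52",
                        "2010-04-21 03:09:38", "2010-04-23 03:13:42",
                        "2010-04-24 03:21:52"), tz = "UTC")

# cut() lays down fixed 5-day bins that start at midnight of the earliest day
bins <- cut(datecol, "5 day")

# Drop the empty bins and count the rows in each remaining bin
bin_counts <- table(droplevels(bins))
print(bin_counts)
```

The bins are anchored at the earliest date (04-03, 04-08, 04-13, 04-18, 04-23), so 04-21 and 04-23 sit on opposite sides of the 04-23 break regardless of how close together they are.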
This is what I want:
   group             datecol         x          y
 1     1 2010-04-03 03:02:38  8.112423   4.790036
 2     1 2010-04-03 03:03:14 11.184709  22.903475
 3     2 2010-04-20 03:05:52 17.306835  32.286891
 4     2 2010-04-20 03:07:42 24.071488  38.941709
 5     2 2010-04-21 03:09:38 26.451493  48.378477
 6     2 2010-04-21 03:10:14 33.090645  53.148149
 7     2 2010-04-21 03:12:52 38.536416  64.346574
 8     2 2010-04-23 03:13:42 40.911074  79.419002
 9     2 2010-04-23 03:15:42 41.977579  89.760210
10     2 2010-04-23 03:16:38 46.838773  95.266709
11     2 2010-04-23 03:18:14 48.367159 112.619969
12     2 2010-04-24 03:01:52 57.470412 113.594423
13     2 2010-04-24 03:02:42 63.202005 123.653370
14     2 2010-04-24 03:04:19 65.615348 137.184153
15     2 2010-04-24 03:25:19 75.177633 137.559003
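One possible way to produce the grouping above is to start a new group whenever the gap between consecutive timestamps exceeds 5 days. This is only a sketch under one interpretation of "5 days or less apart": each point is compared with the point before it, so a chain of closely spaced points can span more than 5 days in total. The names `gap_days` and `group` are introduced for this illustration, UTC is assumed, and only the `datecol` column is reproduced.

```r
# Timestamps from the input data (UTC assumed)
dates <- data.frame(datecol = as.POSIXct(c(
  "2010-04-03 03:02:38", "2010-04-03 03:03:14", "2010-04-20 03:05:52",
  "2010-04-20 03:07:42", "2010-04-21 03:09:38", "2010-04-21 03:10:14",
  "2010-04-21 03:12:52", "2010-04-23 03:13:42", "2010-04-23 03:15:42",
  "2010-04-23 03:16:38", "2010-04-23 03:18:14", "2010-04-24 03:21:52",
  "2010-04-24 03:22:42", "2010-04-24 03:24:19", "2010-04-24 03:25:19"
), tz = "UTC"))

# The gaps are only meaningful when the rows are in time order
dates <- dates[order(dates$datecol), , drop = FALSE]

# Gap to the previous row, expressed in days
gap_days <- as.numeric(diff(dates$datecol), units = "days")

# The first row opens group 1; every gap longer than 5 days opens a new group
dates$group <- cumsum(c(TRUE, gap_days > 5))

print(table(dates$group))
```

Unlike `cut()`, this does not depend on where fixed bin boundaries happen to fall; the only large gap in the data is between 04-03 and 04-20.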
【Discussion】:
Tags: r time time-series grouping plyr