【Posted】: 2015-10-07 18:41:52
【Question】:
I have a sample data frame that I am working with:
Datetime <- c("2015-09-29 08:22:00", "2015-09-29 09:45:00", "2015-09-29 09:53:00", "2015-09-29 10:22:00", "2015-09-29 10:42:00",
"2015-09-29 11:31:00", "2015-09-29 11:47:00", "2015-09-29 12:45:00", "2015-09-29 13:11:00", "2015-09-29 13:44:00",
"2015-09-29 15:24:00", "2015-09-29 16:28:00", "2015-09-29 20:22:00", "2015-09-29 21:38:00", "2015-09-29 23:34:00")
Measurement <- c("Length","Length","Width","Height","Width","Height","Length","Width","Width","Height","Width","Length",
"Length","Height","Height")
PASSFAIL <- c("PASS","PASS","FAIL","PASS","PASS","FAIL_AVG_HIGH","FAIL#Pts","FAIL","FAIL_AVG_LOW","FAIL","PASS","PASS","FAIL#RNG#HIGH","PASS","FAIL")
df1 <- data.frame(Datetime,Measurement,PASSFAIL)
df1
              Datetime Measurement      PASSFAIL
1  2015-09-29 08:22:00      Length          PASS
2  2015-09-29 09:45:00      Length          PASS
3  2015-09-29 09:53:00       Width          FAIL
4  2015-09-29 10:22:00      Height          PASS
5  2015-09-29 10:42:00       Width          PASS
6  2015-09-29 11:31:00      Height FAIL_AVG_HIGH
7  2015-09-29 11:47:00      Length      FAIL#Pts
8  2015-09-29 12:45:00       Width          FAIL
9  2015-09-29 13:11:00       Width  FAIL_AVG_LOW
10 2015-09-29 13:44:00      Height          FAIL
11 2015-09-29 15:24:00       Width          PASS
12 2015-09-29 16:28:00      Length          PASS
13 2015-09-29 20:22:00      Length FAIL#RNG#HIGH
14 2015-09-29 21:38:00      Height          PASS
15 2015-09-29 23:34:00      Height          FAIL
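I am working on an interesting problem: for each day, find the failure rate of each measurement in the 12AM-12PM window and in the 12PM-12AM (next day) window.
Note: in df1, any value in the PASSFAIL column that contains FAIL counts as a failure.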
Fail Rate = (Number of Fails)/(Number of Fails + Number of Pass)
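As a minimal sketch of the rule and formula above, assuming base R only: a failure flag can be derived with grepl(), and the rate is the number of flagged values over the total. The names status, is_fail, fail_rate and total are hypothetical; status holds the three Length results recorded before noon in df1.

```r
# Length results recorded before noon in df1 (rows 1, 2 and 7)
status <- c("PASS", "PASS", "FAIL#Pts")

# Any value containing the literal text "FAIL" counts as a failure
is_fail <- grepl("FAIL", status, fixed = TRUE)

# Fail Rate = fails / (fails + passes); every value is one or the other,
# so the denominator is simply the number of observations
total <- length(status)
fail_rate <- sum(is_fail) / total

round(fail_rate, 2)  # 0.33
total                # 3
```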
The output I want looks like this:
                Datetime FailRate_length Total_length FailRate_Width Total_Width FailRate_Height Total_Height
1 2015-09-29 00:00:00 AM            0.33            3           0.50           2            0.50            2
2 2015-09-29 12:00:00 PM            0.50            2           0.66           3            0.66            3
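I am trying to solve this with the dplyr and data.table packages, but I cannot figure out how to split the time intervals in df1 so that I get a df2 with 2 values -> 12AM (the first 7 observations of df1) & 12PM (the next 8 observations of df1). Can someone help me with this?
【Discussion】:
Tags: r data.table dplyr reshape2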
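One possible way to make the split, sketched here in base R rather than dplyr or data.table: label each row with the start of its half-day window based on the hour, then summarise per window and measurement with tapply(). The column names Period and Fail, the result names rate and total, and the UTC time zone are assumptions made for illustration, not part of the original data.

```r
Datetime <- c("2015-09-29 08:22:00", "2015-09-29 09:45:00", "2015-09-29 09:53:00",
              "2015-09-29 10:22:00", "2015-09-29 10:42:00", "2015-09-29 11:31:00",
              "2015-09-29 11:47:00", "2015-09-29 12:45:00", "2015-09-29 13:11:00",
              "2015-09-29 13:44:00", "2015-09-29 15:24:00", "2015-09-29 16:28:00",
              "2015-09-29 20:22:00", "2015-09-29 21:38:00", "2015-09-29 23:34:00")
Measurement <- c("Length", "Length", "Width", "Height", "Width", "Height", "Length",
                 "Width", "Width", "Height", "Width", "Length", "Length", "Height",
                 "Height")
PASSFAIL <- c("PASS", "PASS", "FAIL", "PASS", "PASS", "FAIL_AVG_HIGH", "FAIL#Pts",
              "FAIL", "FAIL_AVG_LOW", "FAIL", "PASS", "PASS", "FAIL#RNG#HIGH",
              "PASS", "FAIL")
df1 <- data.frame(Datetime, Measurement, PASSFAIL, stringsAsFactors = FALSE)

# Parse the timestamps (the time zone is an assumption; the data does not state one)
dt <- as.POSIXct(df1$Datetime, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Label each row with the start of its half-day window: 00:00 or 12:00 of that date
hour <- as.integer(format(dt, "%H"))
df1$Period <- paste(format(dt, "%Y-%m-%d"),
                    ifelse(hour < 12, "00:00:00", "12:00:00"))

# Any PASSFAIL value containing "FAIL" counts as a failure
df1$Fail <- grepl("FAIL", df1$PASSFAIL, fixed = TRUE)

# Rows are windows, columns are measurements
rate  <- tapply(df1$Fail, list(df1$Period, df1$Measurement), mean)
total <- tapply(df1$Fail, list(df1$Period, df1$Measurement), length)

round(rate, 2)
total
```

The two matrices hold the same numbers as the desired wide table, apart from rounding (2/3 prints as 0.67 rather than 0.66); they could be combined column by column if the interleaved FailRate/Total layout is needed.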