[Posted]: 2016-02-15 07:12:17
[Question]:
I have 2 data frames like this:
df1
ID <- c("A","B","A","C","C","B","B","A")
StartDatetime <- c("2015-09-29 00:00:13", "2015-09-29 05:55:50", "2015-09-29 11:45:14", "2015-09-29 15:24:00",
"2015-09-29 17:24:12", "2015-09-29 21:34:31", "2015-09-29 22:22:22", "2015-09-29 23:38:22")
EndDatetime <- c("2015-09-29 00:13:56", "2015-09-29 06:13:50", "2015-09-29 12:23:14", "2015-09-29 15:58:00",
"2015-09-29 17:58:17", "2015-09-29 22:06:31", "2015-09-29 22:52:28", "2015-09-29 23:55:22")
MEASUREMENT <- c("Length","Length","Width","Length","Width","Height","Length","Height")
df1 <- data.frame(ID,StartDatetime,EndDatetime,MEASUREMENT)
df2
ID <- c("A","B","A","C","C","B","B")
MStart <- c("09/29/2015 00:02:13", "09/29/2015 05:56:50", "09/30/2015 11:55:14", "09/29/2015 15:33:00",
"09/29/2015 17:28:12", "09/29/2015 21:30:31", "09/29/2015 22:26:22")
MEnd <- c("09/29/2015 00:11:12", "09/29/2015 06:55:50", "09/30/2015 11:54:14", "09/29/2015 15:47:00",
"09/29/2015 17:44:12", "09/29/2015 22:02:31", "09/29/2015 22:44:22")
Measurement <- c("Length","Length","Width","Length","Width","Height","Length")
df2 <- data.frame(ID,MStart,MEnd,Measurement)
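I'm trying to solve an interesting problem: for each ID and measurement, check whether MStart and MEnd in df2 fall within the StartDatetime to EndDatetime range in df1. The logic should return: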
TRUE if (MStart & MEnd) **is within** (StartDatetime & EndDatetime)
FALSE if (MStart & MEnd) **is not within** (StartDatetime & EndDatetime)
My desired output is df3, which contains all the columns from df1 plus one added column holding TRUE or FALSE.
df3
ID StartDatetime EndDatetime MEASUREMENT True_False
1 A 2015-09-29 00:00:13 2015-09-29 00:13:56 Length TRUE
2 B 2015-09-29 05:55:50 2015-09-29 06:13:50 Length FALSE
3 A 2015-09-29 11:45:14 2015-09-29 12:23:14 Width FALSE
4 C 2015-09-29 15:24:00 2015-09-29 15:58:00 Length TRUE
5 C 2015-09-29 17:24:12 2015-09-29 17:58:17 Width TRUE
6 B 2015-09-29 21:34:31 2015-09-29 22:06:31 Height FALSE
7 B 2015-09-29 22:22:22 2015-09-29 22:52:28 Length TRUE
8 A 2015-09-29 23:38:22 2015-09-29 23:55:22 Height FALSE
I get this error when trying to convert the date format of df2, and I can't get past it.
**df2$MStart <- as.POSIXct(df2$MStart,"%Y-%m-%d %H:%M:%S")**
Error in as.POSIXlt.character(as.character(x), ...) :
character string is not in a standard unambiguous format
Please guide me on how to solve this. I'm trying to use dplyr or data.table for it, but I don't know how to express the logic with datetimes.
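A note on the error above, offered as a sketch rather than a full answer: the second positional argument of `as.POSIXct()` is `tz`, so the string passed there is never used as a format, and the df2 values are in month/day/year order rather than `%Y-%m-%d`. The example below passes `format` by name with a pattern that matches the data. The time zone `"UTC"` is an assumption, since the question does not state one.

```r
# Two of the df2 strings from the question (month/day/year order).
MStart <- c("09/29/2015 00:02:13", "09/30/2015 11:55:14")

# Pass the format by name; positionally it would be taken as `tz`.
# The pattern must match the data: %m/%d/%Y, not %Y-%m-%d.
# tz = "UTC" is an assumption, not something stated in the question.
parsed <- as.POSIXct(MStart, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

print(parsed)
```

If a column was created by `data.frame()` with factors, wrapping it in `as.character()` before parsing keeps the conversion explicit.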
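EDIT: I just edited the question and removed the last row of df2, so it now has only 7 rows. I'd like to handle this case too, because in my larger dataset df1 has more rows than df2, so I also want the unmatched rows from df1 returned with FALSE.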
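The check described above, including the unmatched-row case, could be sketched in base R roughly as follows. This is one possible approach under stated assumptions, not the only way to do it: rows are matched on ID and measurement, the interval boundaries are treated as inclusive, the time zone is taken to be `"UTC"`, and the helper names `row_id`, `joined`, and `within` are made up for illustration.

```r
df1 <- data.frame(
  ID = c("A","B","A","C","C","B","B","A"),
  StartDatetime = c("2015-09-29 00:00:13", "2015-09-29 05:55:50", "2015-09-29 11:45:14", "2015-09-29 15:24:00",
                    "2015-09-29 17:24:12", "2015-09-29 21:34:31", "2015-09-29 22:22:22", "2015-09-29 23:38:22"),
  EndDatetime = c("2015-09-29 00:13:56", "2015-09-29 06:13:50", "2015-09-29 12:23:14", "2015-09-29 15:58:00",
                  "2015-09-29 17:58:17", "2015-09-29 22:06:31", "2015-09-29 22:52:28", "2015-09-29 23:55:22"),
  MEASUREMENT = c("Length","Length","Width","Length","Width","Height","Length","Height"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  ID = c("A","B","A","C","C","B","B"),
  MStart = c("09/29/2015 00:02:13", "09/29/2015 05:56:50", "09/30/2015 11:55:14", "09/29/2015 15:33:00",
             "09/29/2015 17:28:12", "09/29/2015 21:30:31", "09/29/2015 22:26:22"),
  MEnd = c("09/29/2015 00:11:12", "09/29/2015 06:55:50", "09/30/2015 11:54:14", "09/29/2015 15:47:00",
           "09/29/2015 17:44:12", "09/29/2015 22:02:31", "09/29/2015 22:44:22"),
  Measurement = c("Length","Length","Width","Length","Width","Height","Length"),
  stringsAsFactors = FALSE
)

# Parse both tables with formats that match their strings (UTC is assumed).
df1$StartDatetime <- as.POSIXct(df1$StartDatetime, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
df1$EndDatetime   <- as.POSIXct(df1$EndDatetime,   format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
df2$MStart        <- as.POSIXct(df2$MStart,        format = "%m/%d/%Y %H:%M:%S", tz = "UTC")
df2$MEnd          <- as.POSIXct(df2$MEnd,          format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

# Hypothetical helper column so each df1 row can be identified after the join.
df1$row_id <- seq_len(nrow(df1))

# Left join: all.x = TRUE keeps df1 rows that have no partner in df2.
joined <- merge(df1, df2,
                by.x = c("ID", "MEASUREMENT"),
                by.y = c("ID", "Measurement"),
                all.x = TRUE)

# TRUE only when a df2 interval exists and lies inside the df1 interval.
joined$within <- !is.na(joined$MStart) &
  joined$MStart >= joined$StartDatetime &
  joined$MEnd   <= joined$EndDatetime

# A df1 row counts as TRUE if any of its candidate df2 intervals fits.
hit <- tapply(joined$within, joined$row_id, any)

df3 <- df1[, c("ID", "StartDatetime", "EndDatetime", "MEASUREMENT")]
df3$True_False <- unname(as.vector(hit[as.character(df1$row_id)]))

print(df3)
```

Collapsing with `any()` per `row_id` is there because an ID and measurement pair can occur more than once in both tables (B/Length in the sample data), so the join produces several candidate pairs for one df1 row.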
[Comments]:
-
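Thank you for organizing your question with reproducible data and explaining what you want, what you've tried, and your desired output. Is it Friday the 13th or something? Oh, yes :)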
-
LOL, yes, it really is :D