dplyr's filter not working on lubridate's time formats? Answers

【Question title】: dplyr's filter not working on lubridate's time formats?
【Posted】: 2018-03-03 20:00:52
【Question】:

While trying to answer this question, I ran into a problem using filter from the dplyr package on a lubridate Period column.

Example data:

df <- data.frame(time = ms(c('0:19','1:24','7:53','11:6')), value = 1:4)

Using:

filter(df, time > ms('5:00'))
# or:
filter(df, time > '5M 00S')

gives the wrong output:

   time value
1   53S     3
2 1M 6S     4
Warning message:
In format.data.frame(x, digits = digits, na.encode = FALSE) :
  corrupt data frame: columns will be truncated or padded with NAs

Applying the solution from this answer does not give the correct output either:

> df %>% 
+   mutate(time = format(time, '%M:%S')) %>% 
+   filter(time > '05:00')
    time value
1    19S     1
2 1M 24S     2
3 7M 53S     3
4 11M 6S     4
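
A likely reason the format() attempt returns every row, offered as a reading of the output above and not as a confirmed diagnosis: format() on a Period seems to ignore the '%M:%S' template and returns strings such as "19S", so the filter ends up comparing text, not durations. A minimal sketch using only lubridate (the variable names are just for illustration):

```r
library(lubridate)

p <- ms(c('0:19', '1:24', '7:53', '11:6'))

# format() turns the Period into plain character strings;
# the '%M:%S' template is passed along but does not reshape the text.
txt <- format(p, '%M:%S')
is_text <- is.character(txt)

# Comparing strings is done character by character, so every value
# starting with a digit above "0" sorts after '05:00'.
all_kept <- txt > '05:00'
print(txt)
print(all_kept)
```

Because the comparison is lexical, it says nothing about which periods are actually longer than five minutes.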

The base R approaches, however, do work:

> df[df$time > ms('5:00'), ]
    time value
3 7M 53S     3
4 11M 6S     4

> subset(df, time > ms('5:00'))
    time value
3 7M 53S     3
4 11M 6S     4

Am I doing something wrong in my dplyr approach?

【Comments】:

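To clarify, the problem is that the correct rows (3 and 4) are selected, but the time column has been changed?
Yes, I tried it, and it seems to be interchanging M and S.
Could this be related to tibbles and restrictions on the date structures defined within them?
And there is already a closed issue on GitHub: github.com/tidyverse/dplyr/issues/2520
Opened an issue here. It seems the subsetting goes wrong for this data type, without a warning (at least in my case).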
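
To illustrate what the comments describe: a lubridate Period keeps its seconds in the numeric data part and its minutes in a separate slot. The output in the question is consistent with only the seconds being subset while the minute slot is left at its original length. The sketch below only mimics that effect by hand; it is not dplyr's actual code path, and the names bad_secs and bad_mins are made up for the illustration.

```r
library(lubridate)

p <- ms(c('0:19', '1:24', '7:53', '11:6'))

# Seconds live in the numeric data part, minutes in their own slot.
secs <- p@.Data
mins <- p@minute

# Base R subsetting uses lubridate's `[` method, which subsets
# the data part and the slots together.
keep <- p > ms('5:00')
good <- p[keep]

# Mimic a subset that touches only the data part:
# the seconds of rows 3 and 4 get paired with the minutes of rows 1 and 2.
bad_secs <- secs[keep]
bad_mins <- mins[seq_along(bad_secs)]

print(good)
print(data.frame(minute = bad_mins, second = bad_secs))
```

The mismatched pairs correspond to the "53S" and "1M 6S" values shown in the question.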

Tags: r dplyr lubridate

【Solution 1】:

After trying many different approaches, I arrived at a dplyr-only solution:

df %>% 
  mutate(time = as.numeric(time)) %>% 
  filter(time > as.numeric(ms('5:00'))) %>% 
  mutate(time = ms(paste0(floor(time/60),':',round((time/60 - floor(time/60))*60))))

This produces the correct result:

    time value
1 7M 53S     3
2 11M 6S     4
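
A possibly tidier variant of the same idea, not part of the original answer: lubridate's period_to_seconds() and seconds_to_period() can do the round trip, so the paste0/floor arithmetic is not needed. This is a sketch under the assumption that the column only holds minutes and seconds, as in the example data; res is just an illustrative name.

```r
library(dplyr)
library(lubridate)

df <- data.frame(time = ms(c('0:19', '1:24', '7:53', '11:6')), value = 1:4)

res <- df %>%
  # Work on plain seconds so filter() never has to subset a Period column.
  mutate(time = period_to_seconds(time)) %>%
  filter(time > period_to_seconds(ms('5:00'))) %>%
  # Convert the surviving rows back to Period objects.
  mutate(time = seconds_to_period(time))

print(res)
```

The filter step sees an ordinary numeric column, which is why the subsetting problem from the question does not come up here.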

【Comments】: