Min and max value based on another column and combine those in R

【Question title】: Min and max value based on another column and combine those in R
【Posted】: 2020-01-12 04:19:14
【Question】:

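I have a while-loop function that sets `algorithm_column` to 1, working down from the highest values in the `percent` column, until a certain total percentage (around 90%) is reached. The remaining rows that were not selected get a 0 in `algorithm_column` (see "Create while loop function that takes next largest value untill condition is met").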

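Based on what the loop function found, I want to show the minimum and maximum time in the `timeinterval` column. The minimum is where the 1s start and the maximum is the last row with a 1; rows with 0 are out of range. In the end I want to create a time interval from those two values.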

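So, given the data below, I want to create another column, say `total_time`, calculated from the minimum time 09:00 (where the 1s start in `algorithm_column`) to 11:15. That would add a time interval of 02:15 hours to the `total_time` column.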

algorithm
#    pc4 timeinterval stops percent idgroup algorithm_column
#1  5464     08:45:00     1  1.3889       1                0
#2  5464     09:00:00     5  6.9444       2                1
#3  5464     09:15:00     8 11.1111       3                1
#4  5464     09:30:00     7  9.7222       4                1
#5  5464     09:45:00     5  6.9444       5                1
#6  5464     10:00:00    10 13.8889       6                1
#7  5464     10:15:00     6  8.3333       7                1
#8  5464     10:30:00     4  5.5556       8                1
#9  5464     10:45:00     7  9.7222       9                1
#10 5464     11:00:00     6  8.3333      10                1
#11 5464     11:15:00     5  6.9444      11                1
#12 5464     11:30:00     8 11.1111      12                0

I have multiple `pc4` groups, so it should look at each group and calculate `total_time` for each group separately.

I have this function, but I am a bit stuck on whether it is what I need.

test <- function(x) {
  ind <- x[["algorithm$algorithm_column"]] == 0
  Mx <- max(x[["timeinterval"]][ind], na.rm = TRUE);
  ind <- x[["algorithm$algorithm_column"]] == 1
  Mn <- min(x[["timeinterval"]][ind], na.rm = TRUE);
  list(Mn, Mx)  ## or return(list(Mn, Mx))
}

test(algorithm)

【Comments】:

I don't understand your data structure: are you passing a list as `x`, with the data.frame `algorithm` as one of its members?

Tags: r max conditional-statements min

【Solution 1】:

Here is a dplyr solution.

library(dplyr)

algorithm %>%
  mutate(tmp = cumsum(c(0, diff(algorithm_column) != 0))) %>%
  filter(algorithm_column == 1) %>%
  group_by(pc4, tmp) %>%
  summarise(first = first(timeinterval),
            last = last(timeinterval)) %>%
  select(-tmp)
## A tibble: 1 x 3
## Groups:   pc4 [1]
#    pc4 first    last    
#  <int> <fct>    <fct>   
#1  5464 09:00:00 11:15:00
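
The `tmp` column in the pipeline works as a run identifier: `diff(algorithm_column) != 0` marks every position where the value changes, and `cumsum()` turns those marks into a counter that goes up by one at each change. A small illustration on a made-up vector (`x`, `changed` and `run_id` are hypothetical names, not part of the question's data):

```r
# Hypothetical 0/1 vector containing two separate runs of 1s
x <- c(0, 1, 1, 1, 0, 0, 1, 1)

# TRUE wherever an element differs from the one before it
changed <- diff(x) != 0

# Prepend 0 for the first element, then count the changes seen so far
run_id <- cumsum(c(0, changed))

# Rows that share a run_id belong to the same uninterrupted run
data.frame(x, run_id)
```

One consequence worth keeping in mind: because the result is grouped by `pc4` and `tmp`, a `pc4` group whose 1s are interrupted by a 0 would presumably produce one row per run rather than a single row for the whole group.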

Data.

algorithm <- read.table(text = "
    pc4 timeinterval stops percent idgroup algorithm_column
1  5464     08:45:00     1  1.3889       1                0
2  5464     09:00:00     5  6.9444       2                1
3  5464     09:15:00     8 11.1111       3                1
4  5464     09:30:00     7  9.7222       4                1
5  5464     09:45:00     5  6.9444       5                1
6  5464     10:00:00    10 13.8889       6                1
7  5464     10:15:00     6  8.3333       7                1
8  5464     10:30:00     4  5.5556       8                1
9  5464     10:45:00     7  9.7222       9                1
10 5464     11:00:00     6  8.3333      10                1
11 5464     11:15:00     5  6.9444      11                1
12 5464     11:30:00     8 11.1111      12                0
", header = TRUE)
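
The answer returns the first and last flagged time per `pc4`, but not the `total_time` the question asks for. A minimal base R sketch of that last step follows, assuming all times fall within a single day and that the interval is measured from the first flagged time to the last flagged time, as in the question's 09:00 to 11:15 example. The helper names `to_seconds` and `fmt_hm` are made up for illustration, and the data frame is rebuilt with only the columns that are needed.

```r
# Sample data from the question, reduced to the relevant columns
algorithm <- data.frame(
  pc4 = 5464,
  timeinterval = c("08:45:00", "09:00:00", "09:15:00", "09:30:00",
                   "09:45:00", "10:00:00", "10:15:00", "10:30:00",
                   "10:45:00", "11:00:00", "11:15:00", "11:30:00"),
  algorithm_column = c(0, rep(1, 10), 0)
)

# Hypothetical helper: "HH:MM:SS" -> seconds since midnight
# (as.character() makes this work for factor columns as well)
to_seconds <- function(x) {
  parts <- strsplit(as.character(x), ":", fixed = TRUE)
  vapply(parts, function(p) sum(as.numeric(p) * c(3600, 60, 1)), numeric(1))
}

# Hypothetical helper: seconds -> "HH:MM"
fmt_hm <- function(s) {
  sprintf("%02d:%02d", as.integer(s %/% 3600), as.integer((s %% 3600) %/% 60))
}

# Keep only the rows flagged by the algorithm
flagged <- algorithm[algorithm$algorithm_column == 1, ]
flagged$secs <- to_seconds(flagged$timeinterval)

# Earliest and latest flagged time within each pc4 group
first_secs <- tapply(flagged$secs, flagged$pc4, min)
last_secs  <- tapply(flagged$secs, flagged$pc4, max)

result <- data.frame(
  pc4        = names(first_secs),
  first      = fmt_hm(first_secs),
  last       = fmt_hm(last_secs),
  total_time = fmt_hm(last_secs - first_secs),
  row.names  = NULL
)
result
```

Note that taking the minimum and maximum per group spans any 0 rows that sit between two runs of 1s; if each run should be reported separately, the run identifier from the dplyr answer would be needed as an additional grouping key.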
