【Question title】: Keeping maximum and minimum values in a logical sequence
【Posted】: 2020-10-21 21:44:34
【Question】:

Consider this data:

df <- structure(list(Date = structure(c(2922, 4018, 5113, 7305, 8035, 
12053, 14975, 16436, 17532, 17897), class = "Date"), HAM = c(1016.89391375364, 
-1269.0910012255, -1097.9927692669, -5069.52785909119, 3168.39687262048, 
-1265.24208195278, -1218.5560466457, 1463.67252927616, 1259.20509267793, 
1267.89637533522), State = c("Expansion", "Contraction", "Contraction", 
"Contraction", "Expansion", "Contraction", "Contraction", "Expansion", 
"Contraction", "Expansion"), sd = c("larger", "smaller", "smaller", 
"smaller", "larger", "smaller", "smaller", "larger", "larger", 
"larger")), row.names = c(NA, -10L), class = "data.frame")

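Taking the maximum value (for "Expansion") and the minimum value (for "Contraction"), I want to keep a logical sequence in which "Expansion" is followed by "Contraction". The expected result is this data frame: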

ndf <- structure(list(Date = structure(c(2922, 7305, 8035, 12053, 16436
), class = "Date"), HAM = c(1016.89391375364, -5069.52785909119, 
3168.39687262048, -1265.24208195278, 1463.67252927616), State = c("Expansion", 
"Contraction", "Expansion", "Contraction", "Expansion"), sd = c("larger", 
"smaller", "larger", "smaller", "larger")), row.names = c(1L, 
4L, 5L, 6L, 8L), class = "data.frame")

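【Comments】:

  • Row 9 of df has State == "Contraction" with sd == "larger" and a positive HAM value. Is that right?
  • Yes, that's right.
  • Why are rows 9 and 10 of df not included in ndf? They seem to meet your criteria.
  • You should post your attempt at solving this.
  • It has to do with the repetition of "sd". I commented on your answer. Sorry for not mentioning it. Thanks for the reply!

Tags: r dataframe if-statement conditional-statements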


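【Solution 1】:

The key is to encode each consecutive run of Expansion/Contraction as a group. I would reach for rle (run-length encoding) to do that. Going by your criteria, rows 9 and 10 of df look like they should be included in the final result.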

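# Run-length encode State so each unbroken run of one State becomes a group
ndf_seq <- rle(df$State)
# One data frame per run
ndf2 <- split(df, rep(seq_along(ndf_seq$lengths), 
                      ndf_seq$lengths))
# Keep the row with the largest |HAM| in each run
ndf2 <- lapply(ndf2, function(x) x[which.max(abs(x$HAM)), ])
ndf2 <- do.call(rbind, ndf2)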

ndf2
#        Date       HAM       State      sd
#1 1978-01-01  1016.894   Expansion  larger
#2 1990-01-01 -5069.528 Contraction smaller
#3 1992-01-01  3168.397   Expansion  larger
#4 2003-01-01 -1265.242 Contraction smaller
#5 2015-01-01  1463.673   Expansion  larger
#6 2018-01-01  1259.205 Contraction  larger
#7 2019-01-01  1267.896   Expansion  larger
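
To make the grouping step more concrete, here is a minimal sketch of how the run lengths turn into group IDs. It uses only the State column from the question; the names `runs` and `run_id` are introduced here for illustration and are not part of the answer's code.

```r
# State column copied from the question's df
State <- c("Expansion", "Contraction", "Contraction", "Contraction",
           "Expansion", "Contraction", "Contraction", "Expansion",
           "Contraction", "Expansion")

runs <- rle(State)   # run-length encoding of consecutive equal values
runs$lengths         # 1 3 1 2 1 1 1

# Repeat each run's index as many times as the run is long
run_id <- rep(seq_along(runs$lengths), runs$lengths)
run_id               # 1 2 2 2 3 4 4 5 6 7
```

There are seven runs, which is why the result above has seven rows: one per run.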

Here is a tidyverse solution as well:

library(dplyr)  # provides %>%, group_by(), filter(); rleid() comes from data.table

df %>% 
  group_by(data.table::rleid(State)) %>% 
  filter(abs(HAM)==max(abs(HAM)))
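
Both versions above pick the row with the largest absolute HAM in each run. The question phrases the rule as the maximum for "Expansion" and the minimum for "Contraction", so a variant that follows that wording literally might look like the sketch below. It is base R only, uses a reduced copy of df (HAM and State), and the helper `pick_extreme` is a hypothetical name introduced here. On this data it selects the same seven rows.

```r
# Reduced copy of the question's df (HAM and State only)
df <- data.frame(
  HAM = c(1016.89391375364, -1269.0910012255, -1097.9927692669,
          -5069.52785909119, 3168.39687262048, -1265.24208195278,
          -1218.5560466457, 1463.67252927616, 1259.20509267793,
          1267.89637533522),
  State = c("Expansion", "Contraction", "Contraction", "Contraction",
            "Expansion", "Contraction", "Contraction", "Expansion",
            "Contraction", "Expansion")
)

# Group ID for each unbroken run of State
runs   <- rle(df$State)
run_id <- rep(seq_along(runs$lengths), runs$lengths)

# Hypothetical helper: max HAM for an Expansion run, min HAM for a Contraction run
pick_extreme <- function(x) {
  if (x$State[1] == "Expansion") x[which.max(x$HAM), ] else x[which.min(x$HAM), ]
}

res <- do.call(rbind, lapply(split(df, run_id), pick_extreme))
res
```

The two rules only differ when a run contains values of mixed sign, which does not happen in this data set.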

【Comments】:

  • Perfect, thank you very much! I forgot to mention that I don't want "sd" to repeat. I just added one line to your code: filter(!duplicated(rleid(sd)))
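
The comment does not show where the extra line was placed, so the following is only a sketch of one way to combine it with the tidyverse answer. One caveat: filter() on a grouped data frame is evaluated within each group, and after the first filter each run holds a single row here, so duplicated() would find nothing to drop. The sketch therefore assumes an ungroup() before the added line. The column name `run`, the ungroup() call, and the final select() are assumptions added for illustration. It needs the dplyr and data.table packages.

```r
library(dplyr)

# Data from the question, rebuilt with data.frame() for brevity
df <- data.frame(
  Date = as.Date(c(2922, 4018, 5113, 7305, 8035, 12053, 14975,
                   16436, 17532, 17897), origin = "1970-01-01"),
  HAM = c(1016.89391375364, -1269.0910012255, -1097.9927692669,
          -5069.52785909119, 3168.39687262048, -1265.24208195278,
          -1218.5560466457, 1463.67252927616, 1259.20509267793,
          1267.89637533522),
  State = c("Expansion", "Contraction", "Contraction", "Contraction",
            "Expansion", "Contraction", "Contraction", "Expansion",
            "Contraction", "Expansion"),
  sd = c("larger", "smaller", "smaller", "smaller", "larger",
         "smaller", "smaller", "larger", "larger", "larger")
)

res <- df %>%
  group_by(run = data.table::rleid(State)) %>%    # one group per run of State
  filter(abs(HAM) == max(abs(HAM))) %>%           # largest |HAM| per run
  ungroup() %>%                                   # assumption: drop grouping first
  filter(!duplicated(data.table::rleid(sd))) %>%  # first row of each run of sd
  select(-run)

res
```

With the grouping removed, the sd values of the seven remaining rows form the runs larger, smaller, larger, smaller, larger (three times), so only the first row of the final run is kept. That leaves rows 1, 4, 5, 6 and 8 of df, matching ndf in the question.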