【Question title】:How to filter row by lowest value in a column using a for loop in R
【Posted】:2020-10-24 04:28:57
【Question description】:

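This isn't elegant, but for each file I want to keep the row where dvdt first meets or exceeds 15. I first filter each file for dvdt values >= 15. Then I try to filter this new data frame for the row with the minimum time value. The problem is that min(time) returns the global minimum across all files, whereas I want the lowest time value within each file. Any help would be greatly appreciated!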

df <- structure(list(file = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L), .Label = c("19509002.abf", "19509007.abf"
), class = "factor"), time = c(4.800000191, 4.849999905, 4.900000095, 
4.949999809, 5, 5.050000191, 5.099999905, 5.150000095, 5.199999809, 
5.25, 5.300000191, 5.349999905, 5.400000095, 5.449999809, 5.5, 
4.849999905, 4.900000095, 4.949999809, 5, 5.050000191, 5.099999905, 
5.150000095, 5.199999809, 5.25, 5.300000191, 5.349999905, 5.400000095, 
5.449999809), V = c(-34.8815918, -29.96826172, -23.65112305, 
-16.44897461, -7.843017578, 3.234863281, 15.86914063, 27.6184082, 
37.109375, 44.18945313, 49.37744141, 52.94799805, 55.41992188, 
57.00683594, 57.80029297, -36.28540039, -31.92138672, -24.78027344, 
-16.3269043, -6.683349609, 5.310058594, 18.89038086, 31.21948242, 
40.67993164, 47.24121094, 51.63574219, 54.32128906, 55.9387207
), dvdt = c(47.6074219, 98.2666016, 126.342773, 144.042969, 172.119141, 
221.557617, 252.685547, 234.985352, 189.819336, 141.601563, 103.759766, 
71.4111328, 49.4384766, 31.7382813, 15.8691406, 27.4658203, 87.2802734, 
142.822266, 169.067383, 192.871094, 239.868164, 271.606445, 246.582031, 
189.208984, 131.225586, 87.890625, 53.7109375, 32.3486328)), row.names = c(NA, 
28L), class = "data.frame")

vthresh <- data.frame()
for (i in unique(df$file)){
  vthresh = rbind(vthresh, df %>% filter(file == i, time == min(time)))
}

【Question comments】:

Tags: r


【Solution 1】:
# filter dvdt values >= 15
dfsub <- subset(df, dvdt >= 15)

# identify the lowest time value within each file
aggregate(dfsub$time, by = list(dfsub$file), min)

This gives the following output:

       Group.1    x
1 19509002.abf 4.80
2 19509007.abf 4.85
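
As for why the loop in the question returned the global minimum: in `filter(file == i, time == min(time))`, both conditions are evaluated against every row of the data frame, so `min(time)` is computed across all files before the `file == i` condition has any effect. A minimal sketch of a loop that subsets to one file first and only then looks for the earliest row is shown below. The `toy` data frame and its values are made up for illustration and are not the recordings from the question.

```r
# Toy data (made-up values, NOT the question's recordings)
toy <- data.frame(
  file = c("a.abf", "a.abf", "a.abf", "b.abf", "b.abf", "b.abf"),
  time = c(1.0, 1.1, 1.2, 2.0, 2.1, 2.2),
  dvdt = c(5, 20, 40, 3, 9, 30)
)

# Step 1: keep only the rows at or above the threshold
toy_sub <- subset(toy, dvdt >= 15)

# Step 2: subset to a single file first, then take that file's earliest row
vthresh <- data.frame()
for (i in unique(toy_sub$file)) {
  one_file <- toy_sub[toy_sub$file == i, ]
  vthresh <- rbind(vthresh, one_file[which.min(one_file$time), ])
}

vthresh
```

Because `which.min()` is applied to `one_file` rather than to the full data frame, each iteration sees only the times of the current file, and every column of the selected row is kept.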

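【Comments】:

  • Thanks, that works great. If I want to output the other row information (such as V), how would I do that?
  • An elegant way to add the other row information is to use merge, as follows: `df_sub <- subset(df, dvdt >= 15); df_agg <- aggregate(df_sub$time, by = list(df_sub$file), min); colnames(df_agg) <- c('file', 'time'); merge(df_agg, df_sub)`
  • It gives you the following output:
    ```
              file time         V     dvdt
    1 19509002.abf 4.80 -34.88159 47.60742
    2 19509007.abf 4.85 -36.28540 27.46582
    ```
  • Sorry, my reputation is too low for my upvote to be recorded.
  • I see. I thought you could mark my answer as accepted (the green check mark), but I'm also new here, so maybe that isn't an option. Anyway, glad I could help :)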
【Solution 2】:

I built on Marc's answer, and the code below works!

df_sub <- subset(df, dvdt >= 15)
df_agg <- aggregate(df_sub$time, by = list(df_sub$file), min) 
colnames(df_agg) <- c('file', 'time')
vthresh <- merge(df_sub, df_agg, by=c("file","time"))
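
The same per-file row can probably also be picked without aggregate and merge, by sorting on file and time and keeping the first row seen for each file. This is only a sketch of that alternative, not the method from the answers above. The `toy` data frame is made up for illustration, and its rows are deliberately out of time order.

```r
# Toy data (made-up values, NOT the question's recordings), unsorted on purpose
toy <- data.frame(
  file = c("a.abf", "a.abf", "a.abf", "b.abf", "b.abf", "b.abf"),
  time = c(1.2, 1.0, 1.1, 2.0, 2.1, 2.2),
  dvdt = c(40, 5, 20, 3, 30, 9)
)

# Keep only the rows at or above the threshold
toy_sub <- subset(toy, dvdt >= 15)

# Sort by file, then by time within each file
toy_sorted <- toy_sub[order(toy_sub$file, toy_sub$time), ]

# duplicated() flags every repeat of a file name, so negating it
# keeps only the first (earliest) row of each file
first_rows <- toy_sorted[!duplicated(toy_sorted$file), ]

first_rows
```

One difference to keep in mind: if two rows of a file share the same minimum time, the merge approach returns both of them, while this version returns only one.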

【Comments】:
