How to filter row by lowest value in a column using a for loop in R: answers

【Question title】: How to filter row by lowest value in a column using a for loop in R
【Posted】: 2020-10-24 04:28:57
【Question description】:

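This isn't elegant, but for each file I want to keep the row where dvdt first meets or exceeds 15. I first filter each file for dvdt values >= 15. Then I try to filter the resulting data frame for the row with the minimum time value. The problem is that min(time) returns the global minimum across all files, whereas I want the lowest time value within each file. Any help would be greatly appreciated!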

df <- structure(list(file = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L), .Label = c("19509002.abf", "19509007.abf"
), class = "factor"), time = c(4.800000191, 4.849999905, 4.900000095, 
4.949999809, 5, 5.050000191, 5.099999905, 5.150000095, 5.199999809, 
5.25, 5.300000191, 5.349999905, 5.400000095, 5.449999809, 5.5, 
4.849999905, 4.900000095, 4.949999809, 5, 5.050000191, 5.099999905, 
5.150000095, 5.199999809, 5.25, 5.300000191, 5.349999905, 5.400000095, 
5.449999809), V = c(-34.8815918, -29.96826172, -23.65112305, 
-16.44897461, -7.843017578, 3.234863281, 15.86914063, 27.6184082, 
37.109375, 44.18945313, 49.37744141, 52.94799805, 55.41992188, 
57.00683594, 57.80029297, -36.28540039, -31.92138672, -24.78027344, 
-16.3269043, -6.683349609, 5.310058594, 18.89038086, 31.21948242, 
40.67993164, 47.24121094, 51.63574219, 54.32128906, 55.9387207
), dvdt = c(47.6074219, 98.2666016, 126.342773, 144.042969, 172.119141, 
221.557617, 252.685547, 234.985352, 189.819336, 141.601563, 103.759766, 
71.4111328, 49.4384766, 31.7382813, 15.8691406, 27.4658203, 87.2802734, 
142.822266, 169.067383, 192.871094, 239.868164, 271.606445, 246.582031, 
189.208984, 131.225586, 87.890625, 53.7109375, 32.3486328)), row.names = c(NA, 
28L), class = "data.frame")

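library(dplyr)  # provides %>% and filter()

vthresh <- data.frame()
for (i in unique(df$file)){
  # min(time) here is evaluated over all of df, not just the rows of file i
  vthresh = rbind(vthresh, df %>% filter(file == i, time == min(time)))
}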

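【Question comments】:

Please provide sample data using dput or data.frame. See: stackoverflow.com/q/5963269, minimal reproducible example, and stackoverflow.com/tags/r/info.
It's hard to tell without any sample data, but possibly something like spikes %>% group_by(file) %>% filter(dvdt >= 15) %>% slice_min(time)
Thanks for the feedback, I've added sample data for a minimal example!

Tags: r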

【Solution 1】:

# filter dvdt values >= 15
dfsub <- subset(df, dvdt >= 15)

# identify the lowest time value within each file
aggregate(dfsub$time, by = list(dfsub$file), min)

This gives the following output:

       Group.1    x
1 19509002.abf 4.80
2 19509007.abf 4.85
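
Since the question asks about a for loop, the same per-file minimum can also be obtained while keeping the loop. The following is only a minimal sketch in base R: the data frame `toy` and its values are made up for illustration (they are not the poster's recordings), and the threshold of 15 is taken from the question. The key point is that each file is subset before `min()` is called, so the minimum is computed per file rather than globally.

```r
# Toy data (hypothetical values, not the poster's recordings)
toy <- data.frame(
  file = c("a.abf", "a.abf", "a.abf", "b.abf", "b.abf", "b.abf"),
  time = c(1.0, 1.5, 2.0, 1.2, 1.7, 2.2),
  dvdt = c(5, 20, 40, 3, 10, 30)
)

vthresh <- data.frame()
for (f in unique(toy$file)) {
  # Restrict to one file and to rows at or above the threshold first ...
  sub <- toy[toy$file == f & toy$dvdt >= 15, ]
  # ... so that min() only sees the times of this file
  vthresh <- rbind(vthresh, sub[sub$time == min(sub$time), ])
}
vthresh
```

With this approach all columns of the matching row are kept. If a file has no row at or above the threshold, `min()` is called on an empty vector and emits a warning, so a real script would probably want to skip such files explicitly.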

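【Comments】:

Thanks, that works great - if I want to output the other row information (e.g. V), how would I do that?
An elegant way to add the other row information is to use merge, as follows: df_sub <- subset(df, dvdt >= 15); df_agg <- aggregate(df_sub$time, by = list(df_sub$file), min); colnames(df_agg) <- c('file', 'time'); merge(df_agg, df_sub)
It gives you the following output:
```
          file time         V     dvdt
1 19509002.abf 4.80 -34.88159 47.60742
2 19509007.abf 4.85 -36.28540 27.46582
```
Sorry, my reputation is too low for my upvote to be recorded
I see. I thought you could mark my answer as accepted (green check mark), but I'm also new here, so maybe that's not an option. Anyway, glad I could help :)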

【Solution 2】:

Building on Marc's answer, the code below works!

df_sub <- subset(df, dvdt >= 15)
df_agg <- aggregate(df_sub$time, by = list(df_sub$file), min) 
colnames(df_agg) <- c('file', 'time')
vthresh <- merge(df_sub, df_agg, by=c("file","time"))
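
For comparison, the dplyr pipeline suggested in the question comments should give the same kind of result in one step. This is a hedged sketch rather than a tested replacement for the code above: it assumes dplyr 1.0.0 or later (for `slice_min()`), and the data frame `toy` with its values is invented for illustration.

```r
library(dplyr)

# Toy data (hypothetical values, not the poster's recordings)
toy <- data.frame(
  file = c("a.abf", "a.abf", "a.abf", "b.abf", "b.abf", "b.abf"),
  time = c(1.0, 1.5, 2.0, 1.2, 1.7, 2.2),
  V    = c(-30, -10, 20, -35, -15, 10),
  dvdt = c(5, 20, 40, 3, 10, 30)
)

vthresh <- toy %>%
  filter(dvdt >= 15) %>%      # keep rows at or above the threshold
  group_by(file) %>%          # make later steps operate per file
  slice_min(time, n = 1) %>%  # earliest remaining row in each file
  ungroup()

vthresh
```

Because `slice_min()` returns whole rows, the other columns (such as `V`) come along without a separate merge step. By default it keeps ties, so a file with two rows sharing the same minimum time would yield both.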

【Comments】: