【Question title】: Keep only duplicated entries with condition [duplicate]
【Posted】: 2019-03-28 07:15:15
【Question】:

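I'm cleaning a dataset and need to keep only the rows whose value is repeated 4 times (such as "a" and "b" below), but I haven't been able to do it. Can anyone help?

Thanks!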

let <- c("a","a","a","a","b","b","b","b","c","c","c","d","d","e")
avg <- c(1,1,1,2,3,4,5,6,1,2,3,4,3,5)

sample <- data.frame(let,avg)

【Comments】:

  • With dplyr: sample %>% group_by(let) %>% filter(n() >= 4)

Tags: r


【Answer 1】:

We can use data.table: group by let and keep a group's rows only when its row count (.N) is at least 4.

library(data.table)
setDT(sample)[, .SD[.N >=4], let]
#   let avg
#1:   a   1
#2:   a   1
#3:   a   1
#4:   a   2
#5:   b   3
#6:   b   4
#7:   b   5
#8:   b   6

Or with base R, using ave to compute each row's group size:

sample[with(sample, ave(avg, let, FUN = length)>=4),]

Or with table, counting the rows per value of let:

subset(sample, let %in% names(which(rowSums(table(sample)) >=4)))
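To see why only "a" and "b" survive, it can help to look at the group sizes directly. The following is a minimal base R sketch on the question's data; the names `counts`, `keep`, and `result` are introduced here for illustration and are not part of the original answer. If a group must occur exactly 4 times rather than at least 4 times, `>=` can be swapped for `==`.

```r
let <- c("a","a","a","a","b","b","b","b","c","c","c","d","d","e")
avg <- c(1,1,1,2,3,4,5,6,1,2,3,4,3,5)
sample <- data.frame(let, avg)

# Number of rows per value of `let`
# (a and b have 4 rows each; c has 3, d has 2, e has 1)
counts <- table(sample$let)

# Values of `let` that occur at least 4 times
keep <- names(counts)[counts >= 4]

# Keep only the rows that belong to those groups
result <- sample[sample$let %in% keep, ]
```

Printing `counts` before filtering is a quick way to check that the threshold matches what the data actually contains.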
