【Question title】: R dplyr: group by one column and filter rows according to another two columns
【Posted】: 2019-10-16 16:05:43
【Question description】:

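Here is a sample of my data frame. I want to group by column A (colA) and then keep only the rows of words whose group contains all four of the following value combinations: "colB == 1 & colC == 1", "colB == 2 & colC == 2", "colB == 1 & colC == 2", and "colB == 2 & colC == 1". I suspect this involves combining AND and OR conditions, but I don't know how to do it.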

colA  colB colC
become  2   1
become  2   1
become  2   1
borrow  1   2
break   1   2
break   1   1
bridge  1   1
build   1   2
buy     1   2
buy     2   2
buy     2   1
buy     1   1
buy     1   1

So in the example above, only the "buy" rows are selected. The output should look like this:

colA  colB colC
buy     1   2
buy     2   2
buy     2   1
buy     1   1
buy     1   1

【Question comments】:

    Tags: r dplyr


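    【Solution 1】:

    After grouping by 'colA', filter each group by checking whether all elements of the vector (c('11', ...., '21')) are present (%in%) among the pasted 'colB' and 'colC' values. Because all() returns a single TRUE or FALSE per group, a group is either kept whole or dropped whole.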

    library(dplyr)
    library(stringr)
    df1 %>% 
       group_by(colA) %>% 
       filter( all(c('11', '22', '12', '21') %in% str_c(colB, colC)))
    # A tibble: 5 x 3
    # Groups:   colA [1]
    #  colA   colB  colC
    #  <chr> <int> <int>
    #1 buy       1     2
    #2 buy       2     2
    #3 buy       2     1
    #4 buy       1     1
    #5 buy       1     1
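
To make the grouped condition easier to follow, here is a minimal sketch of what it evaluates to for two individual groups. It uses base R only, with paste0() as a stand-in for stringr::str_c() (the two behave the same here because the sample data has no missing values). The names required, keys_buy and keys_break are illustrative and not part of the original answer.

```r
# Same sample data as in the answer's "Data" section.
df1 <- data.frame(
  colA = c("become", "become", "become", "borrow", "break", "break",
           "bridge", "build", "buy", "buy", "buy", "buy", "buy"),
  colB = c(2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L),
  colC = c(1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L),
  stringsAsFactors = FALSE
)

# The four combinations every kept group must contain.
required <- c("11", "22", "12", "21")

# Pasted colB/colC keys for two individual groups.
keys_buy   <- with(df1[df1$colA == "buy", ],   paste0(colB, colC))
keys_break <- with(df1[df1$colA == "break", ], paste0(colB, colC))

keys_buy                       # "12" "22" "21" "11" "11"
required %in% keys_buy         # TRUE TRUE TRUE TRUE
all(required %in% keys_buy)    # TRUE, so every "buy" row is kept

keys_break                     # "12" "11"
required %in% keys_break       # TRUE FALSE TRUE FALSE
all(required %in% keys_break)  # FALSE, so every "break" row is dropped
```

Duplicate combinations within a group (such as the two "11" rows for "buy") do not matter, since %in% only tests for presence.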
    

    Data

    df1 <- structure(list(colA = c("become", "become", "become", "borrow", 
    "break", "break", "bridge", "build", "buy", "buy", "buy", "buy", 
    "buy"), colB = c(2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
    1L, 1L), colC = c(1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 
    1L, 1L)), class = "data.frame", row.names = c(NA, -13L))
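
One caveat with pasting the columns without a separator is that it can become ambiguous once values have more than one digit: 1 with 12 and 11 with 2 both give "112". As a hedged alternative, the sketch below spells the requirement out directly as four any() conditions, which is close to the AND/OR combination the question had in mind. It assumes the same sample data and is a variation on the approach above, not the answer's own code; the name result is illustrative.

```r
library(dplyr)

# Same sample data as in the "Data" section.
df1 <- data.frame(
  colA = c("become", "become", "become", "borrow", "break", "break",
           "bridge", "build", "buy", "buy", "buy", "buy", "buy"),
  colB = c(2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L),
  colC = c(1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L),
  stringsAsFactors = FALSE
)

result <- df1 %>%
  group_by(colA) %>%
  # Each any() is TRUE if at least one row in the group matches that pair.
  # Conditions separated by commas in filter() are combined with AND,
  # so a group is kept only if all four pairs occur in it.
  filter(
    any(colB == 1 & colC == 1),
    any(colB == 2 & colC == 2),
    any(colB == 1 & colC == 2),
    any(colB == 2 & colC == 1)
  ) %>%
  ungroup()

result  # only the five "buy" rows remain
```

If the string-key approach is preferred, adding a separator when pasting (for example, keys such as "1_2") would also remove the ambiguity.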
    

    【Comments】:
