Group cases based on id and other variables

【Question title】: Group cases based on id and other variables
【Posted】: 2020-08-28 11:18:21
【Question description】:

I am unable to restrict my data set with an ifelse condition.

Here is a sample of my data frame:

structure(list(id = c(111, 111, 111, 112, 112, 112), se = c(1, 
2, 3, 1, 2, 3), pe = c(1, 1, 2, 1, 1, 1)), class = "data.frame", row.names = c(NA, 
-6L))

I need to select the cases (ids) for which pe is the same in every row of that id.

The final table should look like this:

   id  se  pe
  112   1   1
  112   2   1
  112   3   1

【Comments】:

One option: subset(df, id == 112 & pe == 1)
@Humpelstielzchen Thanks, yes, that is a good solution, but unfortunately not if there are 211 ids

Tags: r dataframe

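【Solution 1】:

I would suggest the following dplyr approach. You can compute flags that count the number of unique elements per group and then filter on them. The flags are nid and npe. Here is the code, where df is your dput() data: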

library(dplyr)
# Flag each id group: nid is always 1 after group_by(id); npe counts distinct pe
df %>%
  group_by(id) %>%
  mutate(nid = n_distinct(id), npe = n_distinct(pe)) %>%
  filter(nid == 1 & npe == 1) %>%
  select(-c(nid, npe))

Output:

# A tibble: 3 x 3
# Groups:   id [1]
     id    se    pe
  <dbl> <dbl> <dbl>
1   112     1     1
2   112     2     1
3   112     3     1

【Comments】:

【Solution 2】:

We can also do this without creating and then dropping new columns:

library(dplyr)
# Keep only the id groups in which pe takes a single value
df %>%
  group_by(id) %>%
  filter(n_distinct(pe) == 1)
# A tibble: 3 x 3
# Groups:   id [1]
#     id    se    pe
#  <dbl> <dbl> <dbl>
#1   112     1     1
#2   112     2     1
#3   112     3     1

【Comments】: