Removing a row based on its content: answers

【Question title】: Removing a row based on its content
【Posted】: 2019-07-16 13:19:39
【Question】:

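I am trying to write code that removes observations based on their content. The idea is that every Initial/Pre-Initial observation must be followed by a Review for the same ID; IDs without a Review should be dropped.

My data frame looks like this: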

ID  Type            Registered
P40 Pre-Initial     Yes
P40 Review  
P40 Review  
P42 Initial         Yes
P43 Initial         Yes
P43 Review  
P44 Pre-Initial     Yes
P44 Review

Code to reproduce the data:

tt <- structure(list(ID = c("P40", "P40", "P40", "P42", "P43", "P43", "P44", "P44"),
                     Type = c("Pre-Initial", "Review", "Review", "Initial",
                              "Initial", "Review", "Pre-Initial", "Review"),
                     Registered = c("Yes", "", "", "Yes", "Yes", "", "Yes", "")),
                class = "data.frame", row.names = c(NA, -8L))

What I want to achieve:

ID  Type            Registered
P40 Pre-Initial     Yes
P40 Review  
P40 Review  
P43 Initial         Yes
P43 Review  
P44 Pre-Initial     Yes
P44 Review

This is the code I have tried so far, but it does not work:

tt %>%
  group_by(ID) %>%
  slice(which(Registered == "Yes" & any(Type != "Review")))

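【Comments】:

You seem to have several related problems that would be greatly simplified if you changed the current format of your data. The underlying issue is that your data has empty cells because the Registered value is not recorded for Reviews. Is that correct? If so, the table layout you have chosen is not a good way to represent the data. I suggest reading Karl Broman's advice on how to structure tabular data. It would probably make your problem much simpler to solve.
Try tt %>% group_by(ID) %>% filter(n() > 1 & any(Type == 'Review'))
@Sotos, thanks, that worked.

Tags: r dplyr tidyverse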

【Solution 1】:

One approach is simply to keep the groups that have more than one row and contain at least one Review entry, i.e.

library(dplyr)

tt %>% 
 group_by(ID) %>% 
 filter(n() > 1 & any(Type == 'Review'))
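
As a quick check, here is a self-contained sketch that rebuilds the sample data from the question and applies the same filter. The name `res1` and the trailing `ungroup()` are additions for illustration only and are not part of the original answer.

```r
library(dplyr)

# Sample data from the question, rebuilt with data.frame()
tt <- data.frame(
  ID = c("P40", "P40", "P40", "P42", "P43", "P43", "P44", "P44"),
  Type = c("Pre-Initial", "Review", "Review", "Initial",
           "Initial", "Review", "Pre-Initial", "Review"),
  Registered = c("Yes", "", "", "Yes", "Yes", "", "Yes", ""),
  stringsAsFactors = FALSE
)

# Keep only IDs that have more than one row and at least one Review
res1 <- tt %>%
  group_by(ID) %>%
  filter(n() > 1 & any(Type == "Review")) %>%
  ungroup()  # res1 is a hypothetical name; ungroup() just drops the grouping

unique(res1$ID)
# "P40" "P43" "P44"
```

P42 is dropped because its group has a single row and no Review. Note that this filter only counts rows; it does not check that the Review comes after the Initial/Pre-Initial row.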

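【Comments】:

【Solution 2】:

Every observation must have a Review after its Initial/Pre-Initial.

Get all the indices where Type == "Review" and take the last one, then compare it with the indices of the rows whose Type is one of c("Pre-Initial", "Initial"). If the last Review index is greater than any of those indices, keep the group.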

library(dplyr)

tt %>%
  group_by(ID) %>%
  filter(any(tail(which(Type == "Review"), 1) > 
             which(Type %in% c("Pre-Initial", "Initial"))))

#  ID    Type        Registered
#  <chr> <chr>       <chr>     
#1 P40   Pre-Initial Yes       
#2 P40   Review      ""        
#3 P40   Review      ""        
#4 P43   Initial     Yes       
#5 P43   Review      ""        
#6 P44   Pre-Initial Yes       
#7 P44   Review      ""
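
To illustrate where this order-aware filter differs from a simple row count, the sketch below uses a small hypothetical data frame. The ID `P45`, whose Review appears before its Initial, is invented for this example and is not in the question's data; `tt2`, `by_count`, and `by_order` are likewise illustrative names.

```r
library(dplyr)

# Hypothetical data: P45 has its Review *before* its Initial
tt2 <- data.frame(
  ID = c("P43", "P43", "P45", "P45"),
  Type = c("Initial", "Review", "Review", "Initial"),
  Registered = c("Yes", "", "", "Yes"),
  stringsAsFactors = FALSE
)

# Count-based filter: only requires more than one row and some Review
by_count <- tt2 %>%
  group_by(ID) %>%
  filter(n() > 1 & any(Type == "Review")) %>%
  ungroup()

# Order-based filter: the last Review must come after an Initial/Pre-Initial
by_order <- tt2 %>%
  group_by(ID) %>%
  filter(any(tail(which(Type == "Review"), 1) >
             which(Type %in% c("Pre-Initial", "Initial")))) %>%
  ungroup()

unique(by_count$ID)
# "P43" "P45"
unique(by_order$ID)
# "P43"
```

On the question's own data both filters return the same rows; they only diverge when the row order within an ID matters, as with the made-up P45 here.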

【Comments】: