Delete specific rows in R which only differ by one column [duplicate]: answers

【Question title】: Delete specific rows in R which only differ by one column [duplicate]
【Posted】: 2020-12-01 02:07:13
【Question description】:

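I have created a data frame that should have one observation per country and year. I have a binary variable (V1) that takes both values (0 and 1) for the same year and country. The result is the following table: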

ID  
132     1 58 15 15 2014    Australia       Yes
133     0 58 15 15 2014    Australia       Yes
134     0 58 15 15 2015    Australia       Yes
135     0 58 15 15 2016    Australia       Yes
136     0 58 15 15 2017    Australia       Yes
137     1 58 15 15 2017    Australia       Yes
138     1 58 15 15 2018    Australia       Yes
139     0 58 15 15 2018    Australia       Yes
140     0 58 15 15 2019    Australia       Yes
141     0 57 15 15 2020    Australia       Yes

Where there are multiple observations for the same year and country, I only want to keep the ones where V1 equals 1.

【Question comments】:

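df[df$V1 == 1,]? (if df is your dataset)
Thanks for the quick reply, but if a year has no observation with V1 equal to 1, I want to keep the observation with 0.
Then run df[!duplicated(df[, c("Country", "Year")]), ]; you don't even need to create V1 first.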
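
A note on the last comment: duplicated() keeps the first row of each Country/Year pair, whatever its V1 value. In the question's table, row 136 (V1 = 0) comes before row 137 (V1 = 1) for 2017, so the one-liner on its own would keep the 0 there. One possible way around this is to sort the rows so that V1 == 1 comes first within each pair and only then drop duplicates. The sketch below assumes a toy data frame that reproduces only the ID, V1, Year and Country columns of the table (the capitalised column names follow the comment; the remaining columns are omitted).

```r
# Toy data modelled on the question's table (assumption: only these columns).
df <- data.frame(
  ID      = 132:141,
  V1      = c(1, 0, 0, 0, 0, 1, 1, 0, 0, 0),
  Year    = c(2014, 2014, 2015, 2016, 2017, 2017, 2018, 2018, 2019, 2020),
  Country = "Australia"
)

# Sort so that, within each Country/Year pair, rows with V1 == 1 come first.
ordered <- df[order(df$Country, df$Year, -df$V1), ]

# Keep only the first row of each Country/Year pair.
result <- ordered[!duplicated(ordered[, c("Country", "Year")]), ]
result$ID
```

Unlike a filter on V1, this always keeps exactly one row per country and year.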

Tags: r dataframe delete-row

【Solution 1】:

For each country and year, you can select a row if its group contains only one row or if V1 == 1.

library(dplyr)
# Within each country/year group, keep a row if it is the only one or has V1 == 1
df %>% group_by(country, year) %>% filter(n() == 1 | V1 == 1)

The data.table equivalent is:

library(data.table)
# .N is the number of rows in the group; .SD is that group's subset of the data
setDT(df)[, .SD[.N == 1 | V1 == 1], .(country, year)]
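
The same rule can also be tried without extra packages. The following is a sketch in base R, not part of the original answer: it assumes a toy data frame that reproduces only the ID, V1, year and country columns of the question's table, and it uses ave() to compute the size of each country/year group.

```r
# Toy data modelled on the question's table (assumption: only these columns).
df <- data.frame(
  ID      = 132:141,
  V1      = c(1, 0, 0, 0, 0, 1, 1, 0, 0, 0),
  year    = c(2014, 2014, 2015, 2016, 2017, 2017, 2018, 2018, 2019, 2020),
  country = "Australia"
)

# Number of rows in each country/year group, aligned with the rows of df.
group_size <- ave(df$V1, df$country, df$year, FUN = length)

# Keep a row if it is alone in its group or if its V1 equals 1.
kept <- df[group_size == 1 | df$V1 == 1, ]
kept$ID
```

On this data the rows with IDs 133, 136 and 139 are dropped. Note that the rule "group size is 1 or V1 == 1" removes every row of a group that has several rows and none with V1 == 1; the question's table contains no such group.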

【Comments】: