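【Posted】: 2018-06-29 15:22:20
【Question】:
I want to remove duplicates depending on whether the values in another column match or differ.
All rows for a duplicated ID should be removed entirely, but only if they have different colours. Whether they also have different subgroups does not matter. If rows share the same ID and the same colour, only the first one should be kept.
In the end, I want a list of all IDs that have a single colour (regardless of subgroup). All multi-coloured IDs should be removed.
Here is an example: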
   id colour   subgroup
1   1    red   lightred
2   2   blue  lightblue
3   2   blue   darkblue
4   3    red   lightred
5   4    red    darkred
6   4    red    darkred
7   4   blue  lightblue
8   5  green  darkgreen
9   5  green  darkgreen
10  5  green lightgreen
11  6    red    darkred
12  6   blue   darkblue
13  6  green lightgreen
The result should look like this:
  id colour  subgroup
1  1    red  lightred
2  2   blue lightblue
4  3    red  lightred
8  5  green darkgreen
The data I used for this example:
id = c(1,2,2,3,4,4,4,5,5,5,6,6,6)
colour = c("red","blue","blue","red","red","red","blue","green","green","green","red","blue","green")
subgroup = c("lightred","lightblue","darkblue","lightred","darkred","darkred","lightblue","darkgreen","darkgreen","lightgreen","darkred","darkblue","lightgreen")
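# Pass the vectors to data.frame() directly: cbind() would first build a
# character matrix and turn the numeric id column into text.
data = data.frame(id, colour, subgroup)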
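One possible way to express the rules above in base R is sketched below. It is only an illustration and not a confirmed answer. It counts the distinct colours per ID, keeps the rows of IDs that have exactly one colour, and then keeps the first row of each remaining ID. The names n_colours, single and result are hypothetical helpers introduced for this example, and the data is rebuilt inside the block so that it runs on its own.

```r
# Example data from the question, rebuilt here so the sketch is self-contained.
id <- c(1, 2, 2, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6)
colour <- c("red", "blue", "blue", "red", "red", "red", "blue",
            "green", "green", "green", "red", "blue", "green")
subgroup <- c("lightred", "lightblue", "darkblue", "lightred", "darkred",
              "darkred", "lightblue", "darkgreen", "darkgreen", "lightgreen",
              "darkred", "darkblue", "lightgreen")
data <- data.frame(id, colour, subgroup, stringsAsFactors = FALSE)

# For every row, the number of distinct colours that occur for its id.
# The colours are mapped to integer codes first so that ave() returns numbers.
n_colours <- ave(as.integer(factor(data$colour)), data$id,
                 FUN = function(x) length(unique(x)))

# Keep only rows whose id has exactly one colour (subgroup is ignored).
single <- data[n_colours == 1, ]

# Within the remaining rows, keep the first row of each id.
result <- single[!duplicated(single$id), ]

print(result)  # rows 1, 2, 4 and 8 of the original data remain
```

Because every remaining ID has a single colour after the first filter, calling duplicated() on id alone is enough to implement "keep the first".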
Thanks for your help!
【Comments】:
Tags: r duplicates