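[Posted]: 2020-08-07 00:51:39
[Question]:
I have a large data frame in R in which users were asked to describe the objects in a scene. I need exactly 3 unique users per scene, but some scenes were described more than 3 times. I am trying to keep the first 3 unique users for each scene and drop the rest.
Toy data (the real dataset has many more rows and columns)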
user <- c("A", "A", "A", "B", "B", "C", "C", "D", "E", "E", "F", "F", "F")
scene <- c("library", "library", "library", "park", "park", "library", "library", "park", "library", "library", "library", "library", "library")
object <- c("book", "book", "lamp", "dog", "cat", "book", "lamp", "dog", "desk", "desk", "book", "lamp", "lamp")
index <- c(1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 2)
dat <- data.frame(user, scene, object, index)
user scene object index
A library book 1
A library book 2
A library lamp 1
B park dog 1
B park cat 1
C library book 1
C library lamp 1
D park dog 1
E library desk 1
E library desk 2
F library book 1
F library lamp 1
F library lamp 2
... ... ... ...
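For example, here A, C, and E are the first three users who described the scene library, so F's descriptions are not needed. My main problem is that although I can get the total number of unique users per scene, I don't know how to label them 1, 2, 3, and so on, so that I can cut off everything above 3.
Desired output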
user scene object index count
A library book 1 1
A library book 2 1
A library lamp 1 1
B park dog 1 1
B park cat 1 1
C library book 1 2
C library lamp 1 2
D park dog 1 2
E library desk 1 3
E library desk 2 3
This was helpful, but it only groups by a single column, so I could not apply it here: R - Group by variable and then assign a unique ID
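One possible way to build the count column, as a minimal base R sketch rather than a definitive answer: within each scene, number the users by order of first appearance with match(u, unique(u)), then keep only rows whose number is at most 3. It assumes the rows are already in chronological order, so that "first" means "appears earliest in the data frame". The name result is introduced here only for illustration.

```r
# Toy data from the question
user <- c("A", "A", "A", "B", "B", "C", "C", "D", "E", "E", "F", "F", "F")
scene <- c("library", "library", "library", "park", "park", "library", "library",
           "park", "library", "library", "library", "library", "library")
object <- c("book", "book", "lamp", "dog", "cat", "book", "lamp", "dog",
            "desk", "desk", "book", "lamp", "lamp")
index <- c(1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 2)
dat <- data.frame(user, scene, object, index)

# For each scene, number the users by order of first appearance.
# ave() applies the function to the row positions of each scene and puts
# the results back in the original row order.
# Assumption: row order reflects the order in which users described the scene.
dat$count <- ave(
  seq_along(dat$user), dat$scene,
  FUN = function(i) match(dat$user[i], unique(dat$user[i]))
)

# Keep only the first 3 unique users per scene
result <- dat[dat$count <= 3, ]
print(result)
```

Passing row positions to ave() instead of the user column keeps count as an integer; passing the character column directly would coerce the result to character. If the real data is not sorted by time, it would need to be ordered first.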
[Comments]:
Tags: r dataframe data-wrangling