【Posted】: 2019-07-12 07:14:21
【Question】:
I have a data frame with the multi-level factors race and group. A minimal example:
id race group
1 1 White 1
2 2 White 1
3 3 White 1
4 4 White 1
5 5 White 1
6 6 White 2
7 7 White 2
8 8 White 2
9 9 White 2
10 10 Black 1
11 11 Black 1
12 12 Black 1
13 13 Black 2
14 14 Black 2
15 15 Black 2
16 16 Black 2
17 17 Hispanic 1
18 18 Hispanic 1
19 19 Hispanic 1
20 20 Hispanic 1
21 21 Hispanic 1
22 22 Hispanic 2
23 23 Hispanic 2
24 24 Hispanic 2
25 25 Hispanic 2
I can subset the data frame so that it keeps "White" together with one other race level, and then split the result by group, using the following function.
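filter.race <- function(x, y) {
  # keep the "White" rows plus the rows whose race equals y
  f <- subset(x, race == "White" | race == y)
  # split the subset into a list of data frames, one per group
  split(f, f$group)
}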
This returns:
filter.race(df, "Black")
$`1`
id race group
1 1 White 1
2 2 White 1
3 3 White 1
4 4 White 1
5 5 White 1
10 10 Black 1
11 11 Black 1
12 12 Black 1
$`2`
id race group
6 6 White 2
7 7 White 2
8 8 White 2
9 9 White 2
13 13 Black 2
14 14 Black 2
15 15 Black 2
16 16 Black 2
filter.race(df, "Hispanic")
$`1`
id race group
1 1 White 1
2 2 White 1
3 3 White 1
4 4 White 1
5 5 White 1
17 17 Hispanic 1
18 18 Hispanic 1
19 19 Hispanic 1
20 20 Hispanic 1
21 21 Hispanic 1
$`2`
id race group
6 6 White 2
7 7 White 2
8 8 White 2
9 9 White 2
22 22 Hispanic 2
23 23 Hispanic 2
24 24 Hispanic 2
25 25 Hispanic 2
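However, I am trying to find a way to apply this function to every level of race in the data frame, rather than specifying y separately each time.
Sample data: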
dput(df)
structure(list(id = 1:25, race = structure(c(3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L), .Label = c("Black", "Hispanic", "White"), class = "factor"),
group = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L)), .Names = c("id",
"race", "group"), class = "data.frame", row.names = c(NA, -25L
))
【Comments】:
- lapply(levels(df$race), filter.race, x = df)
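The one-liner in the comment passes every level to filter.race, including "White" itself, so one element of the result would contain only White rows. A minimal sketch of a variant that skips the reference level and names each result by its race level is below. The data frame is rebuilt with the same values as the dput output, and other.levels and results are hypothetical names introduced here for illustration, not part of the original post.

```r
# Rebuild the sample data (same values as the dput output in the question).
df <- data.frame(
  id    = 1:25,
  race  = factor(rep(c("White", "Black", "Hispanic"), times = c(9, 7, 9))),
  group = c(rep(1:2, times = c(5, 4)),   # White
            rep(1:2, times = c(3, 4)),   # Black
            rep(1:2, times = c(5, 4)))   # Hispanic
)

# The function from the question, unchanged in behavior.
filter.race <- function(x, y) {
  f <- subset(x, race == "White" | race == y)
  split(f, f$group)
}

# Every race level except the "White" reference level (hypothetical name).
other.levels <- setdiff(levels(df$race), "White")

# One call per remaining level. Naming the input vector makes lapply
# return a list whose elements are named after the race levels.
results <- lapply(setNames(other.levels, other.levels), filter.race, x = df)

names(results)        # the race levels that were compared against "White"
results$Black[["1"]]  # White and Black rows in group 1
```

Each element of results is itself a list with one data frame per group, so a single comparison can be pulled out with, for example, results$Hispanic[["2"]].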
Tags: r