[Posted]: 2014-05-07 14:57:36
[Question]:
I have a data frame like this:
df <- structure(list(year = c(1990, 1990, 1990, 1990, 1990, 1990, 1990,
1990, 1990, 1990, 1990, 1990, 1990, 1990, 1990, 1991, 1991, 1991,
1991, 1991, 1991, 1991, 1991, 1991, 1991, 1991, 1991, 1991, 1991,
1991), group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L), .Label = c("A", "B"), class = "factor"),
value = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L,
13L, 14L, 15L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 18L, 19L)), .Names = c("year", "group", "value"
), row.names = c(NA, -30L), class = "data.frame")
> df
year group value
1 1990 A 1
2 1990 A 2
3 1990 A 3
4 1990 A 4
5 1990 A 5
6 1990 A 6
7 1990 B 7
8 1990 B 8
9 1990 B 9
10 1990 B 10
11 1990 B 11
12 1990 B 12
13 1990 B 13
14 1990 B 14
15 1990 B 15
16 1991 A 5
17 1991 A 6
18 1991 A 7
19 1991 A 8
20 1991 A 9
21 1991 A 10
22 1991 A 11
23 1991 A 12
24 1991 A 13
25 1991 A 14
26 1991 B 15
27 1991 B 16
28 1991 B 17
29 1991 B 18
30 1991 B 19
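I need to apply a function for each year (I plan to do this with plyr and summarise), but only on the factor level that has the most rows (A or B). Is there a way to select that level (A or B) automatically for each year?
df2 <- ddply(df, .(year), summarise, result = "some operation on longest level")
Desired output: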
> df2
year group value result
1 1990 B 7 5
2 1990 B 8 4
3 1990 B 9 5
4 1990 B 10 3
5 1990 B 11 3
6 1990 B 12 8
7 1990 B 13 11
8 1990 B 14 7
9 1990 B 15 2
10 1991 A 5 10
11 1991 A 6 13
12 1991 A 7 9
13 1991 A 8 7
14 1991 A 9 6
15 1991 A 10 1
16 1991 A 11 15
17 1991 A 12 5
18 1991 A 13 5
19 1991 A 14 2
[Comments]:
- You could start with table, e.g. lapply(split(df, df$year), function(x) table(x$group))
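
Building on the table idea, here is a minimal base R sketch of one way to keep only the rows of the largest group in each year. The helper name keep_largest_group is made up for illustration, and the tie-breaking rule (the first level wins, because which.max returns the first maximum) is an assumption, since the question does not say what should happen on a tie.

```r
# Rebuild the example data from the question in a compact form.
df <- data.frame(
  year  = rep(c(1990, 1991), each = 15),
  group = factor(c(rep("A", 6), rep("B", 9), rep("A", 10), rep("B", 5))),
  value = c(1:15, 5:19)
)

# Hypothetical helper: given the rows for one year, keep only the rows
# that belong to the most frequent group level.
# Assumption: on a tie, which.max() picks the first level.
keep_largest_group <- function(x) {
  counts <- table(x$group)
  top <- names(counts)[which.max(counts)]
  x[x$group == top, , drop = FALSE]
}

# Split by year, filter each piece, and stack the pieces back together.
largest <- do.call(rbind, lapply(split(df, df$year), keep_largest_group))
rownames(largest) <- NULL
```

With plyr loaded, ddply(df, .(year), keep_largest_group) should return the same rows, and the per-year operation can then be applied to that result.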
Tags: r