【发布时间】:2019-11-01 20:55:31
【问题描述】:
我正在尝试根据 R 中的某些类别数据进行列表拆分。
我有这些数据:
# A tibble: 5 x 2
category to_split
<chr> <chr>
1 cat12 c(1, 5)
2 cat22 c(2, 5, 1)
3 cat33 3
4 cat43 4
5 cat51 c(5, 2)
其中c(1, 5) 对应此数据中的第 1 行和第 5 行。 c(2, 5, 1)对应第2行、第5行和第1行。
我还有第二个更大的数据框,如下所示:
# A tibble: 100 x 4
# Groups: station_location [5]
category var1 var2 var3
<chr> <dbl> <dbl> <dbl>
1 cat12 7 0.4 10
2 cat12 20 1.1 155
3 cat12 12 0.4 3
4 cat12 4 0.3 38
5 cat12 13 0.4 40
6 cat12 7 0.3 17
7 cat12 9 0.4 45
8 cat12 3 0.3 17
9 cat12 8 0.5 84
10 cat12 32 2.6 378
# ... with 90 more rows
我想创建新列表,以便 c(1, 5) 从大型数据框中提取 cat12 和 cat51 数据。此外,c(2, 5, 1) 提取 cat22、cat51 和 cat12 数据并将这些数据存储在数据框中(在列表中)。
我想得到如下的列表结构:
list(
c(1, 5) - a data frame containing the two corresponding categories of data
c(2, 5, 1) - a data frame contained the three corresponding categories of data
3
4
c(5, 2)
)
(我不关心列表名称是c(1, 5)... 等)。我将把它们命名为数据来自哪个类别(在小数据框中),即
list(
cat12 - data frame containing the two corresponding categories
cat22
cat33
cat43
cat51
)
我正在尝试通过较小数据帧中的映射从较大数据帧中提取相关数据。
数据:
data_join <- structure(list(category = c("cat12", "cat22", "cat33", "cat43",
"cat51"), to_split = c("c(1, 5)", "c(2, 5, 1)", "3", "4", "c(5, 2)"
)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-5L))
full_data <- structure(list(category = c("cat12", "cat12", "cat12", "cat12",
"cat12", "cat12", "cat12", "cat12", "cat12", "cat12", "cat12",
"cat12", "cat12", "cat12", "cat12", "cat12", "cat12", "cat12",
"cat12", "cat12", "cat22", "cat22", "cat22", "cat22", "cat22",
"cat22", "cat22", "cat22", "cat22", "cat22", "cat22", "cat22",
"cat22", "cat22", "cat22", "cat22", "cat22", "cat22", "cat22",
"cat22", "cat33", "cat33", "cat33", "cat33", "cat33", "cat33",
"cat33", "cat33", "cat33", "cat33", "cat33", "cat33", "cat33",
"cat33", "cat33", "cat33", "cat33", "cat33", "cat33", "cat33",
"cat43", "cat43", "cat43", "cat43", "cat43", "cat43", "cat43",
"cat43", "cat43", "cat43", "cat43", "cat43", "cat43", "cat43",
"cat43", "cat43", "cat43", "cat43", "cat43", "cat43", "cat51",
"cat51", "cat51", "cat51", "cat51", "cat51", "cat51", "cat51",
"cat51", "cat51", "cat51", "cat51", "cat51", "cat51", "cat51",
"cat51", "cat51", "cat51", "cat51", "cat51"), var1 = c(7, 20,
12, 4, 13, 7, 9, 3, 8, 32, 5, 2, 14, 7, 11, 9, 25, 5, 6, 18,
14, 12, 11, 11, 5, 7, 12, 2, 7, 7, 5, 28, 6, 8, 4, 9, 4, 11,
6, 5, NA, NA, 24, 6, 6, 29, NA, 11, NA, NA, NA, 9, NA, 8, 7,
NA, 17, 6, NA, 6, NA, NA, NA, NA, NA, NA, NA, NA, 13, NA, NA,
NA, NA, 16, 7, 8, NA, NA, 10, 19, 6, 10, 3, 12, 2, 2, 7, 11,
5, 5, 6, 3, 6, 9, 11, 11, 12, 5, 14, 5), var2 = c(0.4, 1.1, 0.4,
0.3, 0.4, 0.3, 0.4, 0.3, 0.5, 2.6, 0.6, 0.3, 0.5, 0.4, 0.4, 0.7,
0.5, 0.3, 0.4, 0.6, 0.5, 0.3, 0.4, 0.2, 0.4, 0.5, 0.5, 0.3, 0.4,
0.3, 0.4, 1.1, 0.4, 0.5, 0.2, 0.5, 0.4, 0.5, 0.6, 0.6, NA, NA,
0.7, 0.1, 0.3, 0.5, NA, 0.7, NA, NA, NA, 0.2, NA, 0.3, 0.2, NA,
0.3, 0.3, NA, 0.1, 0.2, 0.2, 0.5, 0.4, 0.3, 0.4, 0.2, 0.4, 0.3,
0.3, 0.2, 0.3, 0.2, 0.4, 0.2, 0.2, 0.3, 0.3, 0.5, 0.5, 0.4, 0.2,
0.3, 0.7, 0.3, 0.1, 0.3, 0.3, 0.4, 0.6, 0.3, 0.2, 0.4, 0.6, 0.2,
0.7, 0.6, 0.4, 0.6, 0.5), var3 = c(10, 155, 3, 38, 40, 17, 45,
17, 84, 378, 44, 14, 36, 20, 17, 76, 25, 4, 22, 63, 42, 23, 12,
10, 15, 29, 26, 7, 18, 5, 23, 204, 24, 56, 7, 35, 23, 55, 28,
65, 10, 13, 54, 13, 22, 45, 29, 58, 49, 14, 2, 9, 15, 38, 41,
63, 11, 9, 7, 20, 3, 5, 52, 7, 18, 25, 2, 30, 10, 3, 3, 13, 1,
12, 7, 5, 5, 9, 13, 4, 14, 9, 8, 147, 5, 7, 2, 10, 6, 66, 2,
8, 6, 3, 8, 5, 45, 6, 20, 27)), class = c("grouped_df", "tbl_df",
"tbl", "data.frame"), row.names = c(NA, -100L), groups = structure(list(
station_location = c("cat12", "cat22", "cat33", "cat43",
"cat51"), .rows = list(1:20, 21:40, 41:60, 61:80, 81:100)), row.names = c(NA,
-5L), class = c("tbl_df", "tbl", "data.frame"), .drop = TRUE))
【问题讨论】:
标签: r