【Posted】: 2019-10-02 16:26:14
【Question】:
I have a nested df that I am trying to clean up.
Sample Data:
df <-
tibble::tribble(
~idTeam, ~ptsTotalBehindFirst, ~ptsOverall, ~ptsDiffLastPeriod, ~rankOverall, ~ptsBattingBehindFirst, ~ptsBatting, ~ptsDiffBattingLastPeriod, ~dataBatting, ~rankBatting, ~ptsPitchingBehindFirst, ~ptsPitching, ~ptsDiffPitchingLastPeriod, ~dataPitching, ~rankPitching,
"2", "0", "111", "-4", 1L, "0", "65", "0", list(abbr = c("OBP", "HR", "RBI", "R", "SB"), roto_points = c(13, 13, 13, 13, 13), value = c(0.3663, 384, 1012, 1102, 164), diff = c(0, 0, 0, 0, 0), rank = c(1, 1, 1, 1, 1)), 1L, "5", "46", "-4", list(abbr = c("S", "W", "K", "ERA", "WHIP"), roto_points = c(12, 6, 11, 8, 9), value = c(94, 89, 1576, 3.946, 1.2179), diff = c(0, -2, -2, 0, 0), rank = c(2, 8, 3, 6, 5)), 3L,
"8", "13.5", "97.5", "2", 2L, "13", "52", "0", list(abbr = c("OBP", "HR", "RBI", "R", "SB"), roto_points = c(12, 11, 11, 12, 6), value = c(0.3576, 323, 954, 1011, 89), diff = c(0, 0, 0, 0, 0), rank = c(2, 3, 3, 2, 8)), 3L, "5.5", "45.5", "2", list(abbr = c("S", "W", "K", "ERA", "WHIP"), roto_points = c(2, 7.5, 10, "13", 13), value = c(56, 91, 1508, 3.688, 1.1474), diff = c(-1, 1.5, 0.5, 1, 0), rank = c(12, 6, 4, 1, 1)), 4L
)
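The data I am trying to unnest is stored in the dataBatting and dataPitching columns. I want to unnest everything inside both columns and bind the results as rows. It is something like pivot_longer, but I am not sure of the right way to handle the same 5 repeated columns nested inside 2 separate columns.
My attempt: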
df %>%
unnest_wider(dataBatting) %>%
unnest(c(abbr, roto_points, value, diff, rank)) %>%
unnest_wider(dataPitching) %>%
unnest(c(abbr, roto_points, value, diff, rank))
Error is:
Error: Column names `abbr`, `roto_points`, `value`, `diff`, `rank` must not be duplicated.
Use .name_repair to specify repair.
Call `rlang::last_error()` to see a backtrace
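My problem is that I want to bind the columns in dataPitching to the columns in dataBatting that have the same names (abbr, roto_points, value, diff, rank).
I would also like to rename the duplicated columns. Is tidyr::hoist a better approach?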
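On the renaming point, `unnest_wider()` takes a `names_sep` argument that prefixes each new column with the name of the list-column it came from, so the two sets of names no longer collide. A minimal sketch, assuming a cut-down stand-in for the data above (`df_small` is hypothetical and keeps only two stats per team):

```r
library(dplyr)
library(tidyr)

# Hypothetical, cut-down stand-in for the question's data:
# each cell of the two list-columns is a named list of equal-length vectors.
df_small <- tibble(
  idTeam = c("2", "8"),
  dataBatting = list(
    list(abbr = c("OBP", "HR"), roto_points = c(13, 13)),
    list(abbr = c("OBP", "HR"), roto_points = c(12, 11))
  ),
  dataPitching = list(
    list(abbr = c("S", "W"), roto_points = c(12, 6)),
    list(abbr = c("S", "W"), roto_points = c(2, 7.5))
  )
)

# names_sep = "_" yields dataBatting_abbr, dataPitching_abbr, and so on,
# which avoids the "must not be duplicated" error.
wide <- df_small %>%
  unnest_wider(dataBatting, names_sep = "_") %>%
  unnest_wider(dataPitching, names_sep = "_")

names(wide)
```

This keeps batting and pitching side by side in separate columns. It does not stack them as rows, which is what the desired output asks for.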
Desired df:
tibble::tribble(
~idTeam, ~ptsTotalBehindFirst, ~ptsOverall, ~ptsDiffLastPeriod, ~rankOverall, ~ptsBattingBehindFirst, ~ptsBatting, ~ptsDiffBattingLastPeriod, ~abbr, ~roto_points5, ~value, ~diff, ~rank, ~rankPitching, ~ptsPitchingBehindFirst, ~ptsPitching, ~ptsDiffPitchingLastPeriod,
2, 0, 111, -4, 1, 0, 65, 0, "OBP", 13, 0.3663, 0, 1, 3, 5, 46, -4,
2, 0, 111, -4, 1, 0, 65, 0, "HR", 13, 384, 0, 1, 3, 5, 46, -4,
2, 0, 111, -4, 1, 0, 65, 0, "RBI", 13, 1012, 0, 1, 3, 5, 46, -4,
2, 0, 111, -4, 1, 0, 65, 0, "R", 13, 1102, 0, 1, 3, 5, 46, -4,
2, 0, 111, -4, 1, 0, 65, 0, "SB", 13, 164, 0, 1, 3, 5, 46, -4,
2, 0, 111, -4, 1, 0, 65, 0, "S", 12, 94, 0, 2, 3, 5, 46, -4,
2, 0, 111, -4, 1, 0, 65, 0, "W", 6, 89, -2, 8, 3, 5, 46, -4,
2, 0, 111, -4, 1, 0, 65, 0, "K", 11, 1576, -2, 3, 3, 5, 46, -4,
2, 0, 111, -4, 1, 0, 65, 0, "ERA", 8, 3.946, 0, 6, 3, 5, 46, -4,
2, 0, 111, -4, 1, 0, 65, 0, "WHIP", 9, 1.2179, 0, 5, 3, 5, 46, -4,
8, 13.5, 97.5, 2, 2, 13, 52, 0, "OBP", 12, 0.3576, 0, 2, 4, 5.5, 45.5, 2,
8, 13.5, 97.5, 2, 2, 13, 52, 0, "HR", 11, 323, 0, 3, 4, 5.5, 45.5, 2,
8, 13.5, 97.5, 2, 2, 13, 52, 0, "RBI", 11, 954, 0, 3, 4, 5.5, 45.5, 2,
8, 13.5, 97.5, 2, 2, 13, 52, 0, "R", 12, 1011, 0, 2, 4, 5.5, 45.5, 2,
8, 13.5, 97.5, 2, 2, 13, 52, 0, "SB", 6, 89, 0, 8, 4, 5.5, 45.5, 2,
8, 13.5, 97.5, 2, 2, 13, 52, 0, "S", 2, 56, -1, 12, 4, 5.5, 45.5, 2,
8, 13.5, 97.5, 2, 2, 13, 52, 0, "W", 7.5, 91, 1.5, 6, 4, 5.5, 45.5, 2,
8, 13.5, 97.5, 2, 2, 13, 52, 0, "K", 10, 1508, 0.5, 4, 4, 5.5, 45.5, 2,
8, 13.5, 97.5, 2, 2, 13, 52, 0, "ERA", 13, 3.688, 1, 1, 4, 5.5, 45.5, 2,
8, 13.5, 97.5, 2, 2, 13, 52, 0, "WHIP", 13, 1.1474, 0, 1, 4, 5.5, 45.5, 2
)
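One way to get a stacked layout like the one above is to first pivot the two list-columns into a single list-column, and only then unnest it, so `abbr`, `roto_points` and the rest are created once. This is a sketch under assumptions, not a confirmed answer: `df_small` is a hypothetical cut-down version of the data, and `source` and `data` are made-up column names.

```r
library(dplyr)
library(tidyr)

# Hypothetical, cut-down stand-in for the question's data
df_small <- tibble(
  idTeam = c("2", "8"),
  dataBatting = list(
    list(abbr = c("OBP", "HR"), roto_points = c(13, 13)),
    list(abbr = c("OBP", "HR"), roto_points = c(12, 11))
  ),
  dataPitching = list(
    list(abbr = c("S", "W"), roto_points = c(12, 6)),
    list(abbr = c("S", "W"), roto_points = c(2, 7.5))
  )
)

long <- df_small %>%
  # Stack the two list-columns: one row per team and source
  pivot_longer(c(dataBatting, dataPitching),
               names_to = "source", values_to = "data") %>%
  # Spread each named list into list-columns abbr and roto_points
  unnest_wider(data) %>%
  # Expand those list-columns in parallel: one row per stat
  unnest(c(abbr, roto_points))

long
```

On the full data, the last step would presumably list all five inner columns, as in the original attempt. The `source` column records whether a row came from batting or pitching and can be dropped with `select(-source)` if it is not needed.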
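【Comments】:
-
I can't reproduce the error with the example you showed.
df %>% unnest(c(dataBatting, dataPitching)) # A tibble: 10 x 15
-
Fixed it, as I realized my example was unclear.
-
It still works fine:
df %>% unnest_wider(dataBatting) %>% unnest(c(abbr, roto_points, value, diff, rank)) # A tibble: 10 x 19
I didn't get any error.
-
I added more information to make it clearer. Sorry about that.
-
This question is also related to the "no common type" error.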
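One possible source of a "no common type" problem in the sample data is the second row's pitching `roto_points`, which is written as `c(2, 7.5, 10, "13", 13)`. The quoted `"13"` makes that whole vector character, while the same field is double in the other row. A minimal sketch of coercing the field before unnesting, assuming a hypothetical two-row stand-in (`df_mixed`) rather than the full data:

```r
library(dplyr)
library(tidyr)

# A single quoted element turns the whole vector into character
class(c(2, 7.5, 10, "13", 13))

# Hypothetical stand-in: roto_points is double in one row, character in the other
df_mixed <- tibble(
  idTeam = c("2", "8"),
  dataPitching = list(
    list(abbr = c("S", "W"), roto_points = c(12, 6)),
    list(abbr = c("S", "W"), roto_points = c("2", "7.5"))
  )
)

fixed <- df_mixed %>%
  # Coerce the inner field to numeric in every cell before any unnesting
  mutate(dataPitching = lapply(dataPitching, function(x) {
    x$roto_points <- as.numeric(x$roto_points)
    x
  })) %>%
  unnest_wider(dataPitching) %>%
  unnest(c(abbr, roto_points))

fixed
```

Coercing inside the nested lists first means the later steps never have to reconcile character with double. Which step raises the error without this fix may depend on the tidyr version.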