How to reshape a list with sublists?

【Question title】: How to reshape list with sublist?
【Posted】: 2021-12-28 16:03:56
【Question description】:

> mylist
$result.1
  truth model1 model2
1     1      2    1.0
2     2      3   -0.5
3     3     -1    4.0

$result.2
  truth model1 model2
1     1      1      2
2     2      4      2
3     3      4      1

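I have a list that contains many sublists. There are 2 in the example above, but the number of sublists can be larger than 2.

Each sublist holds a data.frame containing truth and the predictions from model1 and model2. I want to reshape my list so that each sublist corresponds to one specific model, i.e., I want: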

$model1
  truth result.1 result.2
1     1        2        1
2     2        3        4
3     3       -1        4

$model2
  truth result.1 result.2
1     1      1.0        2
2     2     -0.5        2
3     3      4.0        1

Is there a quick way to reshape the list like this?

【Comments】:

Are there always two models in each sublist?

Tags: r list

【Solution 1】:

Use cbind inside do.call.

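# Iterate over the model columns (every column after truth), not over the sublists
lapply(2:ncol(L[[1]]), \(i) do.call(cbind, c(L[[1]][, 1, drop = FALSE], lapply(L, `[[`, i)))) |>
  setNames(names(L[[1]])[-1])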
# $model1
#      truth result.1 result.2
# [1,]     1        2        1
# [2,]     2        3        4
# [3,]     3       -1        4
# 
# $model2
#      truth result.1 result.2
# [1,]     1      1.0        2
# [2,]     2     -0.5        2
# [3,]     3      4.0        1

Note: this requires R >= 4.1 (for the `\(i)` lambda shorthand and the native pipe `|>`).

Data:

L <- list(result.1 = structure(list(truth = 1:3, model1 = c(2, 3, 
-1), model2 = c(1, -0.5, 4)), class = "data.frame", row.names = c(NA, 
-3L)), result.2 = structure(list(truth = 1:3, model1 = c(1, 4, 
4), model2 = c(2, 2, 1)), class = "data.frame", row.names = c(NA, 
-3L)))
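
The output above is a list of matrices, because cbind on plain vectors builds a matrix. If data frames are wanted, as in the question, one possible variation is sketched below. It is not part of the original answer, and it assumes that every sublist has the same truth column in the same row order; the names `models` and `res` are only illustrative.

```r
# Same data as above, written with data.frame() for readability
L <- list(
  result.1 = data.frame(truth = 1:3, model1 = c(2, 3, -1), model2 = c(1, -0.5, 4)),
  result.2 = data.frame(truth = 1:3, model1 = c(1, 4, 4), model2 = c(2, 2, 1))
)

# Model columns: everything except truth
models <- setdiff(names(L[[1]]), "truth")

res <- setNames(
  lapply(models, function(m) {
    # Keep truth as a one-column data frame, then append column m from every sublist
    cbind(L[[1]]["truth"], as.data.frame(lapply(L, `[[`, m)))
  }),
  models
)

res$model1
```

Because it uses `function(m)` and a plain `setNames()` call instead of `\(i)` and `|>`, this variation should also run on R versions older than 4.1.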

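【Discussion】:

@Parfait I wasn't sure: the OP mentions model1 and model2, and the unknown number of sublists was the concern, but it should scale well now.

【Solution 2】:

Here is a tidyverse option: if you bind all the data frames into one and use the names of the list to label which result (trial) each row came from, the task becomes a simple pivot (transpose) operation. After that you can split by model again.

I added an extra model to test how it scales: you don't need to hard-code the number of trials or models, or their names, and if a model is missing from one trial you get NA rather than an error.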

library(dplyr)

mylist %>%
  bind_rows(.id = "trial") %>%
  tidyr::pivot_longer(matches("model\\d+"), names_to = "model") %>%
  tidyr::pivot_wider(names_from = trial) %>%
  split(.$model) %>%
  purrr::map(select, -model)
#> $model1
#> # A tibble: 3 × 3
#>   truth result.1 result.2
#>   <int>    <dbl>    <dbl>
#> 1     1        2        1
#> 2     2        3        4
#> 3     3       -1        4
#> 
#> $model2
#> # A tibble: 3 × 3
#>   truth result.1 result.2
#>   <int>    <dbl>    <dbl>
#> 1     1      1          2
#> 2     2     -0.5        2
#> 3     3      4          1
#> 
#> $model3
#> # A tibble: 3 × 3
#>   truth result.1 result.2
#>   <int>    <dbl>    <dbl>
#> 1     1        0        9
#> 2     2        4        5
#> 3     3        2        2

Data from jay.sf's answer, plus one more dummy column:

mylist <- list(result.1 = structure(list(truth = 1:3, model1 = c(2, 3, -1), model2 = c(1, -0.5, 4), model3 = c(0, 4, 2)), class = "data.frame", row.names = c(NA, -3L)), result.2 = structure(list(truth = 1:3, model1 = c(1, 4, 4), model2 = c(2, 2, 1), model3 = c(9, 5, 2)), class = "data.frame", row.names = c(NA, -3L)))
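
To illustrate the missing-model case mentioned above, here is a small sketch that runs the same pipeline on a hypothetical input, `mylist2`, in which result.2 has no model3 column. The input and the name `out` are assumptions made up for this illustration; dplyr, tidyr, and purrr are assumed to be installed.

```r
library(dplyr)

# Hypothetical input: result.2 lacks the model3 column
mylist2 <- list(
  result.1 = data.frame(truth = 1:3, model1 = c(2, 3, -1),
                        model2 = c(1, -0.5, 4), model3 = c(0, 4, 2)),
  result.2 = data.frame(truth = 1:3, model1 = c(1, 4, 4),
                        model2 = c(2, 2, 1))
)

out <- mylist2 %>%
  bind_rows(.id = "trial") %>%          # model3 is filled with NA for result.2
  tidyr::pivot_longer(matches("model\\d+"), names_to = "model") %>%
  tidyr::pivot_wider(names_from = trial) %>%
  split(.$model) %>%
  purrr::map(select, -model)

# The result.2 column of model3 should contain only NA
out$model3
```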

【Discussion】:

【Solution 3】:

Consider a chained merge, iterating over the distinct model column names:

newlist <- sapply(
  names(mylist$result.1)[-1],
  function(nm) {
    df <- Reduce(
      function(x, y) merge(x, y, by="truth"), 
      lapply(mylist, `[`, c("truth", nm))
    )
    
    df <- setNames(df, c("truth", paste0("result.", 1:(ncol(df)-1))))
  },
  simplify = FALSE
)

newlist
$model1
  truth result.1 result.2
1     1        2        1
2     2        3        4
3     3       -1        4

$model2
  truth result.1 result.2
1     1      1.0        2
2     2     -0.5        2
3     3      4.0        1
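
A possible variation on the chained merge, sketched here under the assumption that the sublist names (result.1, result.2, ...) should become the column names: each prediction column is renamed to its sublist name before merging, so merge() never has to add .x/.y suffixes and the "result." prefix is not hard-coded. The helper name `pieces` is only illustrative.

```r
mylist <- list(
  result.1 = data.frame(truth = 1:3, model1 = c(2, 3, -1), model2 = c(1, -0.5, 4)),
  result.2 = data.frame(truth = 1:3, model1 = c(1, 4, 4), model2 = c(2, 2, 1))
)

# Model columns: everything except truth
models <- setdiff(names(mylist[[1]]), "truth")

newlist <- sapply(models, function(nm) {
  # For every sublist, keep truth plus column nm, renamed to the sublist's name
  pieces <- Map(
    function(df, id) setNames(df[c("truth", nm)], c("truth", id)),
    mylist, names(mylist)
  )
  # Chain-merge all pieces on truth
  Reduce(function(x, y) merge(x, y, by = "truth"), pieces)
}, simplify = FALSE)

newlist$model2
```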

【Discussion】: