cbind a list of data frames and a list vector: answers

【Question title】: cbind a list of data frames to a list vector
【Posted】: 2018-07-25 11:14:41
【Question】:

Suppose I have a nested list named list_df_A that holds data.frames, with the following structure:

$ :'data.frame':      1 obs. of 3 variables:
  ..$ a             :chr a1
  ..$ b             :chr b1
  ..$ c             :chr c1

$ :'data.frame':      3 obs. of 3 variables:
  ..$ a             :chr [1:3] a21 a22 a23
  ..$ b             :chr [1:3] b21 b22 b23
  ..$ c             :chr [1:3] c21 c22 c23

$ :'data.frame':      1 obs. of 3 variables:
  ..$ a             :chr a3
  ..$ b             :chr b3
  ..$ c             :chr c3

So if I bind them into a data.table/data.frame:

list_df_A <- rbindlist(list_df_A)

list_df_A will look like this:

      a     b     c
1:   a1    b1    c1
2:  a21   b21   c21
3:  a22   b22   c22
4:  a23   b23   c23
5:   a3    b3    c3
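
For anyone who wants to reproduce the setup, here is a minimal sketch that rebuilds the list from the str() output above and stacks it. The values come from the question; the result name `stacked` is only illustrative, and the data.table package is assumed to be installed.

```r
library(data.table)

# Sample data mirroring the str() output in the question.
# stringsAsFactors = FALSE keeps the columns as character on R < 4.0 too.
list_df_A <- list(
  data.frame(a = "a1", b = "b1", c = "c1", stringsAsFactors = FALSE),
  data.frame(a = c("a21", "a22", "a23"),
             b = c("b21", "b22", "b23"),
             c = c("c21", "c22", "c23"),
             stringsAsFactors = FALSE),
  data.frame(a = "a3", b = "b3", c = "c3", stringsAsFactors = FALSE)
)

# Stack the three data frames row-wise into a single data.table.
stacked <- rbindlist(list_df_A)
print(stacked)
```

The second data frame contributes three of the five rows, which is the information that gets lost once the list is flattened.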

Now I have another list. It is actually the root level of a JSON file. Let me call it list_root; it has the following structure:

chr [1:3] "type1" "type2" "type3"

If I turn it into a data.table/data.frame:

list_root <- as.data.table(list_root)

I get this table:

       V1
1:  type1
2:  type2
3:  type3

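Now here is the problem: I know that type2 in list_root has 3 records in list_df_A, because each "type" refers to one data frame in list_df_A.

How do you tell R, when it binds the two data.tables together, to show something like this?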

           V1       a     b     c
     1: type1      a1    b1    c1
     2: type2     a21   b21   c21
     3: type2     a22   b22   c22
     4: type2     a23   b23   c23
     5: type3      a3    b3    c3

That is, so that rows 2, 3 and 4 all belong to type2?

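【Comments on the question】:

Doesn't rbindlist have an idcol argument that picks up the names of the list you pass in? For exactly this purpose?
...and if all you are doing is row binding, why do you keep referring to cbind?
When I say cbind I mean column binding, as in the last data frame. You can see that the first column of the last data frame comes from list_root, and the second through last columns come from list_df_A.
Ah. OK, then I would just name the list you started with, use idcol in rbindlist, and you should be all set.
When you mentioned rbindlist with idcol, I got an idea. For list_root, I use rbindlist(list_root, use.names= TRUE, fill=TRUE, idcol=TRUE). Then for list_df_A, I use rbindlist(list_df_A, use.names= TRUE, fill=TRUE, idcol=TRUE). Now both data frames have a .id column, and I can merge them on .id, for example merge(list_df_A, list_root, on=".id").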

Tags: r dataframe cbind

【Solution 1】:

Before calling rbindlist, you can use mapply to add a column to each data frame, taking its value from an external id vector:

my_list <- mapply(`[<-`, my_list, 'colname', value = list_root , SIMPLIFY = FALSE)

Then simply rbind them all.

【Comments】:

You can set all the names of a list in one line with names(my_list) <- character_vector_of_names. No need for mapply.
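
A self-contained sketch of this approach, assuming the sample data from the question. The column name "type" and the result name `result` are illustrative choices, not part of the original answer.

```r
library(data.table)

# Sample data taken from the question.
my_list <- list(
  data.frame(a = "a1", b = "b1", c = "c1", stringsAsFactors = FALSE),
  data.frame(a = c("a21", "a22", "a23"),
             b = c("b21", "b22", "b23"),
             c = c("c21", "c22", "c23"),
             stringsAsFactors = FALSE),
  data.frame(a = "a3", b = "b3", c = "c3", stringsAsFactors = FALSE)
)
list_root <- c("type1", "type2", "type3")

# mapply pairs the i-th data frame with the i-th element of list_root and
# calls `[<-`(df, "type", value = <type>), i.e. df["type"] <- <type>.
# The single value is recycled over all rows of that data frame.
my_list <- mapply(`[<-`, my_list, "type", value = list_root, SIMPLIFY = FALSE)

# Row-bind the augmented data frames; "type" ends up as the last column.
result <- rbindlist(my_list)
print(result)
```

SIMPLIFY = FALSE matters here: without it mapply may try to collapse the returned data frames into a matrix instead of keeping them as a list.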

【Solution 2】:

My answer:

We use .id as the key for merging the two data frames/data.tables.

For list_root, we first convert it to a data.table and then add a ".id" column so that we have a key:

list_root <- as.data.table(list_root)[, .id := seq_len(.N)]

Next, I use rbindlist on list_df_A:

list_df_A <- rbindlist(list_df_A, use.names=TRUE, fill=TRUE, idcol=TRUE)

Now both tables share a common key, ".id", so we can merge them:

new_dt <- merge(list_root, list_df_A, by=".id")

We get the desired result:

    .id    V1     a     b     c
 1:   1 type1    a1    b1    c1
 2:   2 type2   a21   b21   c21
 3:   2 type2   a22   b22   c22
 4:   2 type2   a23   b23   c23
 5:   3 type3    a3    b3    c3
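
The whole merge approach can be sketched end to end as follows, assuming the sample data from the question. The names `root_dt`, `stacked` and `merged` are illustrative, and the lookup table is built with data.table() directly rather than as.data.table() so that its column names are explicit.

```r
library(data.table)

# Sample data taken from the question.
list_df_A <- list(
  data.frame(a = "a1", b = "b1", c = "c1", stringsAsFactors = FALSE),
  data.frame(a = c("a21", "a22", "a23"),
             b = c("b21", "b22", "b23"),
             c = c("c21", "c22", "c23"),
             stringsAsFactors = FALSE),
  data.frame(a = "a3", b = "b3", c = "c3", stringsAsFactors = FALSE)
)
list_root <- c("type1", "type2", "type3")

# Lookup table: one row per type, keyed by its position in list_root.
root_dt <- data.table(.id = seq_along(list_root), V1 = list_root)

# idcol = TRUE adds an integer ".id" column holding the index of the
# list element each row came from.
stacked <- rbindlist(list_df_A, use.names = TRUE, fill = TRUE, idcol = TRUE)

# merge() for data.tables takes the join column through `by`.
merged <- merge(root_dt, stacked, by = ".id")
print(merged)
```

Because the join relies on list position, it only gives the right labels when list_root and list_df_A are in the same order.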

【Comments】:

【Solution 3】:

Set the names of the list, then use the idcol argument, like this:

names(list_df_A) <- list_root
rbindlist(list_df_A,idcol = "id")
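
A runnable version of the same idea, assuming the sample data from the question and that list_root is still a plain character vector (not yet converted to a data.table). The result name `result` is illustrative.

```r
library(data.table)

# Sample data taken from the question.
list_df_A <- list(
  data.frame(a = "a1", b = "b1", c = "c1", stringsAsFactors = FALSE),
  data.frame(a = c("a21", "a22", "a23"),
             b = c("b21", "b22", "b23"),
             c = c("c21", "c22", "c23"),
             stringsAsFactors = FALSE),
  data.frame(a = "a3", b = "b3", c = "c3", stringsAsFactors = FALSE)
)
list_root <- c("type1", "type2", "type3")

# Name each list element after its type.
names(list_df_A) <- list_root

# With a named list, idcol fills the new "id" column with the element
# names, repeated once per row of the corresponding data frame.
result <- rbindlist(list_df_A, idcol = "id")
print(result)
```

The id column comes first in the result, followed by a, b and c, so no separate merge step is needed.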

【Comments】: