Reshape a 2D matrix (data.frame) by combining row and column names: answers

【Question title】: Reshape 2d matrix (data.frame) by combining row and column names
【Posted】: 2020-01-19 12:03:35
【Question】:

Problem

While writing my thesis, I ran into an interesting task. I have a 2D matrix (or data.frame) like this:

     CACE        cheng     cheng2        ding    ding_ass        sun2
mean    0 -0.000467158 0.01219119 0.004284223 0.003803375 0.004204354
sd      0  0.131911914 0.14457078 0.074447198 0.055980336 0.072260046
            sun3
mean 0.004202419
sd   0.072266683

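The matrix above describes the performance of several models (their mean and sd). I want to list these values in my thesis, so I need to reshape the matrix like this: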

     CACE_mean CACE_sd   cheng_mean  cheng_sd cheng2_mean cheng2_sd
[1,]         0       0 -0.000467158 0.1319119  0.01219119 0.1445708
       ding_mean   ding_sd ding_ass_mean ding_ass_sd   sun2_mean
[1,] 0.004284223 0.0744472   0.003803375  0.05598034 0.004204354
        sun2_sd   sun3_mean    sun3_sd
[1,] 0.07226005 0.004202419 0.07226668

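This amounts to flattening the matrix or data.frame, but it is not the conventional long-to-wide reshaping task. I wonder whether a higher-level function can do this.

Data

Original data (input):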

a <- structure(c(0, 0, -0.000467157971792085, 0.131911914238178, 0.0121911908647192,
0.144570781843054, 0.00428422254646622, 0.0744471979273107, 0.00380337457776962, 
0.0559803359990803, 0.00420435426517323, 0.0722600458117494, 
0.00420241918783969, 0.0722666828398023), .Dim = c(2L, 7L), .Dimnames = list(
    c("mean", "sd"), c("CACE", "cheng", "cheng2", "ding", "ding_ass", 
    "sun2", "sun3")))

My attempt

new_names = c(outer(row.names(a),colnames(a),function(x,y){paste(y,x,sep = '_')}))
new_data = t(data.frame(c(a),row.names = new_names))
rownames(new_data) <- NULL

This works well, but I would like to see other ideas.

Tags: r tidyverse tidyr reshape2

【Solution 1】:

You can concatenate the cells into a single vector and use setNames to attach names built from the column and row names.

# dat is the input matrix from the question (named a there)
setNames(do.call(c, as.data.frame(dat)),
         paste(rep(colnames(dat), each=2), rownames(dat), sep=".")
)
#   CACE.mean       CACE.sd    cheng.mean      cheng.sd   cheng2.mean     cheng2.sd     ding.mean 
# 0.000000000   0.000000000  -0.000467158   0.131911914   0.012191191   0.144570782   0.004284223 
#     ding.sd ding_ass.mean   ding_ass.sd     sun2.mean       sun2.sd     sun3.mean       sun3.sd 
# 0.074447198   0.003803375   0.055980336   0.004204354   0.072260046   0.004202419   0.072266683
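
The result above is a named vector with "." as the separator, whereas the question asks for a one-row matrix with "_" in the names. A minimal base R sketch of that variant follows, assuming the input matrix is stored in `a` as in the question; the names `flat` and `one_row` are only illustrative.

```r
# Input matrix from the question
a <- structure(c(0, 0, -0.000467157971792085, 0.131911914238178,
  0.0121911908647192, 0.144570781843054, 0.00428422254646622,
  0.0744471979273107, 0.00380337457776962, 0.0559803359990803,
  0.00420435426517323, 0.0722600458117494, 0.00420241918783969,
  0.0722666828398023), .Dim = c(2L, 7L), .Dimnames = list(
  c("mean", "sd"), c("CACE", "cheng", "cheng2", "ding", "ding_ass",
  "sun2", "sun3")))

# c(a) flattens the matrix column by column, so each column name is
# repeated once per row and the row names are recycled to match.
flat <- setNames(
  c(a),
  paste(rep(colnames(a), each = nrow(a)), rownames(a), sep = "_")
)

# t() turns the named vector into a 1 x 14 matrix with column names.
one_row <- t(flat)
```

Using `nrow(a)` instead of a hard-coded 2 keeps the names aligned if more summary rows are added later.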

【Solution 2】:

We can convert the matrix to a data frame, move the row names into a separate column, and pivot the data to wide format.

library(dplyr)  # provides %>%

# df is the input matrix from the question
df %>%
  as.data.frame() %>%
  tibble::rownames_to_column() %>%
  tidyr::pivot_wider(names_from = rowname, values_from = -rowname)

#   CACE_mean CACE_sd cheng_mean cheng_sd cheng2_mean cheng2_sd ding_mean ding_sd
#      <dbl>   <dbl>      <dbl>    <dbl>       <dbl>     <dbl>     <dbl>   <dbl>
#1         0       0  -0.000467    0.132      0.0122     0.145   0.00428  0.0744
# … with 6 more variables: ding_ass_mean <dbl>, ding_ass_sd <dbl>, sun2_mean <dbl>,
#   sun2_sd <dbl>, sun3_mean <dbl>, sun3_sd <dbl>

【Solution 3】:

An approach similar to yours, but more compact:

m <- `colnames<-`(t(c(a)),c(t(outer(colnames(a),paste0("_",rownames(a)),paste0))))

which gives

> m
     CACE_mean CACE_sd   cheng_mean  cheng_sd cheng2_mean cheng2_sd   ding_mean   ding_sd
[1,]         0       0 -0.000467158 0.1319119  0.01219119 0.1445708 0.004284223 0.0744472
     ding_ass_mean ding_ass_sd   sun2_mean    sun2_sd   sun3_mean    sun3_sd
[1,]   0.003803375  0.05598034 0.004204354 0.07226005 0.004202419 0.07226668
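
The one-liner above packs three steps together. The following sketch spells them out in base R, assuming the matrix is stored in `a` as in the question; `name_grid` and `new_names` are illustrative names that do not appear in the original answer.

```r
# Input matrix from the question
a <- structure(c(0, 0, -0.000467157971792085, 0.131911914238178,
  0.0121911908647192, 0.144570781843054, 0.00428422254646622,
  0.0744471979273107, 0.00380337457776962, 0.0559803359990803,
  0.00420435426517323, 0.0722600458117494, 0.00420241918783969,
  0.0722666828398023), .Dim = c(2L, 7L), .Dimnames = list(
  c("mean", "sd"), c("CACE", "cheng", "cheng2", "ding", "ding_ass",
  "sun2", "sun3")))

# Step 1: outer() builds a 7 x 2 character matrix of combined names,
# one row per model and one column per statistic.
name_grid <- outer(colnames(a), paste0("_", rownames(a)), paste0)

# Step 2: transpose so that c() reads the names in the same
# column-by-column order in which c(a) reads the values.
new_names <- c(t(name_grid))

# Step 3: flatten the values into a 1 x 14 matrix and attach the names.
m <- t(c(a))
colnames(m) <- new_names
```

The transpose in step 2 is the part that is easy to miss: without it the names would run through all the means first and then all the sds, and would no longer line up with the values.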

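【Solution 4】:

After converting the matrix with as.data.frame.table, you can use unite to combine the two name columns. Note that spread returns the columns in alphabetical order, as the output below shows.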

library(dplyr)
library(tidyr)
# m1 is the input matrix from the question
as.data.frame.table(m1) %>%
   unite(Var1, Var2, Var1) %>%
   spread(Var1, Freq)
#    CACE_mean CACE_sd   cheng_mean  cheng_sd cheng2_mean cheng2_sd ding_ass_mean ding_ass_sd   ding_mean   ding_sd   sun2_mean    sun2_sd
#1         0       0 -0.000467158 0.1319119  0.01219119 0.1445708   0.003803375  0.05598034 0.004284223 0.0744472 0.004204354 0.07226005
#    sun3_mean    sun3_sd
#1 0.004202419 0.07226668
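
In the output above the columns are sorted alphabetically, so ding_ass comes before ding. If the original model order matters, the same as.data.frame.table idea can be finished in base R without spread. This is a sketch of an alternative, not the answerer's method; it assumes the matrix is stored in `a`, and `long`, `wide` and `res` are illustrative names.

```r
# Input matrix from the question
a <- structure(c(0, 0, -0.000467157971792085, 0.131911914238178,
  0.0121911908647192, 0.144570781843054, 0.00428422254646622,
  0.0744471979273107, 0.00380337457776962, 0.0559803359990803,
  0.00420435426517323, 0.0722600458117494, 0.00420241918783969,
  0.0722666828398023), .Dim = c(2L, 7L), .Dimnames = list(
  c("mean", "sd"), c("CACE", "cheng", "cheng2", "ding", "ding_ass",
  "sun2", "sun3")))

# Long format: Var1 holds the row names, Var2 the column names and
# Freq the cell values, listed column by column.
long <- as.data.frame.table(a)

# Combine the two label columns and use them as names for the values.
wide <- setNames(long$Freq, paste(long$Var2, long$Var1, sep = "_"))

# One-row data frame that keeps the column order of the input matrix.
res <- as.data.frame(t(wide))
```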
