Reshaping data table to make column names into row names: answers

【Question title】: Reshaping data table to make column names into row names
【Posted】: 2015-06-18 10:33:27
【Question description】:

I have a data.table in R:

> dt
  SAMPLE   junction count
1: R1        a       1
2: R2        a       1
3: R3        b       1
4: R3        a       1
5: R1        c       2

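Now I want to "reshape" the data table into a data frame m (essentially a junction-by-sample matrix whose entries are the corresponding count values). Note that for any (SAMPLE, junction) pair that does not exist in dt, I assume the corresponding count is zero. Can someone help me achieve this?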

> m
      R1   R2   R3
  a    1    1    1
  b    0    0    1
  c    2    0    0

【Question comments】:

Tags: r data.table reshape reshape2

【Solution 1】:

dcast from data.table converts a dataset from "long" to "wide" format.

library(data.table)#v1.9.5+
dcast(dt, junction~SAMPLE, value.var='count', fill=0)
#   junction R1 R2 R3
#1:        a  1  1  1
#2:        b  0  0  1
#3:        c  2  0  0

If you need matrix output:

library(reshape2)
acast(dt, junction~SAMPLE, value.var='count', fill=0)
#   R1 R2 R3
#a  1  1  1
#b  0  0  1
#c  2  0  0

Or xtabs from base R:

 xtabs(count~junction+SAMPLE, dt)
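
The xtabs call above returns a contingency table rather than the data frame m that the question asks for. A minimal sketch of one way to get there in base R, assuming the same data held in a plain data.frame (the name df is only for illustration):

```r
# Same long-format data as in the question, as a plain data.frame
# (assumption: no data.table features are needed for this step).
df <- data.frame(
  SAMPLE   = c("R1", "R2", "R3", "R3", "R1"),
  junction = c("a", "a", "b", "a", "c"),
  count    = c(1, 1, 1, 1, 2),
  stringsAsFactors = FALSE
)

# xtabs sums count within each (junction, SAMPLE) cell; absent pairs become 0.
tab <- xtabs(count ~ junction + SAMPLE, data = df)

# Convert the table to a data frame while keeping its two-dimensional layout:
# junctions become row names, samples become columns.
m <- as.data.frame.matrix(tab)
m
```

Calling as.data.frame(tab) instead would turn the table back into long format, which is why as.data.frame.matrix is used here.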

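【Comments】:

What would you do when the junction column is the row index of the data frame? Or when you have the m data frame and want the rows to become columns and the columns to become rows?
@capm With data.table, when you create a data.table using setDT, there is an option to keep the row.names, if that is what you mean: setDT(dt, keep.rownames = TRUE)
I was thinking of something more like acast(m, colnames(m) ~ row.names(m)) to get a <- data.frame(c(1,1,1), c(0,0,1), c(2,0,0), row.names = c('R1', 'R2', 'R3')) and colnames(a) <- c('a', 'b', 'c').
@capm What is your expected output for the "a" dataset?
Starting from m <- data.frame(c(1,0,2), c(1,0,0), c(1,1,0), row.names = c('a', 'b', 'c')) and colnames(m) <- c('R1', 'R2', 'R3'), and using something like acast(m, colnames(m) ~ row.names(m)), I would like to get the data frame: a <- data.frame(c(1,1,1), c(0,0,1), c(2,0,0), row.names = c('R1', 'R2', 'R3')) and colnames(a) <- c('a', 'b', 'c').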
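
The thread does not record an answer to this follow-up. As a hedged sketch, swapping the rows and columns of a wide data frame like m does not need acast at all. Base R's t() transposes it, and as.data.frame turns the resulting matrix back into a data frame. The object names follow the comment above:

```r
# The wide data frame from the comment: junctions as rows, samples as columns.
m <- data.frame(c(1, 0, 2), c(1, 0, 0), c(1, 1, 0),
                row.names = c("a", "b", "c"))
colnames(m) <- c("R1", "R2", "R3")

# t() returns a matrix with row and column names swapped;
# as.data.frame() converts it back to a data frame.
a <- as.data.frame(t(m))
a
```

This relies on all columns of m being numeric. With mixed column types, t() would coerce everything to character.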

【Solution 2】:

An alternative using spread from tidyr:

library(tidyr)

spread(dt, SAMPLE, count, fill=0)
#   junction R1 R2 R3
#1:        a  1  1  1
#2:        b  0  0  1
#3:        c  2  0  0

Or the old-school solution using reshape from stats:

reshape(dt, timevar='SAMPLE', idvar=c('junction'), direction='wide')
#   junction count.R1 count.R2 count.R3
#1:        a        1        1        1
#2:        b       NA       NA        1
#3:        c        2       NA       NA
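
The reshape output above leaves missing (SAMPLE, junction) combinations as NA and prefixes the column names with "count.", whereas the question asks for zeros and plain sample names. A minimal base-R sketch of the clean-up, assuming the data is held in a plain data.frame (df and wide are illustrative names):

```r
# Same long-format data as in the question, as a plain data.frame.
df <- data.frame(
  SAMPLE   = c("R1", "R2", "R3", "R3", "R1"),
  junction = c("a", "a", "b", "a", "c"),
  count    = c(1, 1, 1, 1, 2),
  stringsAsFactors = FALSE
)

# Long to wide: one row per junction, one count.<SAMPLE> column per sample.
wide <- reshape(df, timevar = "SAMPLE", idvar = "junction", direction = "wide")

# Replace the NAs produced for absent combinations with 0.
wide[is.na(wide)] <- 0

# Drop the "count." prefix that reshape adds to the value columns.
names(wide) <- sub("^count\\.", "", names(wide))
wide
```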

Data:

library(data.table)
# Rebuilt with the data.table constructor: the dput() output contained a
# session-specific <pointer: ...> that R cannot parse.
dt <- data.table(
  SAMPLE   = c("R1", "R2", "R3", "R3", "R1"),
  junction = c("a", "a", "b", "a", "c"),
  count    = c(1, 1, 1, 1, 2)
)

【Comments】: