【Posted】: 2015-02-03 10:17:33
【Question】:
I have essentially the same problem as in "strsplit one column with exact information into two column".
That question has already been solved, but my data looks like this:
      SNP Geno AlleleA AlleleB AlleleC AlleleD AlleleE
1 marker1   G1      AA      AA      AA      AA      AA
2 marker2   G1      TT      TT      TT      TT      TT
3 marker3   G1      TT      TT      TT      TT      TT
4 marker1   G2      CC      CC      CC      CC      CC
5 marker2   G2      AA      AA      AA      AA      AA
6 marker3   G2      TT      TT      TT      TT      TT
7 marker1   G3      GG      GG      GG      GG      GG
8 marker2   G3      AA      AA      AA      AA      AA
9 marker3   G3      TT      TT      TT      TT      TT
dput output:
structure(list(SNP = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 1L,
2L, 3L), .Label = c("marker1", "marker2", "marker3"), class = "factor"),
Geno = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), .Label = c("G1",
"G2", "G3"), class = "factor"), AlleleA = structure(c(1L,
4L, 4L, 2L, 1L, 4L, 3L, 1L, 4L), .Label = c("AA", "CC", "GG",
"TT"), class = "factor"), AlleleB = structure(c(1L, 4L, 4L,
2L, 1L, 4L, 3L, 1L, 4L), class = "factor", .Label = c("AA",
"CC", "GG", "TT")), AlleleC = structure(c(1L, 4L, 4L, 2L,
1L, 4L, 3L, 1L, 4L), class = "factor", .Label = c("AA", "CC",
"GG", "TT")), AlleleD = structure(c(1L, 4L, 4L, 2L, 1L, 4L,
3L, 1L, 4L), class = "factor", .Label = c("AA", "CC", "GG",
"TT")), AlleleE = structure(c(1L, 4L, 4L, 2L, 1L, 4L, 3L,
1L, 4L), class = "factor", .Label = c("AA", "CC", "GG", "TT"
))), .Names = c("SNP", "Geno", "AlleleA", "AlleleB", "AlleleC",
"AlleleD", "AlleleE"), row.names = c(NA, -9L), class = "data.frame")
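In that question, the asker had only one column to split into two. My problem is that I have 5000 columns (AlleleA, AlleleB, ... etc.) to split, each column into two.
I tried using a loop like this, but it doesn't work: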
for(i in colnames(dat)){
dat1 <- data.frame(do.call(rbind, strsplit(as.vector(sprintf("dat$%s",i)), split = "")))
}
I look forward to your guidance. Thank you.
【Comments】:
-
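How do you want to split the columns? (Each column into exactly two columns? How is the split defined?) tidyr has a separate function that splits one column into multiple columns, and you could apply it to every column you want to split using, for example, dplyr's mutate_each function.
-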
@beginneR I have edited my question.
-
@beginneR It works using splitstackshape :) Thanks to Ananda Mahto.
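
For reference, the loop in the question fails for two reasons: sprintf("dat$%s", i) only builds the text "dat$AlleleA" rather than fetching the column, and dat1 is overwritten on every iteration. The following is a minimal base R sketch of one possible fix, not the splitstackshape solution mentioned in the comments. It assumes every allele column holds exactly two characters and that SNP and Geno are the only non-allele columns. The small example data and the names allele_cols, split_two, and the _1/_2 suffixes are illustrative and do not come from the original thread.

```r
# Small example in the shape of the question's data
# (assumption: every genotype is exactly two characters)
dat <- data.frame(
  SNP     = c("marker1", "marker2", "marker3"),
  Geno    = c("G1", "G1", "G2"),
  AlleleA = c("AA", "TT", "CC"),
  AlleleB = c("AG", "TT", "CT"),
  stringsAsFactors = TRUE
)

# Columns to split: everything except the identifier columns
allele_cols <- setdiff(colnames(dat), c("SNP", "Geno"))

# Split one column into its first and second character.
# dat[[col]] fetches the column itself, whereas
# sprintf("dat$%s", col) would only build a string.
split_two <- function(col) {
  x <- as.character(dat[[col]])  # factor -> character
  out <- data.frame(substr(x, 1, 1), substr(x, 2, 2),
                    stringsAsFactors = FALSE)
  names(out) <- paste0(col, c("_1", "_2"))
  out
}

# Build one two-column data frame per allele column,
# then bind them all next to the identifier columns
dat1 <- cbind(dat[c("SNP", "Geno")],
              do.call(cbind, lapply(allele_cols, split_two)))
```

Because the columns are selected by name with setdiff, the same code should apply unchanged to a data frame with thousands of allele columns, although a package such as splitstackshape may be faster at that scale.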