【Question title】:Splitting a matrix of strings into a numeric matrix in R
【Posted】:2021-06-11 19:49:19
【Question】:

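I have a (large) matrix of strings, each made up of two numbers. I want to convert it into a numeric matrix in which each string is split into separate columns, while preserving the row order of the original matrix and the order of the values within each string. I am doing this in R, where I am mostly a novice/self-taught. Example input and desired output are below.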

input
test <- matrix[1:5,1:5]
test
     col1    col2   col3   col4   col5
row1 "0,0"   "0,0"  "0,0"  "0,0"  "0,1"            
row2 "0,0"   "0,0"  "0,0"  "0,0"  "0,0"            
row3 "0,0"   "0,0"  "0,2"  "0,0"  "0,0"            
row4 "0,0"   "0,0"  "0,2"  "0,0"  "0,0"            
row5 "0,0"   "0,0"  "0,0"  "0,0"  "1,0"    

desired output
      col1 col1.1 col2  col2.1 col3  col3.1 col4  col4.1 col5  col5.1
row1  0    0      0     0      0     0      0     0      0     1            
row2  0    0      0     0      0     0      0     0      0     0            
row3  0    0      0     0      0     2      0     0      0     0            
row4  0    0      0     0      0     2      0     0      0     0            
row5  0    0      0     0      0     0      0     0      1     0    

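So far I have tried strsplit with lapply/unlist. I can produce a matrix, but it does not preserve the structure of the original matrix, and I need the original structure for downstream applications.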

bad output
> matrix(as.numeric(unlist(strsplit(test,","))),nrow=nrow(test))
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    0    0    0    0    0    2    0    0    0     0
[2,]    0    0    0    0    0    0    0    0    1     0
[3,]    0    0    0    0    0    2    0    0    0     0
[4,]    0    0    0    0    0    0    0    0    0     1
[5,]    0    0    0    0    0    0    0    0    0     0

【Comments】:

    Tags: r string matrix


    【Solution 1】:

    Here is a base R option that reads each column with read.csv (the input here is a data.frame)

    do.call(cbind, lapply(test, function(x) read.csv(text = x, header = FALSE)))
    

    -output

      col1.V1 col1.V2 col2.V1 col2.V2 col3.V1 col3.V2 col4.V1 col4.V2 col5.V1 col5.V2
    1       0       0       0       0       0       0       0       0       0       1
    2       0       0       0       0       0       0       0       0       0       0
    3       0       0       0       0       0       2       0       0       0       0
    4       0       0       0       0       0       2       0       0       0       0
    5       0       0       0       0       0       0       0       0       1       0
    

    If the input is a matrix, use apply with MARGIN = 2 to loop over the columns and read each one with read.csv

    do.call(cbind, apply(as.matrix(test), 2, function(x) 
           read.csv(text = x, header = FALSE)))
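
The result of the call above is a data.frame with default row names and `colN.V1`/`colN.V2` column labels. If a numeric matrix with the original row names and the `col1`, `col1.1`, ... labels from the question is needed, one possible follow-up is sketched below. It assumes every cell holds exactly two comma-separated numbers; the names `parts` and `res` are only illustrative.

```r
# Rebuild the 5 x 5 character matrix from the question.
test <- matrix("0,0", nrow = 5, ncol = 5,
               dimnames = list(paste0("row", 1:5), paste0("col", 1:5)))
test[c("row3", "row4"), "col3"] <- "0,2"
test["row1", "col5"] <- "0,1"
test["row5", "col5"] <- "1,0"

# Parse each column with read.csv; apply() returns a named list of
# two-column data.frames because the function result is not atomic.
parts <- apply(test, 2, function(x) read.csv(text = x, header = FALSE))

# Bind the pieces side by side and drop down to a plain numeric matrix.
res <- as.matrix(do.call(cbind, parts))

# Restore the original row names and build col1, col1.1, col2, col2.1, ...
rownames(res) <- rownames(test)
colnames(res) <- make.unique(rep(colnames(test), each = 2))

res
```

`make.unique()` is used here only because it happens to produce the `.1` suffix shown in the desired output; any other naming scheme can be assigned the same way.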
    

    Data

    test <- structure(list(col1 = c("0,0", "0,0", "0,0", "0,0", "0,0"), 
          col2 = c("0,0", 
    "0,0", "0,0", "0,0", "0,0"), col3 = c("0,0", "0,0", "0,2", "0,2", 
    "0,0"), col4 = c("0,0", "0,0", "0,0", "0,0", "0,0"), col5 = c("0,1", 
    "0,0", "0,0", "0,0", "1,0")), class = "data.frame", row.names = c("row1", 
    "row2", "row3", "row4", "row5"))
    

    Note: here the initial input data is assumed to be a data.frame.

    【Comments】:

      【Solution 2】:

      Another base R option using read.csv

      > read.csv(text = do.call(paste,c(asplit(test,2),sep = ",")), header = FALSE)
        V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
      1  0  0  0  0  0  0  0  0  0   1
      2  0  0  0  0  0  0  0  0  0   0
      3  0  0  0  0  0  2  0  0  0   0
      4  0  0  0  0  0  2  0  0  0   0
      5  0  0  0  0  0  0  0  0  1   0
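
The one-liner above works because `paste` joins the columns row by row before anything is parsed. The same row-wise idea can also repair the `strsplit` attempt from the question: `strsplit` walks a matrix in column-major order, so the values come out column by column and a plain `matrix(..., nrow = ...)` reshape scrambles them. A minimal sketch, assuming every cell holds exactly two comma-separated numbers (the names `vals` and `res` are only illustrative):

```r
# Rebuild the 5 x 5 character matrix from the question.
test <- matrix("0,0", nrow = 5, ncol = 5,
               dimnames = list(paste0("row", 1:5), paste0("col", 1:5)))
test[c("row3", "row4"), "col3"] <- "0,2"
test["row1", "col5"] <- "0,1"
test["row5", "col5"] <- "1,0"

# Transpose first so that the column-major traversal of t(test)
# runs along each original row of test.
vals <- as.numeric(unlist(strsplit(t(test), ",")))

# Each original row now contributes 2 * ncol(test) consecutive values,
# so the result has to be filled row by row.
res <- matrix(vals, nrow = nrow(test), byrow = TRUE,
              dimnames = list(rownames(test),
                              make.unique(rep(colnames(test), each = 2))))

res
```

This variant keeps the row names and returns a numeric matrix directly, at the cost of relying on every cell splitting into the same number of pieces.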
      

      Data

      > dput(test)
      structure(c("0,0", "0,0", "0,0", "0,0", "0,0", "0,0", "0,0", 
      "0,0", "0,0", "0,0", "0,0", "0,0", "0,2", "0,2", "0,0", "0,0",
      "0,0", "0,0", "0,0", "0,0", "0,1", "0,0", "0,0", "0,0", "1,0"
      ), .Dim = c(5L, 5L), .Dimnames = list(c("row1", "row2", "row3",
      "row4", "row5"), c("col1", "col2", "col3", "col4", "col5")))
      

      【Comments】:
