【Question title】: Entering data into multiple dataframes
【Posted】: 2020-02-16 17:37:27
【Question description】:

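I have two loops: the first goes through specific proteins, and the second goes through specific cell types. I also have 16 tables (each named after the protein, followed by "_table"). I can get the data into the correct row if I specify the cell type, but if I try `paste0(temp_TSPAN, "_table")`, I get the error `incorrect number of subscripts on matrix`.

Any ideas on how I can get my loop to target the correct table?

Here is the second loop, which puts the data into the tables: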

    temp_cell <- xCell_cells[i]
    print(temp_TSPAN)
    print(temp_cell)
    temp_means <- c(mean(xCell_Lum_A_Q1[,temp_cell]),mean(xCell_Lum_A_Q2[,temp_cell]),mean(xCell_Lum_A_Q3[,temp_cell]),mean(xCell_Lum_A_Q4[,temp_cell]),
                    mean(xCell_Lum_B_Q1[,temp_cell]),mean(xCell_Lum_B_Q2[,temp_cell]),mean(xCell_Lum_B_Q3[,temp_cell]),mean(xCell_Lum_B_Q4[,temp_cell]),
                    mean(xCell_Her_2_Q1[,temp_cell]),mean(xCell_Her_2_Q2[,temp_cell]),mean(xCell_Her_2_Q3[,temp_cell]),mean(xCell_Her_2_Q4[,temp_cell]),
                    mean(xCell_Basal_Q1[,temp_cell]),mean(xCell_Basal_Q2[,temp_cell]),mean(xCell_Basal_Q3[,temp_cell]),mean(xCell_Basal_Q4[,temp_cell]),
                    mean(xCell_Normal_Q1[,temp_cell]),mean(xCell_Normal_Q2[,temp_cell]),mean(xCell_Normal_Q3[,temp_cell]),mean(xCell_Normal_Q4[,temp_cell]))

    print(temp_means)
    paste0(temp_TSPAN, "_table")[temp_cell,] <- temp_means
    temp_means <- c()
  }

Full code

library(dplyr)
library(RColorBrewer)
RNA_seq <- read.table("RNASeq2Norm_expr_BCRA.txt", stringsAsFactors = F)
xCell <- read.table("..../xCell_ES_RNAseq.txt", stringsAsFactors = F)
PAM50 <- read.table("..../PAM50_subtypes.txt", stringsAsFactors = F)
TSPANS <- read.table("..../TSPANS.txt", stringsAsFactors = F)
len_TSPAN <- length(TSPANS$V1)
col <- brewer.pal(4, "Pastel1")


xCell_cells <- rownames(xCell)
#Create table for quartile means to be entered
for (i in seq(1,len_TSPAN)){
  temp_TSPAN <- TSPANS$V1[i]
  print(temp_TSPAN)
  assign(paste0(temp_TSPAN, "_table"), value = data.frame(Lum_A_Q1_means = rep(NA, 67), Lum_A_Q2_means = rep(NA,67),
                                                      Lum_A_Q3_means = rep(NA, 67), Lum_A_Q4_means = rep(NA,67),
                                                      Lum_B_Q1_means = rep(NA, 67), Lum_B_Q2_means = rep(NA,67),
                                                      Lum_B_Q3_means = rep(NA, 67), Lum_B_Q4_means = rep(NA,67),
                                                      Her_2_Q1_means = rep(NA, 67), Her_2_Q2_means = rep(NA,67),
                                                      Her_2_Q3_means = rep(NA, 67), Her_2_Q4_means = rep(NA,67),
                                                      Basal_Q1_means = rep(NA, 67), Basal_Q2_means = rep(NA,67),
                                                      Basal_Q3_means = rep(NA, 67), Basal_Q4_means = rep(NA,67),
                                                      Normal_Q1_means = rep(NA, 67), Normal_Q2_means = rep(NA,67),
                                                      Normal_Q3_means = rep(NA, 67), Normal_Q4_means = rep(NA,67),
                                                      row.names = xCell_cells))
  }

temp_TSPAN <- c()
temp_cell <- c()
#Determine which samples belong to each quartile
for (T in seq(1,len_TSPAN)) {
  temp_TSPAN <- TSPANS$V1[T]
  print(temp_TSPAN)
  Lum_A <- RNA_seq[temp_TSPAN, PAM50$subtype == "LumA"]
  Lum_A_Quartiles <- quantile(Lum_A[temp_TSPAN,])
  Q1_Lum_A <- Lum_A[,(Lum_A[temp_TSPAN,]) <= Lum_A_Quartiles$`25%`]
  Q2_Lum_A <- Lum_A[,(Lum_A[temp_TSPAN,]) > Lum_A_Quartiles$`25%`]
  Q2_Lum_A <- Q2_Lum_A[,(Q2_Lum_A[temp_TSPAN,]) <= Lum_A_Quartiles$`50%`]
  Q3_Lum_A <- Lum_A[,(Lum_A[temp_TSPAN,]) > Lum_A_Quartiles$`50%`]
  Q3_Lum_A <- Q3_Lum_A[,(Q3_Lum_A[temp_TSPAN,]) <= Lum_A_Quartiles$`75%`]
  Q4_Lum_A <- Lum_A[,(Lum_A[temp_TSPAN,]) > Lum_A_Quartiles$`75%`]

  Lum_B <- RNA_seq[temp_TSPAN, PAM50$subtype == "LumB"]
  Lum_B_Quartiles <- quantile(Lum_B[temp_TSPAN,])
  Q1_Lum_B <- Lum_B[,(Lum_B[temp_TSPAN,]) <= Lum_B_Quartiles$`25%`]
  Q2_Lum_B <- Lum_B[,(Lum_B[temp_TSPAN,]) > Lum_B_Quartiles$`25%`]
  Q2_Lum_B <- Q2_Lum_B[,(Q2_Lum_B[temp_TSPAN,]) <= Lum_B_Quartiles$`50%`]
  Q3_Lum_B <- Lum_B[,(Lum_B[temp_TSPAN,]) > Lum_B_Quartiles$`50%`]
  Q3_Lum_B <- Q3_Lum_B[,(Q3_Lum_B[temp_TSPAN,]) <= Lum_B_Quartiles$`75%`]
  Q4_Lum_B <- Lum_B[,(Lum_B[temp_TSPAN,]) > Lum_B_Quartiles$`75%`]

  Her_2 <- RNA_seq[temp_TSPAN, PAM50$subtype == "Her2"]
  Her_2_Quartiles <- quantile(Her_2[temp_TSPAN,])
  Q1_Her_2 <- Her_2[,(Her_2[temp_TSPAN,]) <= Her_2_Quartiles$`25%`]
  Q2_Her_2 <- Her_2[,(Her_2[temp_TSPAN,]) > Her_2_Quartiles$`25%`]
  Q2_Her_2 <- Q2_Her_2[,(Q2_Her_2[temp_TSPAN,]) <= Her_2_Quartiles$`50%`]
  Q3_Her_2 <- Her_2[,(Her_2[temp_TSPAN,]) > Her_2_Quartiles$`50%`]
  Q3_Her_2 <- Q3_Her_2[,(Q3_Her_2[temp_TSPAN,]) <= Her_2_Quartiles$`75%`]
  Q4_Her_2 <- Her_2[,(Her_2[temp_TSPAN,]) > Her_2_Quartiles$`75%`]

  Basal <- RNA_seq[temp_TSPAN, PAM50$subtype == "Basal"]
  Basal_Quartiles <- quantile(Basal[temp_TSPAN,])
  Q1_Basal <- Basal[,(Basal[temp_TSPAN,]) <= Basal_Quartiles$`25%`]
  Q2_Basal <- Basal[,(Basal[temp_TSPAN,]) > Basal_Quartiles$`25%`]
  Q2_Basal <- Q2_Basal[,(Q2_Basal[temp_TSPAN,]) <= Basal_Quartiles$`50%`]
  Q3_Basal <- Basal[,(Basal[temp_TSPAN,]) > Basal_Quartiles$`50%`]
  Q3_Basal <- Q3_Basal[,(Q3_Basal[temp_TSPAN,]) <= Basal_Quartiles$`75%`]
  Q4_Basal <- Basal[,(Basal[temp_TSPAN,]) > Basal_Quartiles$`75%`]

  Normal <- RNA_seq[temp_TSPAN, PAM50$subtype == "Normal"]
  Normal_Quartiles <- quantile(Normal[temp_TSPAN,])
  Q1_Normal <- Normal[,(Normal[temp_TSPAN,]) <= Normal_Quartiles$`25%`]
  Q2_Normal <- Normal[,(Normal[temp_TSPAN,]) > Normal_Quartiles$`25%`]
  Q2_Normal <- Q2_Normal[,(Q2_Normal[temp_TSPAN,]) <= Normal_Quartiles$`50%`]
  Q3_Normal <- Normal[,(Normal[temp_TSPAN,]) > Normal_Quartiles$`50%`]
  Q3_Normal <- Q3_Normal[,(Q3_Normal[temp_TSPAN,]) <= Normal_Quartiles$`75%`]
  Q4_Normal <- Normal[,(Normal[temp_TSPAN,]) > Normal_Quartiles$`75%`]

  Lum_A_Q1_samples <- colnames(Q1_Lum_A)
  Lum_A_Q2_samples <- colnames(Q2_Lum_A)
  Lum_A_Q3_samples <- colnames(Q3_Lum_A)
  Lum_A_Q4_samples <- colnames(Q4_Lum_A)

  Lum_B_Q1_samples <- colnames(Q1_Lum_B)
  Lum_B_Q2_samples <- colnames(Q2_Lum_B)
  Lum_B_Q3_samples <- colnames(Q3_Lum_B)
  Lum_B_Q4_samples <- colnames(Q4_Lum_B)

  Her_2_Q1_samples <- colnames(Q1_Her_2)
  Her_2_Q2_samples <- colnames(Q2_Her_2)
  Her_2_Q3_samples <- colnames(Q3_Her_2)
  Her_2_Q4_samples <- colnames(Q4_Her_2)

  Basal_Q1_samples <- colnames(Q1_Basal)
  Basal_Q2_samples <- colnames(Q2_Basal)
  Basal_Q3_samples <- colnames(Q3_Basal)
  Basal_Q4_samples <- colnames(Q4_Basal)

  Normal_Q1_samples <- colnames(Q1_Normal)
  Normal_Q2_samples <- colnames(Q2_Normal)
  Normal_Q3_samples <- colnames(Q3_Normal)
  Normal_Q4_samples <- colnames(Q4_Normal)

  #Finding enrichment scores for the samples in each quartile
  xCell_Lum_A_Q1 <- t(xCell[,Lum_A_Q1_samples])
  xCell_Lum_A_Q2 <- t(xCell[,Lum_A_Q2_samples])
  xCell_Lum_A_Q3 <- t(xCell[,Lum_A_Q3_samples])
  xCell_Lum_A_Q4 <- t(xCell[,Lum_A_Q4_samples])

  xCell_Lum_B_Q1 <- t(xCell[,Lum_B_Q1_samples])
  xCell_Lum_B_Q2 <- t(xCell[,Lum_B_Q2_samples])
  xCell_Lum_B_Q3 <- t(xCell[,Lum_B_Q3_samples])
  xCell_Lum_B_Q4 <- t(xCell[,Lum_B_Q4_samples])

  xCell_Her_2_Q1 <- t(xCell[,Her_2_Q1_samples])
  xCell_Her_2_Q2 <- t(xCell[,Her_2_Q2_samples])
  xCell_Her_2_Q3 <- t(xCell[,Her_2_Q3_samples])
  xCell_Her_2_Q4 <- t(xCell[,Her_2_Q4_samples])

  xCell_Basal_Q1 <- t(xCell[,Basal_Q1_samples])
  xCell_Basal_Q2 <- t(xCell[,Basal_Q2_samples])
  xCell_Basal_Q3 <- t(xCell[,Basal_Q3_samples])
  xCell_Basal_Q4 <- t(xCell[,Basal_Q4_samples])

  xCell_Normal_Q1 <- t(xCell[,Normal_Q1_samples])
  xCell_Normal_Q2 <- t(xCell[,Normal_Q2_samples])
  xCell_Normal_Q3 <- t(xCell[,Normal_Q3_samples])
  xCell_Normal_Q4 <- t(xCell[,Normal_Q4_samples])

  len_xCell <- length(xCell_cells)
  temp_means <- c()
  for (i in seq(1, len_xCell)){
    temp_cell <- xCell_cells[i]
    print(temp_TSPAN)
    print(temp_cell)
    temp_means <- c(mean(xCell_Lum_A_Q1[,temp_cell]),mean(xCell_Lum_A_Q2[,temp_cell]),mean(xCell_Lum_A_Q3[,temp_cell]),mean(xCell_Lum_A_Q4[,temp_cell]),
                    mean(xCell_Lum_B_Q1[,temp_cell]),mean(xCell_Lum_B_Q2[,temp_cell]),mean(xCell_Lum_B_Q3[,temp_cell]),mean(xCell_Lum_B_Q4[,temp_cell]),
                    mean(xCell_Her_2_Q1[,temp_cell]),mean(xCell_Her_2_Q2[,temp_cell]),mean(xCell_Her_2_Q3[,temp_cell]),mean(xCell_Her_2_Q4[,temp_cell]),
                    mean(xCell_Basal_Q1[,temp_cell]),mean(xCell_Basal_Q2[,temp_cell]),mean(xCell_Basal_Q3[,temp_cell]),mean(xCell_Basal_Q4[,temp_cell]),
                    mean(xCell_Normal_Q1[,temp_cell]),mean(xCell_Normal_Q2[,temp_cell]),mean(xCell_Normal_Q3[,temp_cell]),mean(xCell_Normal_Q4[,temp_cell]))

    print(temp_means)
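    nm1 <- paste0(temp_TSPAN, "_table")  # name of the table for this protein
    tbl <- get(nm1)                      # fetch the data frame by name
    tbl[temp_cell,] <- temp_means        # fill the row for this cell type
    assign(nm1, tbl)                     # store the updated table under the same name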
    temp_means <- c()
  }
}

Sample data

> dput(head(TSPANS))
structure(list(V1 = c("TSPAN1", "TSPAN3", "TSPAN4", "TSPAN6", 
"TSPAN8", "TSPAN9")), row.names = c(NA, 6L), class = "data.frame")

> dput(head(PAM50))
structure(list(Sample_ID = c("TCGA.3C.AAAU.01A.11R.A41B.07", 
"TCGA.3C.AALI.01A.11R.A41B.07", "TCGA.3C.AALJ.01A.31R.A41B.07", 
"TCGA.3C.AALK.01A.11R.A41B.07", "TCGA.4H.AAAK.01A.12R.A41B.07", 
"TCGA.5L.AAT0.01A.12R.A41B.07"), subtype = c("LumB", "Her2", 
"LumB", "Her2", "LumB", "LumA")), row.names = c("1", "2", "3", 
"4", "5", "6"), class = "data.frame")

> dput(xCell[1:5, 1:5])
structure(list(TCGA.3C.AAAU.01A.11R.A41B.07 = c(0.0182278777214451, 
0, 0, 0.00312390016077943, 0.136068543973221), TCGA.3C.AALI.01A.11R.A41B.07 = c(0.282595778602895, 
0, 0.0600603500818251, 0.0589537608635649, 0.205506668589802), 
    TCGA.3C.AALJ.01A.31R.A41B.07 = c(0.18283171431184, 0.0941680866198556, 
    0.146150110122777, 0.0304405814585031, 8.9658687089931e-20
    ), TCGA.3C.AALK.01A.11R.A41B.07 = c(0.134145304728982, 0.032112973032126, 
    0.154386799682783, 0, 4.17812708486922e-20), TCGA.4H.AAAK.01A.12R.A41B.07 = c(0.106111324096064, 
    0.0121130054841642, 0.191944288358642, 0, 0.125099426066817
    )), row.names = c("aDC", "Adipocytes", "Astrocytes", "B-cells", 
"Basophils"), class = "data.frame")

> dput(RNA_seq[1:5, 1:5])
structure(list(TCGA.3C.AAAU.01A.11R.A41B.07 = c(197.0897, 0, 
0, 102.9634, 1.3786), TCGA.3C.AALI.01A.11R.A41B.07 = c(237.3844, 
0, 0, 70.8646, 4.3502), TCGA.3C.AALJ.01A.31R.A41B.07 = c(423.2366, 
0.9066, 0, 161.2602, 0), TCGA.3C.AALK.01A.11R.A41B.07 = c(191.0178, 
0, 0, 62.5072, 1.6549), TCGA.4H.AAAK.01A.12R.A41B.07 = c(268.8809, 
0.4255, 3.8298, 154.3702, 3.4043)), row.names = c("A1BG", "A1CF", 
"A2BP1", "A2LD1", "A2ML1"), class = "data.frame")

> dput(head(TSPAN1_table))
structure(list(Lum_A_Q1_means = c(NA, NA, NA, NA, NA, NA), Lum_A_Q2_means = c(NA, 
NA, NA, NA, NA, NA), Lum_A_Q3_means = c(NA, NA, NA, NA, NA, NA
), Lum_A_Q4_means = c(NA, NA, NA, NA, NA, NA), Lum_B_Q1_means = c(NA, 
NA, NA, NA, NA, NA), Lum_B_Q2_means = c(NA, NA, NA, NA, NA, NA
), Lum_B_Q3_means = c(NA, NA, NA, NA, NA, NA), Lum_B_Q4_means = c(NA, 
NA, NA, NA, NA, NA), Her_2_Q1_means = c(NA, NA, NA, NA, NA, NA
), Her_2_Q2_means = c(NA, NA, NA, NA, NA, NA), Her_2_Q3_means = c(NA, 
NA, NA, NA, NA, NA), Her_2_Q4_means = c(NA, NA, NA, NA, NA, NA
), Basal_Q1_means = c(NA, NA, NA, NA, NA, NA), Basal_Q2_means = c(NA, 
NA, NA, NA, NA, NA), Basal_Q3_means = c(NA, NA, NA, NA, NA, NA
), Basal_Q4_means = c(NA, NA, NA, NA, NA, NA), Normal_Q1_means = c(NA, 
NA, NA, NA, NA, NA), Normal_Q2_means = c(NA, NA, NA, NA, NA, 
NA), Normal_Q3_means = c(NA, NA, NA, NA, NA, NA), Normal_Q4_means = c(NA, 
NA, NA, NA, NA, NA)), row.names = c("aDC", "Adipocytes", "Astrocytes", 
"B-cells", "Basophils", "CD4+ memory T-cells"), class = "data.frame")

【Question comments】:

Tags: r dataframe for-loop


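【Solution 1】:

`paste0()` only builds the table's name as a character string; it does not refer to the data frame itself. We need to use `get` to fetch the value by name and `assign` to store the result back under that name.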

    temp_cell <- xCell_cells[i]
    print(temp_TSPAN)
    print(temp_cell)
    temp_means <- c(mean(xCell_Lum_A_Q1[,temp_cell]),mean(xCell_Lum_A_Q2[,temp_cell]),mean(xCell_Lum_A_Q3[,temp_cell]),mean(xCell_Lum_A_Q4[,temp_cell]),
                    mean(xCell_Lum_B_Q1[,temp_cell]),mean(xCell_Lum_B_Q2[,temp_cell]),mean(xCell_Lum_B_Q3[,temp_cell]),mean(xCell_Lum_B_Q4[,temp_cell]),
                    mean(xCell_Her_2_Q1[,temp_cell]),mean(xCell_Her_2_Q2[,temp_cell]),mean(xCell_Her_2_Q3[,temp_cell]),mean(xCell_Her_2_Q4[,temp_cell]),
                    mean(xCell_Basal_Q1[,temp_cell]),mean(xCell_Basal_Q2[,temp_cell]),mean(xCell_Basal_Q3[,temp_cell]),mean(xCell_Basal_Q4[,temp_cell]),
                    mean(xCell_Normal_Q1[,temp_cell]),mean(xCell_Normal_Q2[,temp_cell]),mean(xCell_Normal_Q3[,temp_cell]),mean(xCell_Normal_Q4[,temp_cell]))

    print(temp_means)
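    nm1 <- paste0(temp_TSPAN, "_table")  # name of the table for this protein
    tbl <- get(nm1)                      # fetch the data frame by name
    tbl[temp_cell,] <- temp_means        # fill the row for this cell type
    assign(nm1, tbl)                     # store the updated table under the same name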

    temp_means <- c()
  }
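To see the `get`/`assign` pattern in isolation, here is a minimal sketch on toy data. The two-column table, the row names, and the values are made up for illustration; only the naming scheme (`<protein>_table`) comes from the question.

```r
# Toy stand-ins for the real objects; names and values are hypothetical.
TSPAN1_table <- data.frame(Q1_means = c(NA_real_, NA_real_),
                           Q2_means = c(NA_real_, NA_real_),
                           row.names = c("aDC", "Basophils"))
temp_TSPAN <- "TSPAN1"
temp_cell  <- "aDC"
temp_means <- c(0.10, 0.25)

nm1 <- paste0(temp_TSPAN, "_table")  # just the character string "TSPAN1_table"
tbl <- get(nm1)                      # look up the data frame that has that name
tbl[temp_cell, ] <- temp_means       # fill one row, one value per column
assign(nm1, tbl)                     # store the modified copy under the same name

print(TSPAN1_table)
```

The row for "aDC" is filled while the "Basophils" row stays `NA`, which is the behaviour the inner loop needs for each cell type.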

The OP may be looking for a simplified version along these lines:

lapply(split(RNA_seq, setNames(PAM50$subtype, PAM50$Sample_ID)[colnames(RNA_seq)]), 
      function(dat) apply(dat, 1, function(x) {
         qnt <- quantile(x)
         data.frame(val = names(x), grp = names(qnt)[findInterval(x, qnt)])
         apply(xCell[, names(x)], 2, function(y) tapply(y, names(x), FUN = mean))
       }))
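As a different route from the code above, one could avoid `get`/`assign` altogether by keeping one table per protein in a named list and assigning quartiles with `cut()`. The following is only a sketch on toy data: `samples`, `expr`, `scores`, and `tables` are hypothetical stand-ins, not objects from the question.

```r
# Toy stand-ins (hypothetical): expression of one gene across eight samples,
# and enrichment scores for two cell types in the same samples.
samples <- paste0("S", 1:8)
expr    <- setNames(c(5, 1, 7, 3, 8, 2, 6, 4), samples)
scores  <- matrix(seq_len(16) / 16, nrow = 2,
                  dimnames = list(c("aDC", "Basophils"), samples))

# Assign each sample to an expression quartile (Q1 = lowest quarter).
qs <- cut(expr, breaks = quantile(expr), include.lowest = TRUE,
          labels = paste0("Q", 1:4))

# Mean score per cell type (rows) within each quartile (columns).
quartile_means <- t(apply(scores, 1, function(y) tapply(y, qs, mean)))

# One table per protein, kept in a named list instead of separate variables.
tables <- list()
tables[["TSPAN1"]] <- quartile_means

print(tables[["TSPAN1"]])
```

With a list, the table for a protein is reached as `tables[[temp_TSPAN]]`, so no name needs to be pasted together and looked up. Note that `cut()` stops with an error when the quantile breaks are not unique, which can happen if many samples share the same expression value, so real data may need extra handling.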

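【Comments】:

  • I get the following error: ```Error: unexpected ',' in: " print(temp_means) nm1 assign(nm1, [<-(get(nm1), get(nm1)[temp_cell,], temp_means) + temp_means [<-(get(nm1), get(nm1)[temp_cell,], temp_means) temp_means" > } Error: unexpected '}' in "}"```
  • @cbkhia I forgot to close the parenthesis `)`. Can you try it now?
  • Also, please show a small reproducible example so that it can be tested.
  • ` + nm1 assign(nm1, [<-(get(nm1), get(nm1)[temp_cell,], temp_means)) Error in get(nm1): object '_table' not found > temp_means } Error: unexpected '}' in "}"`
  • @cbkhia Could you update your post with a small reproducible example? I am not able to test it.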