【Question title】: R using gsub in a loop
【Posted】: 2018-04-13 09:36:52
【Question】:

I have the following vector of column names:

plot_variables <- c("Ser predicted (g/L)", "Ser initial (g/L)", "Ser experimental (g/L)", "Glu predicted (g/L)", "Glu initial (g/L)", "Glu experimental (g/L)", "Pro predicted (g/L)", ...)

I also have a glossary for these short names:

df_glossary <- data.frame(
  short = c("Cys", "Pro", "Phe", "Ser", "Glu", "Glc", ...),
  full = c("Cysteine", "Proline", "Phenylalanine", "Serine", "Glutamate", "Glucose", ...),
  stringsAsFactors = FALSE
)

I want to match the two and end up with something like this:

names_matching <- data.frame(
variable = c("Ser predicted (g/L)", "Ser initial (g/L)", "Ser experimental (g/L)", ...),
label = c("Serine predicted (g/L)", "Serine initial (g/L)", "Serine experimental (g/L)", ...)
)

Is there a more elegant way to do it than this?

pl<-unlist(plot_variables)

pl<-sapply(1:nrow(df_glossary) , function(x){
    pl<<- gsub(df_glossary$short[x], df_glossary$full[x],  pl, fixed = TRUE)
    })

pl <- pl[,nrow(df_glossary)] %>% data.frame()

names_matching <- cbind(plot_variables %>% data.frame, pl)

【Comments】:

    Tags: r loops gsub


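    【Solution 1】:

    I think what you are looking for is gsubfn, from the gsubfn package. Reading the keys and values from another data frame takes a bit of wrangling, but in general it works like this: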

    > library(gsubfn)
    > gsubfn('Ser|Glu|Pro', 
         list('Ser'='Serine','Glu'='Glutamate','Pro'='Proline'), plot_variables)
    [1] "Serine predicted (g/L)"       "Serine initial (g/L)"        
    [3] "Serine experimental (g/L)"    "Glutamate predicted (g/L)"   
    [5] "Glutamate initial (g/L)"      "Glutamate experimental (g/L)"
    [7] "Proline predicted (g/L)"     
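
    The "wrangling" mentioned above, taking the keys and values from the glossary data frame, could be sketched roughly as follows. This is a base R alternative using regmatches(), not the gsubfn call shown above; the data are copied from the question with the elided "..." entries left out, and the names lookup, pattern, m and labels are only illustrative.

    ```r
    # Data from the question (the elided "..." entries are left out).
    plot_variables <- c("Ser predicted (g/L)", "Ser initial (g/L)",
                        "Ser experimental (g/L)", "Glu predicted (g/L)",
                        "Glu initial (g/L)", "Glu experimental (g/L)",
                        "Pro predicted (g/L)")
    df_glossary <- data.frame(
      short = c("Cys", "Pro", "Phe", "Ser", "Glu", "Glc"),
      full  = c("Cysteine", "Proline", "Phenylalanine", "Serine",
                "Glutamate", "Glucose"),
      stringsAsFactors = FALSE
    )

    # Named lookup vector: names are the short codes, values the full names.
    lookup <- setNames(df_glossary$full, df_glossary$short)

    # One alternation pattern built from the glossary, limited to whole words.
    pattern <- paste0("\\b(", paste(df_glossary$short, collapse = "|"), ")\\b")

    # Locate every match, then swap each matched code for its full name.
    m <- gregexpr(pattern, plot_variables, perl = TRUE)
    labels <- plot_variables
    regmatches(labels, m) <- lapply(regmatches(labels, m),
                                    function(hit) unname(lookup[hit]))

    names_matching <- data.frame(variable = plot_variables, label = labels,
                                 stringsAsFactors = FALSE)
    ```

    Because the pattern is generated from df_glossary$short, adding a row to the glossary is enough to cover a new code. The short codes here contain no regex metacharacters; if yours do, they would need escaping first.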
    

    【Comments】:

      【Solution 2】:

      I'm not sure I understood the question. Does this work?

      library(dplyr)  # provides %>%, rowwise(), do() and mutate()

      df_glossary <- data.frame(
        short = c("Cys", "Pro", "Phe", "Ser", "Glu", "Glc"),
        full = c("Cysteine", "Proline", "Phenylalanine", "Serine", "Glutamate", "Glucose"),
        stringsAsFactors = FALSE
      )
      plot_variables <- c("Ser predicted (g/L)", "Ser initial (g/L)", "Ser experimental (g/L)", "Glu predicted (g/L)", "Glu initial (g/L)", "Glu experimental (g/L)", "Pro predicted (g/L)")
      suffixes <- c("predicted (g/L)", "initial (g/L)", "experimental (g/L)")
      
      df_glossary %>% rowwise %>% 
          do(data.frame(short=.$short, full=.$full, suffix=suffixes )) %>%
          mutate(label=paste(full, suffix))
      
      short full          suffix             label
      Cys   Cysteine      predicted (g/L)    Cysteine predicted (g/L)
      Cys   Cysteine      initial (g/L)      Cysteine initial (g/L)
      Cys   Cysteine      experimental (g/L) Cysteine experimental (g/L)
      Pro   Proline       predicted (g/L)    Proline predicted (g/L)
      Pro   Proline       initial (g/L)      Proline initial (g/L)
      Pro   Proline       experimental (g/L) Proline experimental (g/L)
      Phe   Phenylalanine predicted (g/L)    Phenylalanine predicted (g/L)
      Phe   Phenylalanine initial (g/L)      Phenylalanine initial (g/L)
      Phe   Phenylalanine experimental (g/L) Phenylalanine experimental (g/L)
      Ser   Serine        predicted (g/L)    Serine predicted (g/L)
      Ser   Serine        initial (g/L)      Serine initial (g/L)
      Ser   Serine        experimental (g/L) Serine experimental (g/L)
      Glu   Glutamate     predicted (g/L)    Glutamate predicted (g/L)
      Glu   Glutamate     initial (g/L)      Glutamate initial (g/L)
      Glu   Glutamate     experimental (g/L) Glutamate experimental (g/L)
      Glc   Glucose       predicted (g/L)    Glucose predicted (g/L)
      Glc   Glucose       initial (g/L)      Glucose initial (g/L)
      Glc   Glucose       experimental (g/L) Glucose experimental (g/L)
      

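      【Comments】:

      • Not quite. Your suggestion creates every possible combination, whereas I only want to rename the existing columns, e.g. from Ser predicted to Serine predicted, and so on.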
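
      For the narrower goal in this comment (relabel only the names that already exist), one possible sketch in base R is to look up the leading short code with match(). It assumes every name starts with the short code followed by a space, which holds for the examples in the question but may not hold in general; the helper names code, rest and full_name are illustrative.

      ```r
      # Data from the question (the elided "..." entries are left out).
      plot_variables <- c("Ser predicted (g/L)", "Ser initial (g/L)",
                          "Ser experimental (g/L)", "Glu predicted (g/L)",
                          "Glu initial (g/L)", "Glu experimental (g/L)",
                          "Pro predicted (g/L)")
      df_glossary <- data.frame(
        short = c("Cys", "Pro", "Phe", "Ser", "Glu", "Glc"),
        full  = c("Cysteine", "Proline", "Phenylalanine", "Serine",
                  "Glutamate", "Glucose"),
        stringsAsFactors = FALSE
      )

      # Assumption: the short code is everything before the first space.
      code <- sub(" .*$", "", plot_variables)
      rest <- sub("^[^ ]+", "", plot_variables)  # remainder, leading space kept

      # Look each code up in the glossary; keep the code itself if it is unknown.
      full_name <- df_glossary$full[match(code, df_glossary$short)]
      full_name <- ifelse(is.na(full_name), code, full_name)

      names_matching <- data.frame(
        variable = plot_variables,
        label    = paste0(full_name, rest),
        stringsAsFactors = FALSE
      )
      ```

      Only the names present in plot_variables appear in the result, so glossary entries such as Cys or Glc that have no matching column are simply ignored.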