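【Question Title】:How to extract the remaining substring of a string [duplicate]
【Posted】:2021-05-10 14:12:30
【Question】:

Column "a" holds complete words, and column "b" holds a substring of each word. How can I find the remaining part of each word? The desired result should look like the "column_to_extract" column.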

a <- c("clean", "player", "rubbish", "lock")
b <- c("ean", "er", "bbish", "ck")

df1 <- data.frame(a,b)


The result should be:

        a     b column_to_extract
1   clean   ean                cl
2  player    er              play
3 rubbish bbish                ru
4    lock    ck                lo

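【Comments】:

    Tags: r string dplyr


    【Solution 1】:

    You can use str_remove from stringr, which is vectorized over both the string and the pattern. It is equivalent to calling str_replace with an empty string ("") as the replacement.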

    library(dplyr)
    library(stringr)
    
    df1 %>%
      mutate(column_to_extract = str_remove(a, b), 
             column_to_extract2 = str_replace(a, b, ""))
    
    #        a     b column_to_extract column_to_extract2
    #1   clean   ean                cl                 cl
    #2  player    er              play               play
    #3 rubbish bbish                ru                 ru
    #4    lock    ck                lo                 lo
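
One caveat worth sketching: str_remove and str_replace treat the pattern as a regular expression by default, so a substring containing characters such as "." or "$" may not be removed literally. Wrapping the pattern in stringr's fixed() asks for a literal match instead. The sample vectors below are made up for illustration and are not part of the original question.

```r
library(stringr)

# Hypothetical data: two of the substrings are regex metacharacters
a <- c("clean", "a.b", "price$")
b <- c("ean", ".", "$")

# Default: b is interpreted as a regular expression
regex_result <- str_remove(a, b)
regex_result
# [1] "cl"     ".b"     "price$"

# fixed(): b is matched literally, character for character
literal_result <- str_remove(a, fixed(b))
literal_result
# [1] "cl"    "ab"    "price"
```

For the data in the question both forms give the same result, since none of the substrings contain metacharacters.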
    

    【Comments】:

    • Oops! I just posted a similar line :)
    【Solution 2】:
    library(tidyverse)
    df1 %>% mutate(column_to_extract = str_replace(a, b, ""))
    #        a     b column_to_extract
    #1   clean   ean                cl
    #2  player    er              play
    #3 rubbish bbish                ru
    #4    lock    ck                lo
    

    【Comments】:

      【Solution 3】:

      A base R option is:

      # Split each row's strings into vectors of single characters
      chars <- apply(df1, 1, strsplit, split = "")
      
      # Keep the characters of a that do not occur in b, then paste them back together
      cbind(df1, "column_to_extract" = sapply(chars, function(x){
        paste(x[[1]][!(x[[1]] %in% x[[2]])], collapse = "")
      }))
      
      # Yields
      #        a     b column_to_extract
      #1   clean   ean                cl
      #2  player    er              play
      #3 rubbish bbish                ru
      #4    lock    ck                lo
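
Note that the %in% approach drops every character of a that appears anywhere in b, so it only matches a true substring removal when the remaining part shares no letters with b. As a hedged alternative (a sketch, not the answerer's method), sub() with fixed = TRUE removes the first literal occurrence of the substring. The extra "banana"/"na" pair is a made-up example added to show the difference.

```r
# Data from the question plus one hypothetical pair with repeated letters
a <- c("clean", "player", "rubbish", "lock", "banana")
b <- c("ean", "er", "bbish", "ck", "na")

# sub() is not vectorized over `pattern`, so pair the elements with mapply()
remaining <- unname(mapply(
  function(word, part) sub(part, "", word, fixed = TRUE),
  a, b
))
remaining
# [1] "cl"   "play" "ru"   "lo"   "bana"

# The character-set approach gives "b" for the last pair instead
chars_a <- strsplit("banana", "")[[1]]
chars_b <- strsplit("na", "")[[1]]
set_based <- paste(chars_a[!(chars_a %in% chars_b)], collapse = "")
set_based
# [1] "b"
```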
      

      【Comments】:

      • Love your username :)