【Question title】: Search in character string with list of strings and return match
【Posted】: 2018-10-15 22:43:03
【Question】:

In R, I want to do what the title says: search a character column and return the word that matches.

As.data.frame(
    c("yellow carrot","big car","green tomato","orange car","fertile goat","red snapper")
    )

and

c("yellow","red","orange","green","blue")

I would like to get back

As.data.frame(
    cbind(
        c("yellow carrot","big car","green tomato","orange car","fertile goat","red snapper"),
        c("yellow","NA","green","orange","NA","red")
        )

【Comments】:

  • What's with the odd capitalization? R is case-sensitive, so this is not valid R code.
  • Try stringr::str_extract(df1[[1]], paste(vec1, collapse="|"))
  • Sorry, I wrote this on my phone.

Tags: r string dataframe character


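【Solution 1】:

We can use str_extract to get the matching substring. The search terms are pasted together with "|" so that they form a single regular expression that matches any one of them.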

library(stringr)
df1$new <- str_extract(df1[[1]], paste(vec1, collapse="|")) 
df1$new
#[1] "yellow" NA       "green"  "orange" NA       "red"   

Data

vec1 <- c("yellow","red","orange","green","blue")
df1 <- data.frame(col1 = c("yellow carrot","big car",
  "green tomato","orange car","fertile goat","red snapper"))
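
One thing to keep in mind with the alternation pattern above is that it matches substrings, so "red" would also be found inside a word such as "bored". The following is a minimal base R sketch of the same idea with word boundaries added. The input vector `x` and the "bored goat" entry are made up for illustration and are not part of the original data.

```r
vec1 <- c("yellow", "red", "orange", "green", "blue")

# Hypothetical input; "bored goat" is included to show the substring pitfall
x <- c("yellow carrot", "big car", "bored goat", "red snapper")

# Wrap the alternation in \b so that only whole words match
pattern <- paste0("\\b(", paste(vec1, collapse = "|"), ")\\b")

# regexpr() returns -1 where nothing matches;
# regmatches() returns the matched text only for the elements that did match
m <- regexpr(pattern, x, perl = TRUE)
new <- rep(NA_character_, length(x))
new[m != -1] <- regmatches(x, m)
new
```

The same pattern string can be passed to str_extract if stringr is preferred.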

【Discussion】:

    【Solution 2】:

    Using dplyr with nested ifelse statements. This also works when the color is not at the start of the string.

    library(dplyr)

    data.frame(
        vary_1 = c(
            "yellow carrot",
            "big car",
            "green tomato",
            "orange car",
            "fertile goat",
            "red snapper"
        )
    ) %>%
        # Nested ifelse(): the first color that matches wins, NA if none match
        mutate(new = ifelse(grepl('yellow', vary_1), 'yellow',
            ifelse(grepl('green', vary_1), 'green',
                ifelse(grepl('red', vary_1), 'red',
                    ifelse(grepl('orange', vary_1), 'orange',
                        NA
            )))))
    
             vary_1    new
    1 yellow carrot yellow
    2       big car   <NA>
    3  green tomato  green
    4    orange car orange
    5  fertile goat   <NA>
    6   red snapper    red
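
The nested ifelse chain above needs one extra level per color. As a rough sketch of how the same first-match-wins logic could be driven by a vector instead, the loop below walks the colors in reverse so that colors listed earlier overwrite later ones. The names `colors`, `x`, and `new` are chosen here for illustration.

```r
# Colors in priority order, mirroring the order of the nested ifelse() calls
colors <- c("yellow", "green", "red", "orange")

x <- c("yellow carrot", "big car", "green tomato",
       "orange car", "fertile goat", "red snapper")

# Start with NA everywhere, then fill in matches.
# Iterating in reverse means the highest-priority color is written last.
new <- rep(NA_character_, length(x))
for (col in rev(colors)) {
  new[grepl(col, x, fixed = TRUE)] <- col
}
new
```

Adding a color then only requires extending `colors`.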
    

    【Discussion】:

      【Solution 3】:

      A base R solution using grepl:

      # Sample data
      df <- data.frame(V1 = c("yellow carrot","big car","green tomato","orange car","fertile goat","red snapper"))
      s <- c("yellow","red","orange","green","blue")
      
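      # For each row, keep the search terms found in the string;
      # ifelse() returns the first of them, or NA when there are none
      df$new <- apply(df, 1, function(x)
          ifelse(length(ret <- s[sapply(s, function(y) grepl(y, x))]) > 0, ret, NA))
      df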
      #             V1    new
      #1 yellow carrot yellow
      #2       big car   <NA>
      #3  green tomato  green
      #4    orange car orange
      #5  fertile goat   <NA>
      #6   red snapper    red
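
Because ifelse returns a single value here, the code above keeps only the first matching term when a string contains several. If all matches are wanted, one possible variation (a sketch, not part of the original answer) is to collapse them into one comma-separated string. The input `x` below, including the "red and green pepper" entry, is invented for illustration.

```r
s <- c("yellow", "red", "orange", "green", "blue")

# Hypothetical input with one string that contains two of the search terms
x <- c("yellow carrot", "red and green pepper", "big car")

all_matches <- vapply(x, function(str) {
  # Logical vector: which search terms occur in this string?
  hits <- s[vapply(s, grepl, logical(1), x = str, fixed = TRUE)]
  # Join every hit, or return NA when nothing matched
  if (length(hits) > 0) paste(hits, collapse = ", ") else NA_character_
}, character(1), USE.NAMES = FALSE)

all_matches
```

Matches are reported in the order of `s`, not in the order they appear in the string.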
      

      【Discussion】:
