【Question title】: Select rows based on string pattern in R
【Posted】: 2021-04-01 09:04:22
【Question】:

Suppose I have the following data:

df <- data.frame(name = c("TO for", "Turnover for people", "HC people", 
                          "Hello world", "beenie man", 
                          "apple", "pears", "TO is"),
                 number = c(1, 2, 3, 4, 5, 6, 7, 8))

I want to filter df by a string pattern: keep a row if the value in its name column starts with one of c("TO", "Turnover", "HC"), and remove it otherwise.

The code below gives me a warning message:

library(data.table)
test <- df[df$name %like% c("TO", "Turnover", "HC"), ]

Console output:

Warning message:
In grepl(pattern, vector, ignore.case = ignore.case, fixed = fixed) :
  argument 'pattern' has length > 1 and only the first element will be used

The expected output should look like this:

# name                   number
# TO for                   1
# Turnover for people      2
# HC people                3
# TO is                    8   

Is there another way to do this?

【Comments】:

    Tags: r string dataframe filter


    【Solution 1】:

    %like% is not vectorized over its pattern argument. We may need to loop over the pattern vector and Reduce the results to a single logical vector:

    i1 <- Reduce(`|`, lapply(c("TO", "Turnover", "HC"), `%like%`, vector = df$name))
    df[i1, ]
    #                 name number
    #1              TO for      1
    #2 Turnover for people      2
    #3           HC people      3
    #8               TO is      8
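
The warning in the question comes from grepl(), which %like% calls internally and which uses only the first element of a multi-element pattern. For readers without data.table, the same Reduce idea could be sketched in base R roughly as follows. This is a minimal sketch assuming the df from the question; the names patterns, hits and keep are illustrative, not from the original answer.

```r
# Data from the question
df <- data.frame(name = c("TO for", "Turnover for people", "HC people",
                          "Hello world", "beenie man",
                          "apple", "pears", "TO is"),
                 number = c(1, 2, 3, 4, 5, 6, 7, 8))

# Illustrative name for the pattern vector
patterns <- c("TO", "Turnover", "HC")

# One logical vector per pattern, each as long as df$name
hits <- lapply(patterns, grepl, x = df$name)

# Combine element-wise with OR: TRUE where any pattern matched
keep <- Reduce(`|`, hits)

df[keep, ]
```

Each call to grepl() receives a single pattern, so no warning is raised, and Reduce folds the three logical vectors into one.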
    

    Or this can be achieved with grepl by collapsing the vector into a single string with |:

    pat <- paste(c("TO", "Turnover", "HC"), collapse= "|")
    df[grepl(pat, df$name),]
    #                 name number
    #1              TO for      1
    #2 Turnover for people      2
    #3           HC people      3
    #8               TO is      8
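
Note that the collapsed pattern matches anywhere in the string, while the question asks for names that start with one of the values. On the question's data the result is the same, but anchoring the pattern with ^ may be closer to the stated intent. The sketch below assumes the df from the question; the extra row "photo TO" and the names df2, prefixes and pat_start are hypothetical, added only to show the difference.

```r
# Data from the question
df <- data.frame(name = c("TO for", "Turnover for people", "HC people",
                          "Hello world", "beenie man",
                          "apple", "pears", "TO is"),
                 number = c(1, 2, 3, 4, 5, 6, 7, 8))

# Hypothetical extra row: contains "TO" but does not start with it
df2 <- rbind(df, data.frame(name = "photo TO", number = 9))

prefixes <- c("TO", "Turnover", "HC")
pat <- paste(prefixes, collapse = "|")   # matches anywhere in the string
pat_start <- paste0("^(", pat, ")")      # matches only at the start

df2[grepl(pat, df2$name), ]        # keeps the "photo TO" row as well
df2[grepl(pat_start, df2$name), ]  # drops it
```

The parentheses matter: without them, ^ would apply only to the first alternative.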
    

    Or the collapsed pattern can also be used with %like%:

    df[df$name %like% pat,]
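
If the values are meant as literal prefixes rather than regular expressions, base R's startsWith() is another option worth considering. It is not part of the original answer. The sketch below assumes the df from the question and combines one startsWith() call per prefix with Reduce; prefixes and keep are illustrative names.

```r
# Data from the question
df <- data.frame(name = c("TO for", "Turnover for people", "HC people",
                          "Hello world", "beenie man",
                          "apple", "pears", "TO is"),
                 number = c(1, 2, 3, 4, 5, 6, 7, 8))

prefixes <- c("TO", "Turnover", "HC")

# startsWith() needs a character vector; as.character() also covers
# the case where name was created as a factor
nm <- as.character(df$name)

# One literal prefix test per value, combined element-wise with OR
keep <- Reduce(`|`, lapply(prefixes, startsWith, x = nm))

df[keep, ]
```

Because startsWith() compares literally, prefixes containing regex metacharacters such as . or ( need no escaping.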
    

    【Discussion】:
