【Question title】: Find the location of the last comma in a string in R [duplicate]
【Posted】: 2019-06-19 22:58:48
【Question】:

I have a data frame with a column that contains text strings:

1 Blue, Tall, leather, VA  
2 Green, Medium, VA*  
3 Pink, MD  
4 Yellow, MA  

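The last 2 characters (or sometimes 3, when there is a "*") are the state abbreviation. For each row, I want to extract everything to the left of the last ",". What is the best way to do this in R?

I am new to R, so any help is appreciated.

I would like the output to be: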

1 Blue, Tall, leather  
2 Green, Medium  
3 Pink  
4 Yellow

【Comments】:

    Tags: r find location comma


    【Solution 1】:

    Split on commas with strsplit, then paste back everything except the last comma-separated item:

    vector <- c("Blue, Tall, leather, VA", "Green, Medium, VA*", "Pink, MD", "Yellow, MA")
    sapply(X = strsplit(x = vector, split = ","),
           FUN = function(x) paste(head(x, -1), collapse = ","))
    #[1] "Blue, Tall, leather" "Green, Medium"       "Pink"                "Yellow"    
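
Since the question mentions a data frame column, the same split-and-paste idea can be wrapped in a small helper. This is only a sketch: the data frame `df`, the column name `desc`, and the helper `drop_last_item` are hypothetical names chosen for illustration, not part of the original answer.

```r
# Hypothetical data frame; the column name `desc` is an assumption.
df <- data.frame(
  desc = c("Blue, Tall, leather, VA", "Green, Medium, VA*", "Pink, MD", "Yellow, MA"),
  stringsAsFactors = FALSE
)

# Split each string on commas, drop the last piece, and glue the rest back together.
drop_last_item <- function(x) {
  parts <- strsplit(x, split = ",", fixed = TRUE)
  vapply(parts, function(p) paste(head(p, -1), collapse = ","), character(1))
}

df$desc_no_state <- drop_last_item(df$desc)
print(df$desc_no_state)
```

One thing to watch with this approach: a string that contains no comma at all comes back as an empty string, because the only piece is also the last piece and gets dropped.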
    

    【Comments】:

      【Solution 2】:

      Using a regular expression:

      vector <- c("Blue, Tall, leather, VA", "Green, Medium, VA*", "Pink, MD", "Yellow, MA")
      
      sub("^(.*),.*$", "\\1", vector)
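
The pattern above works because `.*` is greedy: the capture group extends as far right as it can while still leaving a comma to match, so it ends just before the last comma. The sketch below repeats the call with its result and adds a comma-free input (a made-up string, not from the question) to show what happens when the pattern does not match.

```r
vector <- c("Blue, Tall, leather, VA", "Green, Medium, VA*", "Pink, MD", "Yellow, MA")

# `.*` is greedy, so the capture group runs up to the LAST comma.
res <- sub("^(.*),.*$", "\\1", vector)
print(res)
# [1] "Blue, Tall, leather" "Green, Medium"       "Pink"                "Yellow"

# A string with no comma does not match the pattern and is returned unchanged.
no_comma <- sub("^(.*),.*$", "\\1", "NoCommaHere")
print(no_comma)
# [1] "NoCommaHere"
```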
      

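      【Comments】:

        【Solution 3】:

        An option with sub: match a , followed by zero or more characters that are not a , ([^,]*) up to the end of the string ($), and replace the match with an empty string ("").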

        sub(",[^,]*$", "", v1)
        #[1] "Blue, Tall, leather" "Green, Medium"       "Pink"                "Yellow"   
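
If the position of the last comma itself is needed, as the question title asks, the same pattern can be passed to regexpr instead of sub. This is a sketch rather than part of the original answer; the variable names `pos` and `left_part` are illustrative.

```r
v1 <- c("Blue, Tall, leather, VA", "Green, Medium, VA*", "Pink, MD", "Yellow, MA")

# The pattern is anchored at the end of the string, so the match can only start
# at the last comma; regexpr() reports that start position (-1 if no match).
pos <- as.integer(regexpr(",[^,]*$", v1))
print(pos)
# [1] 20 14  5  7

# Keep everything before that position; leave comma-free strings untouched.
left_part <- ifelse(pos > 0, substr(v1, 1, pos - 1), v1)
print(left_part)
# [1] "Blue, Tall, leather" "Green, Medium"       "Pink"                "Yellow"
```

Having the position as a number is useful when both halves are wanted, for example to pull out the state with substr(v1, pos + 1, nchar(v1)).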
        

        Or use trimws, whose whitespace argument is available from R 3.6.0 onward:

        trimws(v1, whitespace = ",[^,]*")
        #[1] "Blue, Tall, leather" "Green, Medium"       "Pink"                "Yellow"   
        

        Or with str_remove from stringr:

        library(stringr)
        str_remove(v1, ",[^,]*$")
        

        Data

        v1 <- c("Blue, Tall, leather, VA", "Green, Medium, VA*", "Pink, MD", "Yellow, MA")
        

        【Comments】:
