【Question title】: Search for multiple values in a column in R
【Posted】: 2019-01-20 14:01:06
【Question】:

I have a data frame with two columns:

df = data.frame(animals = c("cat; dog; bird", "dog; bird", "bird"), sentences = c("the cat is brown; the dog is barking; the bird is green and blue","the dog is black; the bird is yellow and blue", "the bird is blue"), stringsAsFactors = FALSE)

For each row, I need the total number of times all of that row's "animals" occur in the entire "sentences" column.

For example, for the first row of "animals", c("cat;dog;bird"): sum_occurrences_sentences_column = (cat = 1) + (dog = 2) + (bird = 3) = 6.

The result would be a third column like this:

df <- cbind(sum_occurrences_sentences_column = c("6", "5", "3"), df)

I tried the following code, but neither attempt works.

df[str_split(df$animals, ";") %in% df$sentences, ]

str_count(df$sentences, str_split(df$animals, ";"))

Any help would be greatly appreciated :)

【Comments】:

    Tags: r stringr


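    【Solution 1】:

    Here is a base R solution:

    First remove every ; with gsub, then split the sentences column on spaces and unlist the result into a single vector of words: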

    split_sentence_column = unlist(strsplit(gsub(';','',df$sentences),' '))

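    Then set up a for loop that, for each row, builds a vector of that row's animals, uses %in% to check which words from the sentences column appear in that animal vector, and sums the TRUE cases. The result can be assigned directly to a new df column: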

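    for(i in seq_len(nrow(df))){
      # animals listed in row i, split on the "; " separator
      animals = unlist(strsplit(df$animals[i], '; '))
      # count the words of the whole sentences column that match any of these animals
      df$sum_occurrences_sentences_column[i] = sum(split_sentence_column %in% animals)
    }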
    
    > df
             animals                                                        sentences sum_occurrences_sentences_column
    1 cat; dog; bird the cat is brown; the dog is barking; the bird is green and blue                                6
    2      dog; bird                    the dog is black; the bird is yellow and blue                                5
    3           bird                                                 the bird is blue                                3
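The loop above can also be written without explicit indexing. The following is only a sketch in base R, using the data from the question; `count_animals` is a hypothetical helper name introduced here for illustration and does not appear in the original answer.

```r
# Same data as in the question
df <- data.frame(
  animals = c("cat; dog; bird", "dog; bird", "bird"),
  sentences = c(
    "the cat is brown; the dog is barking; the bird is green and blue",
    "the dog is black; the bird is yellow and blue",
    "the bird is blue"
  ),
  stringsAsFactors = FALSE
)

# All words of the whole sentences column, with the ';' separators removed
words <- unlist(strsplit(gsub(";", "", df$sentences, fixed = TRUE), " ", fixed = TRUE))

# Hypothetical helper: number of words that match any animal in one row
count_animals <- function(animal_string) {
  animals <- strsplit(animal_string, "; ", fixed = TRUE)[[1]]
  sum(words %in% animals)
}

# Apply the helper to every row; vapply guarantees one integer per row
df$sum_occurrences_sentences_column <- vapply(
  df$animals, count_animals, integer(1), USE.NAMES = FALSE
)
```

Like the loop, this matches whole words only, because `%in%` compares complete vector elements rather than substrings.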
    
    

    【Comments】:

      【Solution 2】:

      An approach that uses map() to operate on each animal in each row of the first column.

      library(tidyverse)
      string <- unlist(str_split(df$sentences, ";"))
      
      df %>% rowwise %>%
        mutate(SUM = str_split(animals, "; ", simplify = TRUE) %>%
          map( ~ str_count(string, .)) %>%
          unlist %>% sum)
      
      #   animals        sentences                                           SUM
      #   <chr>          <chr>                                               <int>
      # 1 cat; dog; bird the cat is brown; the dog is barking; the bird...   6
      # 2 dog; bird      the dog is black; the bird is yellow and blue       5
      # 3 bird           the bird is blue                                    3
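One thing to keep in mind is that str_count() counts pattern matches anywhere in the string, so an animal name such as "cat" would also be counted inside a longer word such as "category". If whole-word matching is needed, a word-boundary pattern is one option. The sketch below uses base R only; `count_whole_word` and the sample sentences are hypothetical, and it assumes the animal names contain no regex metacharacters.

```r
# Hypothetical sample text, chosen so that "cat" also appears inside "category"
sentences <- c("the cat is brown; the category is empty", "the dog is black")

# Hypothetical helper: count occurrences of `word` as a whole word
# across all elements of `text`
count_whole_word <- function(word, text) {
  pattern <- paste0("\\b", word, "\\b")
  matches <- gregexpr(pattern, text, perl = TRUE)
  # gregexpr returns -1 for "no match", so count only positive positions
  sum(vapply(matches, function(m) sum(m > 0), integer(1)))
}

whole_word_count <- count_whole_word("cat", sentences)

# For comparison: plain substring matching also hits "category"
substring_count <- sum(vapply(
  gregexpr("cat", sentences, fixed = TRUE),
  function(m) sum(m > 0), integer(1)
))
```

Here `whole_word_count` is 1 while `substring_count` is 2. With the data in the question both approaches give the same totals, since no animal name occurs inside another word.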
      

      【Comments】:
