【Question Title】: Getting all matching values from a column
【Posted】: 2017-03-30 00:01:27
【Question】:

I have a data frame called toy, shown below:

    toy<- structure(list(id = 1:10, Name = c("A", "B", "C", "D", "E", "F", 
"G", "H", "A", "A"), Alt = c("X|Y|a", "O|P|dev", "A|W|are", "M|Q|G", 
"H|f|j|i_m|am", "L|E|B|i|j", "x|C|xx|yy", NA, NA, NA), Place = c(1L, 
4L, 8L, 12L, 13L, 8L, 3L, 1L, 1L, 1L)), .Names = c("id", "Name", 
"Alt", "Place"), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-10L), spec = structure(list(cols = structure(list(id = structure(list(), class = c("collector_integer", 
"collector")), Name = structure(list(), class = c("collector_character", 
"collector")), Alt = structure(list(), class = c("collector_character", 
"collector")), Place = structure(list(), class = c("collector_integer", 
"collector"))), .Names = c("id", "Name", "Alt", "Place")), default = structure(list(), class = c("collector_guess", 
"collector"))), .Names = c("cols", "default"), class = "col_spec"))

My goal is to find the values in the Name column that also appear in the Alt column. I tried the following with dplyr and tidyr:

toy_sep<-toy %>% separate(Alt , into=LETTERS[1:5],sep="\\|",extra="merge",remove=FALSE) %>% gather(Alias_id,Alias,A:E) %>% mutate(Match=match(Alias,Name))

From there, the rows where a match was found are extracted with:

matches<-toy_sep[complete.cases(toy_sep),]

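This is close to what I want. The problem is that match returns only the first position, whereas I want all matches. In the example, 1 is returned for A in the Match column of the matches data frame, but I want all the ids. A has ids 9 and 10 (from the id column of the toy data frame) as well as 1. Any help using base/data.table/dplyr is appreciated.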
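
To make the difference concrete, here is a minimal base R sketch. The vector `Name` is a stand-in copy of the Name column of toy, and the variable names `first_hit`, `all_hits` and `all_hits_str` are only illustrative.

```r
# Stand-in for the Name column of `toy`
Name <- c("A", "B", "C", "D", "E", "F", "G", "H", "A", "A")

# match() reports only the first position at which the value occurs
first_hit <- match("A", Name)

# which() on an equality test reports every position
all_hits <- which(Name == "A")

# Collapse the positions into a single "|"-separated string
all_hits_str <- paste(all_hits, collapse = "|")

print(first_hit)     # 1
print(all_hits)      # 1 9 10
print(all_hits_str)  # "1|9|10"
```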

Adding the desired output. Note that the numbers in the top-right cell do not need to be separated by "|":

d_out<-structure(list(id = c(3L, 5L, 6L, 7L, 4L, 6L), Name = c("C", 
"E", "F", "G", "D", "F"), Alt = c("A|W|are", "H|f|j|i_m|am", 
"L|E|B|i|j", "x|C|xx|yy", "M|Q|G", "L|E|B|i|j"), Place = c(8L, 
13L, 8L, 3L, 12L, 8L), Alias_id = c("A", "A", "B", "B", "C", 
"C"), Alias = c("A", "H", "E", "C", "G", "B"), Match = c("1|9|10", 
"8", "5", "3", "7", "2")), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -6L), .Names = c("id", "Name", "Alt", "Place", 
"Alias_id", "Alias", "Match"), spec = structure(list(cols = structure(list(
    id = structure(list(), class = c("collector_integer", "collector"
    )), Name = structure(list(), class = c("collector_character", 
    "collector")), Alt = structure(list(), class = c("collector_character", 
    "collector")), Place = structure(list(), class = c("collector_integer", 
    "collector")), Alias_id = structure(list(), class = c("collector_character", 
    "collector")), Alias = structure(list(), class = c("collector_character", 
    "collector")), Match = structure(list(), class = c("collector_character", 
    "collector"))), .Names = c("id", "Name", "Alt", "Place", 
"Alias_id", "Alias", "Match")), default = structure(list(), class = c("collector_guess", 
"collector"))), .Names = c("cols", "default"), class = "col_spec"))

【Comments】:

  • Could you please include the desired output for this example data?

Tags: r match


【Solution 1】:

Try this.

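  library(dplyr)
  library(tidyr)

  toy_sep <- toy %>% 
    # Split Alt on "|" into up to five columns A..E
    separate(Alt,
             into = LETTERS[1:5],
             sep = "\\|",
             extra = "merge",
             remove = FALSE) %>% 
    # Reshape to long form: one row per (original row, alias) pair
    gather(Alias_id, Alias, A:E) %>% 
    mutate(Match = apply(t(Alias),
                         2,
                         FUN = function(x) {
                           # grep() treats x as a regular expression and
                           # returns every position in Name that matches it
                           ind <- grep(x, toy$Name)
                           # NA when the alias is NA or nothing matched
                           ifelse(!is.na(sum(ind)) & length(ind) >= 1,
                                  paste0(ind, collapse = "|"),
                                  NA)
                         }))
  # Keep only the rows where a match was found
  matches <- toy_sep[complete.cases(toy_sep), ]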
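
Since grep() does regular-expression (and therefore partial) matching and returns row positions rather than id values, a variant based on exact equality may be safer on other data. The following is only a base R sketch under a few assumptions: `toy` is rebuilt here as a plain data.frame without the Place column, and `ids_for`, `aliases` and `long` are hypothetical names introduced for illustration, not part of the answer above.

```r
# Reduced copy of the question's data as a plain data.frame
toy <- data.frame(
  id = 1:10,
  Name = c("A", "B", "C", "D", "E", "F", "G", "H", "A", "A"),
  Alt = c("X|Y|a", "O|P|dev", "A|W|are", "M|Q|G", "H|f|j|i_m|am",
          "L|E|B|i|j", "x|C|xx|yy", NA, NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: ids of every row whose Name equals `alias` exactly,
# collapsed with "|"; NA when there is no such row (or alias is NA)
ids_for <- function(alias, data) {
  hits <- data$id[which(data$Name == alias)]
  if (length(hits) == 0) NA_character_ else paste(hits, collapse = "|")
}

# Split each Alt string into its aliases; fixed = TRUE makes "|" literal
aliases <- strsplit(toy$Alt, "|", fixed = TRUE)

# Long form: one row per (id, alias) pair
long <- data.frame(
  id = rep(toy$id, lengths(aliases)),
  Alias = unlist(aliases),
  stringsAsFactors = FALSE
)

# Look up all matching ids for every alias
long$Match <- vapply(long$Alias, ids_for, character(1),
                     data = toy, USE.NAMES = FALSE)

# Keep only the aliases that occur in Name
matches <- long[!is.na(long$Match), ]
print(matches)
```

On this data the sketch finds the same six alias/match pairs as the desired output (for example, alias A maps to "1|9|10"), although the rows come out ordered by id rather than by alias column.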
