Paste together strings which come from names that fit a pattern and ignore NAs (answers)

【Question title】: Paste together strings which come from names that fit a pattern and ignore NAs
【Posted】: 2023-04-04 17:58:01
【Question description】:

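I am trying to create a new column in a tibble that is the concatenation of several string columns. The names of these columns all fit a pattern: specifically, they all start with the same substring. I have tried every combination of select inside and outside mutate, paste, str_c and unite, to no avail.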

Reprex:

library(tibble); library(dplyr)
df <- tibble(
    include1 = c("a", "b", "c"),
    include2 = c("d", "e", NA),
    include3 = c("f", "g", "h"),
    include4 = c("i", NA, NA),
    ignore = c("j", "k", "l")
    )

df
# A tibble: 3 x 5
  include1 include2 include3 include4 ignore
  <chr>    <chr>    <chr>    <chr>    <chr> 
1 a        d        f        i        j     
2 b        e        g        NA       k     
3 c        NA       h        NA       l

I am trying code that looks like variations of the following:

df %>% 
    mutate(included = str_c(starts_with("include"), " | ", na.rm = TRUE)) %>% 
    select(ignore, included)

Expected output:

# A tibble: 3 x 2
  ignore included     
  <chr>  <chr>        
1 j      a | d | f | i
2 k      b | e | g    
3 l      c | h

How can I achieve this?

【Comments】:

Related: suppress NAs in paste()
This post has many suggestions for a problem similar to yours - stackoverflow.com/questions/52712390/…

Tags: r string dplyr na paste

【Solution 1】:

You can do:

library(dplyr)
library(purrr)

df %>%
  transmute(ignore, 
            included = pmap_chr(df %>% select(-ignore), ~ paste(na.omit(c(...)), collapse = " | ")))

# A tibble: 3 x 2
  ignore included     
  <chr>  <chr>        
1 j      a | d | f | i
2 k      b | e | g    
3 l      c | h
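
To break the call down: pmap_chr() walks over the rows of the data frame it is given, passes each row's values to the function as `...`, and expects a single string back. Below is a minimal sketch of the same idea with the lambda spelled out as a named function. The helper name paste_non_na is hypothetical (not part of the answer above), and selecting the inputs with starts_with("include") instead of select(-ignore) is an assumption that should carry over to data with many other columns to ignore.

```r
library(tibble)
library(dplyr)
library(purrr)

df <- tibble(
  include1 = c("a", "b", "c"),
  include2 = c("d", "e", NA),
  include3 = c("f", "g", "h"),
  include4 = c("i", NA, NA),
  ignore   = c("j", "k", "l")
)

# Hypothetical helper: receives one row's values as `...`,
# drops the NAs and joins the rest into a single string.
paste_non_na <- function(...) {
  values <- c(...)
  paste(values[!is.na(values)], collapse = " | ")
}

# Pick the input columns by name prefix rather than by exclusion.
include_cols <- select(df, starts_with("include"))

result <- df %>%
  mutate(included = pmap_chr(include_cols, paste_non_na)) %>%
  select(ignore, included)

result
```

Because the columns are chosen by prefix, any number of other columns can sit in df without being pasted into the result.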

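【Comments】:

Could you break down and explain your answer? The purrr syntax is rather peculiar.
This works for my minimal example, which I think I made too minimal. In my real data there are many more columns to ignore, so select(-ignore) does not work well either.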

【Solution 2】:

We can use unite with na.rm = TRUE

library(dplyr)
library(tidyr)
df %>%
    unite(included, starts_with('include'), na.rm = TRUE, sep = " | ") %>%
    select(ignore, included)

Output:

# A tibble: 3 x 2
#  ignore included     
#  <chr>  <chr>        
#1 j      a | d | f | i
#2 k      b | e | g    
#3 l      c | h
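
By default unite() removes the columns it combines. If the original include* columns are still needed afterwards, a possible variant is to pass remove = FALSE. This is a sketch on the question's example data, assuming the " | " separator from the expected output; the result name kept is only illustrative.

```r
library(tibble)
library(dplyr)
library(tidyr)

df <- tibble(
  include1 = c("a", "b", "c"),
  include2 = c("d", "e", NA),
  include3 = c("f", "g", "h"),
  include4 = c("i", NA, NA),
  ignore   = c("j", "k", "l")
)

# remove = FALSE keeps include1..include4 next to the new column;
# na.rm = TRUE drops missing values before the strings are joined.
kept <- df %>%
  unite("included", starts_with("include"),
        sep = " | ", na.rm = TRUE, remove = FALSE)

kept
```

The input columns keep their NA values; only the combined column skips them.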

【Comments】: