Separate by pattern (word) in tidyr and dplyr

【Question title】: separate by pattern (word) in tidyr and dplyr
【Posted】: 2021-06-16 04:33:01
【Question】:

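I have a very simple need: split one column into two new columns inside a dplyr pipe chain. The trick is that the separator is a specific word rather than a single character.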

Data:

id    elements
1     banana and apple
2     orange and lemon
3     house and flat

Expected result:

id    element1    element2
1      banana      apple
2      orange      lemon
3      house       flat

Apparently the tidyr::separate call below does not work as I expected (my mistake): it splits at the first letter of "and".

df %>% tidyr::separate(elements, into = c("element1","element2"), sep = "and")

I know this can probably be done with other verbs, but my main goal is to stay with dplyr and tidyr as much as possible.

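【Comments】:

Can you dput your data?
@Forge It is not clear what you mean by "does not work as expected". I get the correct output. I only added \\s* to remove the spaces. Can you show the output you get when you use separate?
It splits at the first letter of "and", the "a".

Tags: r string dplyr tidyr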

【Solution 1】:

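We can include the whitespace before and after the word in the separator pattern, so that it is removed together with the word.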

library(dplyr)
library(tidyr)
df %>%
   separate(elements, into = c('element1', 'element2'),
          sep = '\\s*and\\s*')

-output

#  id element1 element2
#1  1   banana    apple
#2  2   orange    lemon
#3  3    house     flat
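To make the effect of \\s* visible, here is a minimal sketch that runs both separators on the same data. It relies on separate() treating a character sep as a regular expression. The names plain and trimmed are only illustrative and do not come from the answer.

```r
library(tidyr)

# Same data as in the question, rebuilt here so the sketch is self-contained
df <- data.frame(
  id = 1:3,
  elements = c("banana and apple", "orange and lemon", "house and flat")
)

# sep = "and" splits on the word only, so the surrounding spaces stay in the values
plain <- separate(df, elements, into = c("element1", "element2"), sep = "and")

# \\s* on both sides consumes the optional whitespace around the word
trimmed <- separate(df, elements, into = c("element1", "element2"),
                    sep = "\\s*and\\s*")

plain$element1    # values keep a trailing space, e.g. "banana "
trimmed$element1  # values without padding, e.g. "banana"
```

If the padding is already present in a result, dplyr::mutate() with trimws() would be another way to clean it up, but putting the whitespace into the pattern avoids the extra step.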

data

df <- structure(list(id = 1:3, elements = c("banana and apple", 
"orange and lemon", 
"house and flat")), class = "data.frame", row.names = c(NA, -3L
))
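One caveat that the thread does not cover: with \\s* the whitespace is optional, so the pattern can also match "and" inside a longer word. The sketch below uses hypothetical data (df2 is not part of the question) and requires at least one whitespace character on each side with \\s+, which may be the safer choice when such words can occur.

```r
library(tidyr)

# Hypothetical data in which "and" also occurs inside a word
df2 <- data.frame(
  id = 1:2,
  elements = c("sand and stone", "candy and cake")
)

# \\s+ demands whitespace on both sides, so "sand" and "candy" are left intact
res <- separate(df2, elements, into = c("element1", "element2"),
                sep = "\\s+and\\s+")

res$element1  # first words, not cut at the embedded "and"
res$element2  # second words
```

For the data in the question both patterns give the same result, so this only matters for messier input.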

【Comments】: