Stringr regex for alpha including accents: answers

【Question title】: Stringr regex for alpha including accents
【Posted】: 2020-07-17 23:52:23
【Question description】:

string = "2001 - l'odyssée de l'espace"

What regex, used with str_extract(), extracts only "l'odyssée de l'espace"?

str_extract_all(string, '[^-[:digit:]]') works, but I can't join the result back together.

【Question discussion】:

[^-[:digit:]] might work (clean up the result with trimws).
It returns " " with str_extract() and " " " " "l" "'" "o" "d" "y" "s" "s" "é" "e" " " "d" "e" " " "l" "'" "e" "s" "p" "a" "c" "e" with str_extract_all()
Sorry, I was thinking of a base regex; that is not what you need with stringr::str_extract
Use sub("^\\d+\\s*-\\s*", "", string). trimws(gsub("[-[:digit:]]", "", string)) will also remove any - and any digits that appear after the initial digits + whitespace/hyphen prefix (stringr::str_extract_all(string, "[^-[:digit:]]+") has the same problem).
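To make the last comment concrete, here is a small sketch in base R. The input "1984 - catch-22" is a made-up example (it is not from the question), chosen only because the title part itself contains a hyphen and digits.

```r
# Hypothetical input: the title itself contains a hyphen and digits.
x <- "1984 - catch-22"

# Deleting every hyphen and digit also damages the title.
stripped <- trimws(gsub("[-[:digit:]]", "", x))

# Anchoring at the start removes only the leading "digits - " prefix.
anchored <- sub("^\\d+\\s*-\\s*", "", x)

print(stripped)  # "catch"
print(anchored)  # "catch-22"
```

For the string in the question both calls give the same result; they only differ when the title has its own hyphens or digits.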

【Solution 1】:

Here is a base R approach:

trimws(gsub("[-[:digit:]]", "", string))
# [1] "l'odyssée de l'espace"

An imperfect stringr extraction:

stringr::str_extract_all(string, "[^-[:digit:]]+")
# [[1]]
# [1] " "                      " l'odyssée de l'espace"

which can be extended to drop the whitespace-only piece (the leading space remains):

grep("\\S", stringr::str_extract_all(string, "[^-[:digit:]]+", simplify = TRUE), value = TRUE)
# [1] " l'odyssée de l'espace"

【Discussion】:

【Solution 2】:

Another approach, which captures the desired part:

sub('\\d+\\s*-\\s*(.*)', '\\1', string)
#[1] "l'odyssée de l'espace"
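
Since the question asked specifically for str_extract(), the following is a possible stringr version, offered as a sketch rather than as part of the answers above. It assumes the wanted text starts at the first letter of the string; the variable names are illustrative only.

```r
library(stringr)

string <- "2001 - l'odyssée de l'espace"

# \p{L} matches any Unicode letter (including accented ones such as "é"),
# so the match starts at the first letter and .* runs to the end of the string.
from_first_letter <- str_extract(string, "\\p{L}.*")

# Alternative: delete the leading "digits - " prefix instead of extracting.
without_prefix <- str_remove(string, "^\\d+\\s*-\\s*")

print(from_first_letter)  # "l'odyssée de l'espace"
print(without_prefix)     # "l'odyssée de l'espace"
```

The \p{L} pattern would skip a title that begins with a digit, so str_remove() with the anchored prefix is the safer choice when that can happen.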

【Discussion】: