Python regular expression to match a space but not a newline: answers

【Question title】: Python Regular Expression to match space but not newline
【Posted】: 2019-01-12 05:10:52
【Question description】:

I have a string like this:

'\n479 Appendix I\n1114\nAppendix I 481\n'

and I want to use a regular expression to find and return

['479 Appendix I', 'Appendix I 481']

I first tried this expression:

pattern = r'''
(?: \d+ \s)? Appendix \s+ \w+ (?: \s \d+)?
'''

regex = re.compile(pattern, re.VERBOSE)

regex.findall(s)

But this returns

['479 Appendix I\n1114', 'Appendix I 481']

because \s also matches \n. Following one of the answers to the post "Python regex match space only", I tried the following:

pattern = r'''
(?: \d+ [^ \S\t\n])? Appendix \s+ \w+ (?: [^ \S\t\n] \d+)?
'''

regex = re.compile(pattern, re.VERBOSE)

regex.findall(s)

However, this does not return the desired result either. It gives:

['Appendix I', 'Appendix I']

What expression would work in this case?

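【Solution 1】:

This regular expression is a bit more robust than the one in the other answer, because it explicitly anchors on the word "Appendix" and only allows tabs or spaces (not newlines) between the number and the rest of the match: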

import re
# Raw string: avoids invalid-escape warnings for \d, \s and \w
pattern = r'(?:\d*[\t ]+)?Appendix\s+\w+(?:[\t ]+\d*)?'
re.findall(pattern, s)
# ['479 Appendix I', 'Appendix I 481']
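
For comparison, here is a minimal sketch of the double-negated character class that the question was attempting. It is not taken from either answer. The class `[^\S\n]` reads as "not (non-whitespace or newline)", so it matches any whitespace character except a newline. The second half illustrates why the attempt in the question likely failed: under `re.VERBOSE`, whitespace inside a character class is still literal, so the extra space in `[^ \S\t\n]` excludes the space character itself.

```python
import re

s = '\n479 Appendix I\n1114\nAppendix I 481\n'

# [^\S\n] matches any whitespace character except a newline.
pattern = r'''
(?: \d+ [^\S\n])? Appendix [^\S\n]+ \w+ (?: [^\S\n] \d+)?
'''
regex = re.compile(pattern, re.VERBOSE)
result = regex.findall(s)
print(result)  # ['479 Appendix I', 'Appendix I 481']

# The class from the question lists a literal space, so it can never match one.
broken = re.compile(r'[^ \S\t\n]', re.VERBOSE)
print(broken.search('479 Appendix'))  # None
```

Unlike `[\t ]`, this class also accepts other horizontal whitespace such as a form feed, which may or may not be what you want.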

【Solution 2】:

import re

s = '\n479 Appendix I\n1114\nAppendix I 481\n'

for g in re.findall(r'^.*[^\d\n].*$', s, flags=re.M):
    print(g)

Prints:

479 Appendix I
Appendix I 481

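This regular expression matches every line that contains at least one character that is neither a digit nor a newline. Lines that are empty or consist only of digits, such as 1114, are skipped.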
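
The same line filter can be sketched without a multiline pattern, which may make the logic easier to see. This is only an illustration under the assumption that the input is split with `str.splitlines()`; the variable names `kept` and `extra` are hypothetical. It also shows a limitation of the approach: because nothing anchors on "Appendix", any line containing a non-digit character is returned.

```python
import re

s = '\n479 Appendix I\n1114\nAppendix I 481\n'

# Keep each line that contains at least one non-digit character (\D).
kept = [line for line in s.splitlines() if re.search(r'\D', line)]
print(kept)  # ['479 Appendix I', 'Appendix I 481']

# An unrelated non-numeric line is matched as well.
extra = s + 'Chapter 2\n'
print(re.findall(r'^.*[^\d\n].*$', extra, flags=re.M))
# ['479 Appendix I', 'Appendix I 481', 'Chapter 2']
```

If only lines mentioning "Appendix" are wanted, a pattern that names that word explicitly is the safer choice.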
