【Question title】:Match a regular expression with optional lookahead
【Posted】:2016-01-10 12:19:51
【Question description】:
I have the following strings:
NAME John Nash FROM California
NAME John Nash
I want a single regular expression that extracts "John Nash" from both strings.
Here is what I have tried:
"NAME(.*)(?:FROM)"
"NAME(.*)(?:FROM)?"
"NAME(.*?)(?:FROM)?"
But none of these works for both strings.
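To see why each attempt falls short, it can help to print what group 1 actually captures. The following is only a small diagnostic sketch; the helper name `captured` and the variable names are made up for illustration and are not part of the original question.

```python
import re

samples = ["NAME John Nash FROM California", "NAME John Nash"]
attempts = [
    r"NAME(.*)(?:FROM)",    # FROM is mandatory
    r"NAME(.*)(?:FROM)?",   # greedy capture, optional FROM
    r"NAME(.*?)(?:FROM)?",  # lazy capture, optional FROM
]

def captured(pattern, text):
    """Return group 1 of the first match, or None when nothing matches."""
    match = re.search(pattern, text)
    return match.group(1) if match else None

for pattern in attempts:
    for text in samples:
        print(repr(pattern), "->", repr(captured(pattern, text)))
```

The first pattern requires FROM, so it cannot match the second string at all. In the second pattern the greedy `.*` runs to the end of the string and the optional group matches nothing. In the third pattern the lazy `.*?` is never forced to expand, because nothing after it has to match, so it captures an empty string.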
【Comments】:
Tags:
python
regex
regex-lookarounds
regex-greedy
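【Solution 2】:
r'^\w+\s+(\w+\s+\w+)' matches a word at the start of the string,
followed by one or more whitespace characters, and then captures
two words separated by at least one whitespace character.
import re

# 'data' is a file containing one input string per line
with open('data', 'r') as f:
    for line in f:
        mo = re.search(r'^\w+\s+(\w+\s+\w+)', line)
        if mo:
            print(mo.group(1))
Output:
John Nash
John Nash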
【Solution 3】:
Make the second part of the string optional with (?: FROM.*?)?, i.e.:
NAME (.*?)(?: FROM.*?)?$
MATCH 1
1. [5-14] `John Nash`
MATCH 2
1. [37-46] `John Nash`
MATCH 3
1. [53-66] `John Doe Nash`
Regex demo
https://regex101.com/r/bL7kI2/2
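In Python, the same pattern could be applied roughly as follows. This is a minimal sketch: the function name `extract_name` and the third sample string are hypothetical, and each string is matched on its own, so no multiline flag is needed here (the demo above appears to test several lines at once).

```python
import re

# Lazy capture, then an optional " FROM ..." tail. The end anchor forces
# the lazy group to grow until it reaches either the tail or the end.
PATTERN = re.compile(r"NAME (.*?)(?: FROM.*?)?$")

def extract_name(text):
    """Return the captured name, or None if the text does not match."""
    match = PATTERN.search(text)
    return match.group(1) if match else None

samples = [
    "NAME John Nash FROM California",
    "NAME John Nash",
    "NAME John Doe Nash FROM Texas",  # hypothetical third sample
]
for sample in samples:
    print(extract_name(sample))
```

The `$` anchor is what makes this work: without it, the lazy group would stop at the empty string, since nothing after it would be required to match.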
【Solution 4】:
You can do this without a regular expression:
>>> myStr = "NAME John Nash FROM California"
>>> myStr.split("FROM")[0].replace("NAME","").strip()
'John Nash'
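The same idea also covers the string without a FROM clause. One possible variant, sketched below, uses `str.partition` and strips only a leading "NAME", so a "NAME" or "FROM" that appears inside the name itself is less likely to be cut. The function name `extract_name` is an illustrative choice, not part of the original answer.

```python
def extract_name(line):
    """Extract the text between a leading "NAME" and an optional " FROM "."""
    # partition returns the whole string as the head when the separator
    # is absent, so lines without a FROM clause are handled too.
    head, _, _ = line.partition(" FROM ")
    if head.startswith("NAME"):
        head = head[len("NAME"):]
    return head.strip()

print(extract_name("NAME John Nash FROM California"))
print(extract_name("NAME John Nash"))
```

Both calls print the name alone, with no surrounding whitespace.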