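If you want to handle multiple "and" separators, consider using the PyPI regex module, which supports the branch reset group, (?|...). Capturing groups declared in each alternative of this construct start numbering from the same index, so both alternatives below fill groups 1 and 2.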
(?|(\d*) *(\b[a-z]+(?: [a-z]+)*?)(?= and )|(?<= and )(\d*) *(\b[a-z]+(?: [a-z]+)*))
RegEx Demo
import regex
rx = regex.compile(r'(?|(\d*) *(\b[a-z]+(?: [a-z]+)*?)(?= and )|(?<= and )(\d*) *(\b[a-z]+(?: [a-z]+)*))', regex.I)
arr = ['2 Better Developers and 3 Testers', '5 Mechanics and chef', 'medic and 3 nurses', '5 foo', '5 Mechanics and 2 chefs and tester']
for s in arr:
    print(rx.findall(s), ':', s)
Output:
[('2', 'Better Developers'), ('3', 'Testers')] : 2 Better Developers and 3 Testers
[('5', 'Mechanics'), ('', 'chef')] : 5 Mechanics and chef
[('', 'medic'), ('3', 'nurses')] : medic and 3 nurses
[] : 5 foo
[('5', 'Mechanics'), ('2', 'chefs'), ('', 'tester')] : 5 Mechanics and 2 chefs and tester
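To see what the branch reset group saves you, here is a rough sketch of the same two alternatives written for the standard re module, which has no (?|...) construct. With a plain | the groups are numbered 1 to 4, so findall() returns 4-tuples that have to be merged by hand. The helper name collapse is made up for this illustration.

```python
import re

# Same two alternatives as above, joined with a plain | because the
# standard re module has no branch reset group. Groups are numbered 1-4.
rx = re.compile(
    r'(\d*) *(\b[a-z]+(?: [a-z]+)*?)(?= and )'
    r'|(?<= and )(\d*) *(\b[a-z]+(?: [a-z]+)*)',
    re.I,
)

def collapse(s):
    # Hypothetical helper: findall() gives '' for groups that did not
    # participate, so keep whichever side of each pair is non-empty.
    return [(a or c, b or d) for a, b, c, d in rx.findall(s)]

print(rx.findall('medic and 3 nurses'))
# [('', 'medic', '', ''), ('', '', '3', 'nurses')]
print(collapse('medic and 3 nurses'))
# [('', 'medic'), ('3', 'nurses')]
```

With the regex module and (?|...), the merge step is unnecessary because both alternatives write to groups 1 and 2 directly.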
Earlier answer, posted for the original question, which had a single "and".
You can use this regex:
(\d*) *(\S+(?: \S+)*?) and (\d*) *(\S+(?: \S+)*)
Here we match "and" with a single space on each side. Before and after "and", we match using this subpattern:
(\d*) *(\S+(?: \S+)*?)
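This matches zero or more leading digits, followed by zero or more spaces, followed by one or more runs of non-whitespace characters separated by single spaces.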
RegEx Demo
Code:
import re
arr = ['2 Better Developers and 3 Testers', '5 Mechanics and chef', 'medic and 3 nurses', '5 foo']
rx = re.compile(r'(\d*) *(\S+(?: \S+)*?) and (\d*) *(\S+(?: \S+)*)')
for s in arr:
    print(rx.findall(s))
Output:
[('2', 'Better Developers', '3', 'Testers')]
[('5', 'Mechanics', '', 'chef')]
[('', 'medic', '3', 'nurses')]
[]
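For readability, the single-"and" pattern can also be spelled out with re.VERBOSE and named groups. This is only a sketch of the same regex: the group names (count1, name1, count2, name2) are invented here, and literal spaces are written as [ ] because VERBOSE mode ignores bare whitespace in the pattern.

```python
import re

# The single-"and" pattern, annotated piece by piece.
# Group names are illustrative and not part of the original pattern.
rx = re.compile(r'''
    (?P<count1>\d*) [ ]*            # optional leading digits, then any spaces
    (?P<name1>\S+ (?:[ ]\S+)*?)     # words separated by single spaces (lazy)
    [ ]and[ ]                       # the separator, one space on each side
    (?P<count2>\d*) [ ]*            # optional digits after the separator
    (?P<name2>\S+ (?:[ ]\S+)*)      # remaining words (greedy)
''', re.VERBOSE)

m = rx.search('2 Better Developers and 3 Testers')
print(m.groupdict())
# {'count1': '2', 'name1': 'Better Developers', 'count2': '3', 'name2': 'Testers'}

# No " and " in the input, so there is no match at all.
print(rx.search('5 foo'))
# None
```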