【Posted】: 2019-02-16 13:30:41
【Question】:
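I want to use a regular expression to remove from a text every word that starts with an uppercase letter and meets both of the following conditions:
1) The initial capital is followed only by lowercase letters, an "'s" (possessive), or a punctuation mark (.,?!).
2) The word does not come right after a ".", "!", or "?".
I tried: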
import re
myString='The name of her company is Water Company WC 123 WaTerCompany! She was going to meet Daniel. Why? Because Daniel is her boy friend. Patricia? The daughter of Susana! Look, Daniel\'s car is white'
regex = r"([A-Z][a-z']*)(\s[A-Z][a-z']*)*"
txt = re.sub(regex, " ", myString)
I got:
name of her company is 123 ! was going to meet . ? is her boy friend. ? daughter of ! , car is white
I want:
name of her company is WC 123 WaTerCompany! She was going to meet . Why? Because is her boy friend. Patricia? The daughter of ! Look, car is white
【Comments】:
-
Why is Patricia removed in your expected output? It is a capitalized word that comes right after a ".".
-
You are right. Sorry! Edited!
-
One more small issue: the "," after Look should not be removed either.
-
Well, there is a way to support any number of spaces before the word.
-
Check this demo.
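For readers without access to the linked demo, here is a minimal sketch of one way to meet both conditions from the question. It is not necessarily what the demo contains. The names `WORD`, `PATTERN`, and `remove_capitalized` are made up for illustration. Python's `re` module only accepts fixed-width lookbehinds, so "not after . ! ?" is handled with a captured alternative instead: a sentence-ending mark plus any amount of whitespace plus a capitalized word is matched first and put back unchanged, while a capitalized word anywhere else is dropped together with its trailing whitespace.

```python
import re

# A capitalized word: one uppercase letter, then only lowercase letters,
# optionally a possessive 's, and then whitespace, one of . , ? ! or the
# end of the string. The \b keeps us from matching inside "WaTerCompany".
WORD = r"\b[A-Z][a-z]*(?:'s)?(?=[\s.,?!]|$)"

# Alternative 1 (captured): sentence-ending mark, any whitespace, a word.
# Alternative 2 (not captured): a word elsewhere, plus trailing whitespace.
PATTERN = re.compile(rf"([.!?]\s*{WORD})|{WORD}\s*")


def remove_capitalized(text):
    """Drop capitalized words unless they directly follow '.', '!' or '?'."""
    # group(1) is None when alternative 2 matched, so the match is removed;
    # otherwise the captured text is put back untouched.
    return PATTERN.sub(lambda m: m.group(1) or "", text)


if __name__ == "__main__":
    my_string = (
        "The name of her company is Water Company WC 123 WaTerCompany! "
        "She was going to meet Daniel. Why? Because Daniel is her boy friend. "
        "Patricia? The daughter of Susana! Look, Daniel's car is white"
    )
    print(remove_capitalized(my_string))
```

On the sample string from the question this prints the desired output. If sentences are always separated by exactly one whitespace character, a fixed-width lookbehind such as `(?<![.!?]\s)` in front of the word pattern would be a simpler alternative.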
Tags: regex python-3.x