[Posted]: 2016-12-06 21:26:06
[Question]:
Leave out a specific word if it is not followed by another specific word. Here is my string:
x = 'john got shot dead. john with his .... ? , john got killed or died in 1990. john with his wife dead or died'
I want to print and count all the words between john and dead, death, or died.
If a john is not followed by any of died, dead, or death, leave it alone and start again from the next john.
My code:
import re

x = re.sub(r'[^\w]', ' ', x)  # replace dots, commas, and other special symbols with spaces
for i in re.findall(r'(?<=john)' + '(.*?)' + '(?=dead|died|death)', x):
    print i
    print len([word for word in i.split()])
My output:
got shot
2
with his john got killed or
6
with his wife
3
Desired output:
got shot
2
got killed or
3
with his wife
3
I don't know where I am going wrong. This is just a sample input; I have to check 20,000 inputs at a time.
[Comments]:
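- Your question is not clear. Since "with his john got killed or" comes after john, does it count as 6?
- @MarlonAbeykoon In "john with his .... ? , john got killed or died", the first john is not followed by dead, death, or died, so matching should start from the second john. The output I want is "got killed or", not "with his john got killed or".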
Tags: python regex python-2.7