Python regex finding words separated with other words

【Question title】: Python regex finding words separated with other words
【Posted】: 2015-07-12 21:29:00
【Question】:

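Is there a way to use re.findall, or some other regex method, to count how many times a set of words occurs in a specified order, separated by any number of other words?

Here is a "brute force" implementation: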

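def search_query(query, page):
    """Count in-order occurrences of the words in query within page."""
    count = i = 0
    for word in page.split():
        if word == query[i]:
            i += 1
        if i == len(query):
            count += 1
            i = 0  # reset so later occurrences are counted too
    print(count)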

search_query(['hello','kilojoules'],'hello my good friend kilojoules')
1

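For example, when the query is hello kilojoules, I want hello my good friend kilojoules to count as an instance of my query, but kilojoules is my good friend should not count.

Here is my naive attempt at a suitable regex: re.findall('hello\s\Skilojoules','hello my friend kilojoules'). This does not work. I thought it would, because I read the pattern as "find all instances of hello and kilojoules, separated by whitespace or non-whitespace".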

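【Comments on the question】:

Perhaps (?s)\bhello\b.*?\bkilojoules\b? Note that \s\S is just one whitespace character followed by one non-whitespace character. hello\s\Skilojoules can match hello bkilojoules, but not hello kilojoules.
General grumbling about using raw strings here, unless for some reason you enjoy typing the backslash key.
@stribizhev re.findall('(?s)\bhello\b.*?\bkilojoules\b','hello my amigo kilojoules') returns nothing.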
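
The empty result reported in the last comment is consistent with how Python treats backslashes in ordinary string literals: '\b' is the backspace character (\x08), so the regex engine never sees a word-boundary assertion. A minimal sketch of the difference, reusing the strings from the comments (the variable names are only illustrative):

```python
import re

text = 'hello my amigo kilojoules'

# In a normal string literal, '\b' is the backspace character (\x08),
# so this pattern searches for literal backspaces around the words.
plain = re.findall('(?s)\bhello\b.*?\bkilojoules\b', text)

# In a raw string, the backslash reaches the regex engine unchanged,
# where \b means "word boundary".
raw = re.findall(r'(?s)\bhello\b.*?\bkilojoules\b', text)

print(plain)  # []
print(raw)    # ['hello my amigo kilojoules']
```

This is why the raw-string prefix r'...' is the usual recommendation for regex patterns in Python.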

Tags: python regex findall

【Solution 1】:

Following stribizhev's suggestion, I had success with re.findall('hello.*?kilojoules','a happy hello my amigo kilojoules now goodbye').
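
As a rough sketch of how that call can be turned into a count, the length of the list returned by re.findall gives the number of non-overlapping matches. The second sample string (twice) is an assumption added for illustration; it shows why the lazy .*? matters when the pair of words occurs more than once.

```python
import re

text = 'a happy hello my amigo kilojoules now goodbye'
matches = re.findall(r'hello.*?kilojoules', text)
print(matches)       # ['hello my amigo kilojoules']
print(len(matches))  # 1

# Illustrative input with two occurrences (not from the original post).
twice = 'hello a kilojoules and hello b kilojoules'

# Lazy .*? stops at the first kilojoules, so both occurrences are found.
lazy = re.findall(r'hello.*?kilojoules', twice)
# Greedy .* runs to the last kilojoules and merges them into one match.
greedy = re.findall(r'hello.*kilojoules', twice)

print(len(lazy))    # 2
print(len(greedy))  # 1
```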

【Comments】:

【Solution 2】:

Let me clarify:

(?s)\bhello\b.*?\bkilojoules\b

This regex means: match the whole word hello, then any characters (including whitespace and newlines, as few as possible), then the whole word kilojoules.
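
A small sketch of what each part of that pattern contributes, written as a raw string so that \b reaches the regex engine. The sample inputs are made up for illustration.

```python
import re

pattern = r'(?s)\bhello\b.*?\bkilojoules\b'

# (?s) lets . match newlines, so the words may sit on different lines.
multiline = re.findall(pattern, 'hello my\ngood friend kilojoules')
print(multiline)  # ['hello my\ngood friend kilojoules']

# Without (?s), the newline between the words blocks the match.
no_dotall = re.findall(r'\bhello\b.*?\bkilojoules\b',
                       'hello my\ngood friend kilojoules')
print(no_dotall)  # []

# \b rejects words that merely contain the query words.
inside_word = re.findall(pattern, 'othello likes kilojoules')
print(inside_word)  # []
```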

If your text has no newlines and you do not care about whole-word matching, use

hello.*?kilojoules
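
Either pattern can be generated from a list of query words, which mirrors the search_query(query, page) signature in the question. The helper below is a hypothetical sketch, not part of the original answer: the name count_in_order and the whole_words parameter are assumptions, and it assumes query is a non-empty list of plain alphanumeric words.

```python
import re

def count_in_order(query, page, whole_words=True):
    """Count non-overlapping in-order occurrences of the query words.

    Hypothetical helper; assumes query is a non-empty list of plain words.
    """
    edge = r'\b' if whole_words else ''
    # re.escape guards against regex metacharacters inside a query word.
    parts = [edge + re.escape(word) + edge for word in query]
    pattern = '.*?'.join(parts)
    # re.DOTALL has the same effect as the inline (?s) flag.
    return len(re.findall(pattern, page, flags=re.DOTALL))

print(count_in_order(['hello', 'kilojoules'],
                     'hello my good friend kilojoules'))  # 1
print(count_in_order(['hello', 'kilojoules'],
                     'kilojoules is my good friend'))     # 0
```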

Note that \s\S is just one whitespace character followed by one non-whitespace character. So hello\s\Skilojoules can match hello bkilojoules, but not hello kilojoules.
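
The point about \s\S can be checked directly. The last call is an added illustration rather than part of the answer: the reading "whitespace or non-whitespace" from the question corresponds to the character class [\s\S], which matches any single character and can be repeated.

```python
import re

pattern = r'hello\s\Skilojoules'

# One whitespace, then exactly one non-whitespace, then kilojoules.
with_extra_char = re.findall(pattern, 'hello bkilojoules')
print(with_extra_char)  # ['hello bkilojoules']

# Here \S consumes the k, so the literal kilojoules can no longer match.
adjacent = re.findall(pattern, 'hello kilojoules')
print(adjacent)  # []

# [\s\S] is a class meaning "whitespace or non-whitespace", i.e. any character.
any_char = re.findall(r'hello[\s\S]*?kilojoules', 'hello my friend kilojoules')
print(any_char)  # ['hello my friend kilojoules']
```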

【Comments】: