Match exact phrase within a string in Python: answers

【Question title】: Match exact phrase within a string in Python
【Posted】: 2017-12-06 19:08:41
【Question description】:

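I'm trying to determine whether a substring is in a string. The problem is that I don't want my function to return True when the substring is only found inside another word of the string.

For example: if the substring is "Purple cow" and the string is "Purple cows make the best pets.", this should return False, because "cow" is not plural in the substring.

If the substring is "Purple cow" and the string is "Your purple cow trampled my hedge!", it should return True.

My code looks something like this: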

def is_phrase_in(phrase, text):
    phrase = phrase.lower()
    text = text.lower()

    return phrase in text


text = "Purple cows make the best pets!"
phrase = "Purple cow"
print(is_phrase_in(phrase, text))  # prints True, although False is wanted here

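In my actual code I strip unnecessary punctuation and whitespace from "text" before comparing it with the phrase, but otherwise it is the same. I've tried using re.search, but I don't understand regular expressions very well yet and have only got the same behaviour out of them as in my example.

Thanks for any help you can provide!

【Question discussion】:

Thanks for the edit, Jaques! I hadn't noticed that I'd left that off myself.
Thanks, everyone, for the replies!

Tags: python string python-3.x match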

【Solution 1】:

Since your phrase can contain multiple words, a simple split and intersect won't work. I'd use a regular expression for this:

import re

def is_phrase_in(phrase, text):
    return re.search(r"\b{}\b".format(phrase), text, re.IGNORECASE) is not None

phrase = "Purple cow"

print(is_phrase_in(phrase, "Purple cows make the best pets!"))   # False
print(is_phrase_in(phrase, "Your purple cow trampled my hedge!"))  # True
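
One caveat when formatting the phrase straight into the pattern: characters such as `+`, `.` or `(` in the phrase would be read as regex syntax. A minimal sketch of a more defensive variant follows. The helper name `is_phrase_in_escaped` is hypothetical and not part of the answer. It escapes the phrase with `re.escape` and uses lookarounds in place of `\b`, so that a phrase ending in a non-word character can still match:

```python
import re

def is_phrase_in_escaped(phrase, text):
    # re.escape neutralises regex metacharacters in the phrase.
    # (?<!\w) / (?!\w): no word character directly before / after the match.
    pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
    return re.search(pattern, text, re.IGNORECASE) is not None

print(is_phrase_in_escaped("Purple cow", "Purple cows make the best pets!"))     # False
print(is_phrase_in_escaped("Purple cow", "Your purple cow trampled my hedge!"))  # True
print(is_phrase_in_escaped("C++", "I write c++ every day."))                     # True
```

For phrases made only of letters, digits and spaces this should behave like the `\b` version above.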

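【Discussion】:

Thanks, that's great! It looks like I had almost figured it out myself. I spent a while trying to work out how to get my variable "phrase" into re.search and never thought of using string formatting. Time to teach myself regular expressions!
@zwer - Why do you include \b in the regex pattern?
@Jubbles - To make sure there is a word boundary on each side of the phrase being searched for; otherwise it would also catch partial matches (i.e. "purple cows" and not just "purple cow").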
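
To see the effect of `\b` in isolation, here is a small sketch that runs the same search with and without the boundaries. The sample string is the one from the question; the variable names `loose` and `strict` are only illustrative.

```python
import re

text = "Purple cows make the best pets!"

# Without boundaries the phrase also matches inside "cows".
loose = re.search(r"purple cow", text, re.IGNORECASE)
# With \b the match must start and end at a word boundary.
strict = re.search(r"\bpurple cow\b", text, re.IGNORECASE)

print(loose is not None)   # True
print(strict is not None)  # False
```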

【Solution 2】:

Using PyParsing:

import pyparsing as pp

def is_phrase_in(phrase, text):
    phrase = phrase.lower()
    text = text.lower()

    rule = pp.ZeroOrMore(pp.Keyword(phrase))
    # ZeroOrMore also yields empty matches, so only a non-empty token list counts.
    for t, s, e in rule.scanString(text):
        if t:
            return True
    return False

text = "Your purple cow trampled my hedge!"
phrase = "Purple cow"
print(is_phrase_in(phrase, text))

Output:

True

【Discussion】:

【Solution 3】:

A loop can do this

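def is_phrase_in(phrase, text):
    phrase = phrase.lower()
    text = text.lower()

    n = len(phrase)
    # Try every position where the phrase could start.
    for i in range(len(text) - n + 1):
        if text[i:i + n] != phrase:
            continue
        # Accept the match only if no letter or digit touches it on either side.
        before_ok = i == 0 or not text[i - 1].isalnum()
        after_ok = i + n == len(text) or not text[i + n].isalnum()
        if before_ok and after_ok:
            return True
    return False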

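Or by splitting

def is_phrase_in(phrase, text):
    phrase_words = phrase.lower().split()
    text_words = text.lower().split()

    n = len(phrase_words)
    # A list is never "in" a list of strings, so compare each window of n words instead.
    return any(text_words[i:i + n] == phrase_words
               for i in range(len(text_words) - n + 1))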
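
Note that `split()` leaves punctuation attached to the words (for example `"cow!"`), so the splitting approach relies on the text having been cleaned beforehand. A hedged sketch that tokenises with `re.findall(r"\w+", ...)` instead and then compares consecutive runs of words; the name `is_phrase_in_words` is hypothetical:

```python
import re

def is_phrase_in_words(phrase, text):
    # \w+ keeps only runs of letters, digits and underscores, dropping punctuation.
    phrase_words = re.findall(r"\w+", phrase.lower())
    text_words = re.findall(r"\w+", text.lower())
    n = len(phrase_words)
    # Slide a window of n words over the text and compare it with the phrase.
    return any(text_words[i:i + n] == phrase_words
               for i in range(len(text_words) - n + 1))

print(is_phrase_in_words("Purple cow", "Your purple cow trampled my hedge!"))  # True
print(is_phrase_in_words("Purple cow", "Purple cows make the best pets!"))     # False
print(is_phrase_in_words("Purple cow", "I saw a purple cow!"))                 # True
```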

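or using a regular expression

import re

def is_phrase_in(phrase, text):
    # Require a non-word character, or the start/end of the string, on each side of the phrase.
    pattern = re.compile(r"(?:^|[^\w])" + re.escape(phrase.lower()) + r"(?:[^\w]|$)")
    return pattern.search(text.lower()) is not None

Here [^\w] means that no word character may appear directly before or after the phrase, although whitespace is fine.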

【Discussion】:

【Solution 4】:

A regular expression should do the trick

import re

def is_phrase_in(phrase, text):
    phrase = phrase.lower()
    text = text.lower()
    # A non-empty list of matches means the phrase occurs between word boundaries.
    return bool(re.findall(r'\b' + phrase + r'\b', text))

【Discussion】:

【Solution 5】:

Here you go, hope it helps

# Declarations
string = "My name is Ramesh and I am cool. You are Ram ?"
sub = "Ram"

# Check the string for the substring
result = sub in string

# Condition check
if result:

    # Find the starting position
    start_position = string.index(sub)

    # Get the substring length
    length = len(sub)

    # Slice out the match (the end index is start + length, not the built-in len)
    output = string[start_position:start_position + length]
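
Note that a plain `in` test followed by `index` reports the first occurrence even when it sits inside a longer word: in the sample string above, "Ram" is first found inside "Ramesh". A sketch that returns the span of the first whole-word occurrence instead, assuming the hypothetical helper name `find_whole_word`:

```python
import re

def find_whole_word(sub, string):
    # Return (start, end) of the first whole-word match, or None if there is none.
    match = re.search(r"\b" + re.escape(sub) + r"\b", string)
    return match.span() if match else None

string = "My name is Ramesh and I am cool. You are Ram ?"
span = find_whole_word("Ram", string)

print(string.index("Ram"))      # position of "Ram" inside "Ramesh"
print(span)                     # span of the standalone "Ram"
print(string[span[0]:span[1]])  # the matched text itself
```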

【Discussion】:

@Manpreet Singh