Pull several substrings from an input using specific characters to find them

【Question title】: Pull several substrings from an input using specific characters to find them
【Posted】: 2022-01-12 03:37:56
【Question description】:

I need to build a user-created madlib, where one user types in a madlib for someone else to fill in. The input will look like this:

The (^noun^) and the (^adj^) (^noun^)

I need to pull out whatever sits between (^ and ^) so that I can use that word in the code to produce another input prompt for completing the madlib:

input('Enter "word in-between the characters":')

Here is the code I have right now:

madlib = input("Enter (^madlib^):")
a = "(^"
b = "^)"
start = madlib.find(a) + len(a)
end = madlib.find(b)
substring = madlib[start:end]
def mad():
   if "(^" in madlib:
      substring = madlib[start:end]
      m = input("Enter " + substring + ":")
      mad = madlib.replace(madlib[start:end],m)
   return mad
print(mad())

What am I missing?

【Question discussion】:

What is the problem?

Tags: python string character

【Solution 1】:

You can do this fairly cleanly with re.finditer() by collecting the .span() of each match!

import re

# collect starting madlib
madlib_base = input('Enter madlib base with (^...^) around words like (^adj^): ')

# list to put the collected blocks of spans and user inputs into
replacements = []

# yield every block like (^something^) by matching each end and `not ^` inbetween
for match in re.finditer(r"\(\^([^\^]+)\^\)", madlib_base):
    replacements.append({
        "span": match.span(),  # position of the match in madlib_base
        "sub_str": input(f"enter a {match.group(1)}: "),  # replacement str
    })

# replacements mapping and madlib_base can be saved for later!

def iter_replace(base_str, replacements_mapping):
    # yield alternating blocks of text and replacement
    # skip the replacement span from the text when yielding
    base_index = 0  # index in base str to begin from
    for block in replacements_mapping:
        head, tail = block["span"]       # unpack span
        yield base_str[base_index:head]  # next value up to span
        yield block["sub_str"]           # string the user gave us
        base_index = tail                # start from the end of the span
    # yield any text left after the last span (empty if the base ends on a span)
    yield base_str[base_index:]

# collect the iterable into a single result string
# this can be done at the same time as the earlier loop if the input is known
result = "".join(iter_replace(madlib_base, replacements))

Demonstration

...
enter a noun: Madlibs
enter a adj: rapidly
enter a noun: house
...
>>> result
'The Madlibs and the rapidly house'
>>> replacements
[{'span': (4, 12), 'sub_str': 'Madlibs'}, {'span': (21, 28), 'sub_str': 'rapidly'}, {'span': (29, 37), 'sub_str': 'house'}]
>>> madlib_base
'The (^noun^) and the (^adj^) (^noun^)'
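
As a variation on the approach above, the same pattern can be handed to re.sub() with a function as the replacement, so prompting and substitution happen in a single pass. This is only a minimal sketch and not the answerer's code: fill_madlib and its ask parameter are hypothetical names introduced here so the prompt function can be swapped out for canned answers.

```python
import re

# same pattern as above: "(^", one or more non-"^" characters, "^)"
PLACEHOLDER = re.compile(r"\(\^([^\^]+)\^\)")

def fill_madlib(template, ask=input):
    """Replace each (^word^) block with whatever ask() returns for it."""
    def prompt_for(match):
        # match.group(1) is the text between the delimiters, e.g. "noun"
        return ask(f"enter a {match.group(1)}: ")
    # re.sub calls prompt_for once per match, left to right
    return PLACEHOLDER.sub(prompt_for, template)

# canned answers stand in for a user typing at the prompt
answers = iter(["cat", "lazy", "dog"])
filled = fill_madlib("The (^noun^) and the (^adj^) (^noun^).", lambda prompt: next(answers))
print(filled)
```

Calling fill_madlib(madlib_base) with the default ask=input prompts interactively. Unlike the span-based version, this sketch does not keep the replacements around for later reuse.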

【Discussion】:

Wow, thanks, this helps a lot!

【Solution 2】:

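Your mad() function performs only one replacement, and it is called only once. For your sample input, which needs three replacements, you will only ever get the first noun. In addition, mad() depends on values initialized outside the function, so calling it several times won't work (it will keep trying to operate on the same substring, and so on).

To fix it, make mad() perform one replacement on whatever text you pass in, independent of any state outside the function; then call it repeatedly until it has replaced all of the words. You can make this easier by having mad() return a flag indicating whether it found something to replace.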

def mad(text):
    start = text.find("(^")
    end = text.find("^)")
    substring = text[start+2:end] if start > -1 and end > start else ""
    if substring:
        m = input(f"Enter {substring}: ")
        return text.replace(f"(^{substring}^)", m, 1), True
    return text, False


madlib, do_mad = input("Enter (^madlib^):"), True
while do_mad:
    madlib, do_mad = mad(madlib)

print(madlib)

Enter (^madlib^):The (^noun^) and the (^adj^) (^noun^)
Enter noun: cat
Enter adj: lazy
Enter noun: dog
The cat and the lazy dog
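
One caveat with the loop above: it rescans the whole string on every call, so an answer that itself contains (^...^) would be prompted for again. A possible single-pass variant that uses only str.find() with a start offset is sketched below. fill_once and ask are hypothetical names that are not part of the answer, and the behaviour for unterminated blocks (leave them untouched) is an assumption.

```python
def fill_once(text, ask=input):
    """Fill every (^word^) block in one left-to-right pass."""
    opener, closer = "(^", "^)"
    pieces = []   # finished chunks of the output
    pos = 0       # where the next search starts in text
    while True:
        start = text.find(opener, pos)
        # look for the closer only after the opener we just found
        end = text.find(closer, start + len(opener)) if start != -1 else -1
        if end == -1:
            pieces.append(text[pos:])  # no complete block left; keep the rest
            break
        word = text[start + len(opener):end]
        pieces.append(text[pos:start])          # literal text before the block
        pieces.append(ask(f"Enter {word}: "))   # the user's replacement
        pos = end + len(closer)                 # continue after the block
    return "".join(pieces)

# canned answers stand in for a user; the second one looks like a placeholder
answers = iter(["cat", "(^adj^)", "dog"])
print(fill_once("The (^noun^) and the (^adj^) (^noun^)", lambda prompt: next(answers)))
```

Because the search position only moves forward, text typed by the user is never scanned for delimiters, and a stray ^) that appears before the first (^ does not stop the substitution.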

【Discussion】: