【Question title】: Find a pattern in a string object then extract substring
【Posted】: 2018-07-08 08:43:29
【Question】:

I want to extract a substring from a string object. The text I want to extract is a price that ends with €. The price can have 3 or 4 digits.

text = "xxxxxx; AAAA€; xxxxxxx"

or

text = "xxxxxx; AAA€; xxxxxxx"

My code:

position = text.find("€")
price_to_clean = text[(position - 4):(position - 1)]
price = price_to_clean.rpartition(";")[-1]

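My idea is to find the €, then take the 4 characters before it (the substring will be "AAAA€" or ";AAA€"), and then strip the semicolon from the latter. I'd like to know whether there is a better way to do this. For example, find the € and then search backward until the semicolon?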
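The "find the € and search backward to the semicolon" idea can be sketched with `str.find` and `str.rfind`. This is only a minimal sketch: the function name `extract_price` is made up for illustration, and it assumes the price is the text between the nearest preceding `;` and the first `€`.

```python
def extract_price(text):
    """Return the text between the last ';' before the first '€' and that '€'."""
    end = text.find("€")
    if end == -1:
        return None  # no euro sign in the text
    # rfind searches backward within text[0:end]; it returns -1 if no ';'
    # precedes the euro sign, so start + 1 falls back to index 0.
    start = text.rfind(";", 0, end)
    return text[start + 1:end].strip()


print(extract_price("xxxxxx; 1000€; xxxxxxx"))  # 1000
print(extract_price("xxxxxx; 999€; xxxxxxx"))   # 999
```

Because the slice runs from the semicolon to the euro sign, the number of digits does not matter, which avoids the fixed offset of 4.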

【Comments】:

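Have you considered regular expressions?
Are the pieces always separated by semicolons like this? Is the price always the second one? If so, it's hard to get simpler than text.split('; ')[1][:-1]. That said, it looks like you're trying to parse lines from a CSV file, in which case you probably want to use the csv module, or at some point you'll run into a column like "This column has ;;; semicolons so it's quoted", or, worse, This column is plain old text but it happens to have 555€ in it.
@Mr.T No, I hadn't. I just tried re.search and it worked. Thanks!
@abarnet Thanks for taking the time to look at my question. No, it's a string object taken from an HTML file. The markup is messy, and I can't convert it into any kind of file.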

【Solution 1】:

Use a regular expression with re.search.

For example:

import re
text = "xxxxxx; 1000€; xxxxxxx"
# Raw string so that \d reaches the regex engine unchanged
m = re.search(r"(?P<price>\d+€)", text)
if m:
    print(m.group('price'))

Output:

1000€
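If only the number is wanted, without the € sign, one possible variation is to put the capture group around the digits alone. The sketch below is an illustration, not part of the original answer: the names `PRICE_RE` and `find_prices` are made up, and the 3-to-4-digit limit is taken from the question's description of the prices.

```python
import re

# (?<!\d) keeps the match from starting in the middle of a longer number;
# the group captures only the 3 or 4 digits directly before the euro sign.
PRICE_RE = re.compile(r"(?<!\d)(\d{3,4})€")


def find_prices(text):
    """Return every 3- or 4-digit price in text as an int."""
    return [int(p) for p in PRICE_RE.findall(text)]


print(find_prices("xxxxxx; 1000€; xxxxxxx"))   # [1000]
print(find_prices("a; 999€; b; 1500€; c"))     # [999, 1500]
```

`findall` returns all matches, which may help when the HTML contains more than one price; `re.search` as in the answer above is enough when only the first one matters.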

【Discussion】: