【Question Title】: Python 3 Regex Last Match
【Posted】: 2013-06-04 11:19:33
【Question Description】:

How can I grab the 123 part of the following string using the Python 3 regex module?

....XX (a lot of HTML characters)123

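Here the ... part denotes a long string made up of HTML characters, words, and numbers.

The number 123 is a characteristic of XX. So it would be more helpful if someone could suggest a general method in which XX can be any letters, such as AA or AB.

Side note:
I want to use Perl's \G operator by first identifying XX in the string and then identifying the first number that appears after XX. But it seems the \G operator does not work in Python 3.

My code: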

import re
source='abcd XX blah blah 123 more blah blah'
grade=str(input('Which grade?'))
#here the user inputs XX

match=re.search(grade,source)
match=re.search('\G\D+',source)
# Trying to use the \G operator to get the location of the last match. Doesn't work.

match=re.search('\G\d+',source)
#Trying to get the next number after XX.
print(match.group())

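【Comments】:

Could you show what you have tried, to make the question clearer?
What do you mean by "grab"? How about if '123' in text: print '123'?
stackoverflow.com/questions/2802168/…
You can specify a start position: match = re.search(grade, source); match = re.compile(r'\d+').search(source, match.end()); print(match.group())
The search method of a compiled regex accepts an optional pos parameter. docs.python.org/2/library/re.html#re.RegexObject.search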
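
The start-position idea from the comments above can be sketched as follows. This is a minimal illustration, not the commenters' exact code: the helper name `first_number_after` is hypothetical, and `re.escape` is added on the assumption that the user's input should be matched literally rather than as a pattern.

```python
import re

source = 'abcd XX blah blah 123 more blah blah'
grade = 'XX'  # stands in for the value the user would type


def first_number_after(grade, source):
    """Return the first run of digits that follows `grade`, or None."""
    # Step 1: locate the grade marker (escaped so it is matched literally).
    anchor = re.search(re.escape(grade), source)
    if anchor is None:
        return None
    # Step 2: search for digits, starting where the first match ended.
    # The `pos` argument of a compiled pattern's search() plays the role
    # that \G would play in Perl.
    number = re.compile(r'\d+').search(source, anchor.end())
    return number.group() if number else None


print(first_number_after(grade, source))  # 123
```

Because the second search starts at `anchor.end()`, any digits that appear before XX in the string are ignored.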

Tags: python regex string parsing python-3.x

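【Solution 1】:

Description

This regular expression matches the string value XX, which can be replaced with user input. It also requires XX to be surrounded by whitespace or to sit at the start of the sample text, which prevents accidental matches in edge cases where XX appears inside a word such as EXXON.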

(?<=\s|^)\b(xx)\b\s.*?\s\b(\d+)\b(?=\s|$)

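Code example:

I don't know Python well enough to provide a proper Python example, so I am including a PHP example simply to show how the regular expression works and which groups it captures.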

<?php
$sourcestring="EXXON abcd XX blah blah 123 more blah blah";
preg_match('/(?<=\s|^)\b(xx)\b\s.*?\s\b(\d+)\b(?=\s|$)/im',$sourcestring,$matches);
echo "<pre>".print_r($matches,true);
?>
 
$matches Array:
(
    [0] => XX blah blah 123
    [1] => XX
    [2] => 123
)

If you need the actual string position, then in PHP it would look like this:

$position = strpos($sourcestring, $matches[0]);
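
For readers working in Python, here is a hedged adaptation of the same pattern. It is a sketch rather than a direct port: Python's `re` module rejects the variable-width lookbehind `(?<=\s|^)`, so it is swapped for the equivalent negative lookaround `(?<!\S)`, and `(?=\s|$)` becomes `(?!\S)`. The function name `find_grade_number` is hypothetical, and `re.escape` assumes the user's input should be taken literally.

```python
import re


def find_grade_number(grade, source):
    """Return a match whose group 1 is the grade and group 2 the number."""
    pattern = re.compile(
        r'(?<!\S)\b(' + re.escape(grade) + r')\b'  # grade as a standalone token
        r'\s.*?\s'                                 # shortest gap before the number
        r'\b(\d+)\b(?!\S)',                        # number as a standalone token
        re.IGNORECASE,
    )
    return pattern.search(source)


source = 'EXXON abcd XX blah blah 123 more blah blah'
match = find_grade_number('XX', source)
if match:
    print(match.groups())  # ('XX', '123')
    print(match.start())   # 11
```

`match.start()` gives the position directly, so no separate lookup like `strpos` is needed.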

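【Discussion】:

Just curious. What did you use to generate the image?
@Korylprince, I'm using debuggex.com. Although it doesn't support lookbehinds or atomic groups, it is still handy for understanding the flow of an expression. There is also regexper.com. It does a good job too, but it does not update in real time as you type.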