RegEx: comma-separated list of pairs

【Question title】: RegEx: comma-separated list of pairs
【Posted】: 2021-01-18 17:23:38
【Question】:

I have a comma-separated list of "pairs" that looks like this:

<0,64000><1,207><2,460b0><3,38000><4,460b0><5,38000><6,460b0><7,38000><8,460b0><9,38000><a,460b0>

Every value is hexadecimal. I am using the following regular expression (in Python) to capture each pair:

\<[^\>]*\>

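This works well: I get <0,64000> as the first match, <1,207> as the second, and so on.

Since I am only interested in the values, I tried to be lazy and avoid post-processing each match to strip the < and >, so I did this: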

\<([^\>]*)\>

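Now the captured groups are 0,64000, 1,207, and so on. I would like to go one step further and capture each number separately instead of the pair. Any ideas on how to do this with a single regular expression?

Thanks a lot!!

【Comments】: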

Use re.findall(r'<([\da-f]+),([\da-f]+)>', text, re.I)
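A minimal sketch of what the re.I flag in the comment above changes. The mixed-case sample string is an assumption made up for illustration; it is not the asker's data.

```python
import re

# Assumed sample input: mixed-case hex, to show the effect of re.I.
text = "<0,64000><A,460B0><b,ff>"

# With re.I, a-f also matches A-F.
pairs = re.findall(r"<([\da-f]+),([\da-f]+)>", text, re.I)
print(pairs)  # [('0', '64000'), ('A', '460B0'), ('b', 'ff')]

# Without re.I, the pair written in upper case is skipped.
strict = re.findall(r"<([\da-f]+),([\da-f]+)>", text)
print(strict)  # [('0', '64000'), ('b', 'ff')]
```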

Tags: python regex

【Solution 1】:

As the simplest example, you can use this:

<([^,]+),([^>]+)>

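This captures everything after the opening < and before the comma as the first value, and everything after the comma up to the closing > as the second value. You can also require the values to be hexadecimal: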

<([0-9a-f]+),([0-9a-f]+)>
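To see how the two patterns above differ, here is a small sketch using the standard re module. The input is hypothetical: its last pair is deliberately not hexadecimal.

```python
import re

# Hypothetical input: the last pair contains non-hex values on purpose.
text = "<0,64000><1,207><x,zz>"

# Permissive pattern: anything up to the comma, then anything up to '>'.
loose = re.findall(r"<([^,]+),([^>]+)>", text)

# Strict pattern: lowercase hexadecimal digits only.
hex_only = re.findall(r"<([0-9a-f]+),([0-9a-f]+)>", text)

print(loose)     # [('0', '64000'), ('1', '207'), ('x', 'zz')]
print(hex_only)  # [('0', '64000'), ('1', '207')]
```

Because the pattern has two capturing groups, re.findall returns a list of 2-tuples, one per pair.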

【Comments】:

Wow, that was simple! OK, it is Monday and I am a bit slow today. Thanks a lot!

【Solution 2】:

If you use the PyPI regex module:

import regex
string = "<0,64000><1,207><2,460b0><3,38000><4,460b0><5,38000><6,460b0><7,38000><8,460b0><9,38000><a,460b0>"
print(regex.findall(r'<([[:xdigit:]]+),([[:xdigit:]]+)>', string))

See the Python proof. [[:xdigit:]]+ = [0-9A-Fa-f]+. So it is equivalent to

regex.findall(r'<([0-9A-Fa-f]+),([0-9A-Fa-f]+)>', string)

Result: [('0', '64000'), ('1', '207'), ('2', '460b0'), ('3', '38000'), ('4', '460b0'), ('5', '38000'), ('6', '460b0'), ('7', '38000'), ('8', '460b0'), ('9', '38000'), ('a', '460b0')]

Explanation
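Since the question says every value is hexadecimal, the captured strings can be turned into integers with int(value, 16). This is only a sketch of one possible follow-up step: it uses the standard re module instead of regex, and a shortened copy of the input.

```python
import re

# Shortened copy of the input from the question.
text = "<0,64000><1,207><2,460b0><a,460b0>"

pairs = re.findall(r"<([0-9A-Fa-f]+),([0-9A-Fa-f]+)>", text)

# Parse each captured string as a base-16 integer.
numbers = [(int(key, 16), int(value, 16)) for key, value in pairs]

print(numbers)  # [(0, 409600), (1, 519), (2, 286896), (10, 286896)]
```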

--------------------------------------------------------------------------------
  <                        '<'
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [[:xdigit:]]+            any character of: hexadecimal digits (a-
                             f, A-F, 0-9) (1 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  ,                        ','
--------------------------------------------------------------------------------
  (                        group and capture to \2:
--------------------------------------------------------------------------------
    [[:xdigit:]]+            any character of: hexadecimal digits (a-
                             f, A-F, 0-9) (1 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \2
--------------------------------------------------------------------------------
  >                        '>'
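The same breakdown can be kept next to the pattern itself with re.VERBOSE, which ignores whitespace in the pattern and allows # comments. This sketch uses the standard re module and spells the character class out as [0-9A-Fa-f], because the [[:xdigit:]] form comes from the regex module; the name PAIR is chosen here for illustration.

```python
import re

# PAIR is an illustrative name, not something from the answer.
PAIR = re.compile(
    r"""
    <                 # literal '<'
    ([0-9A-Fa-f]+)    # group 1: one or more hex digits
    ,                 # literal ','
    ([0-9A-Fa-f]+)    # group 2: one or more hex digits
    >                 # literal '>'
    """,
    re.VERBOSE,
)

result = PAIR.findall("<0,64000><a,460b0>")
print(result)  # [('0', '64000'), ('a', '460b0')]
```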

【Comments】: