Using regular expressions to read a specific format

【Question title】: Using regular expressions to read specific format
【Posted】: 2018-10-30 13:28:07
【Question description】:

I have the following register entry:

str = "RegName1,Regname2,0x00000000,0x100000"

I want to use a regular expression to extract these values.

I tried this, but it does not work.

re.match('\s+,\s+,\d+,\d+',str)

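Note: I want to ignore comments such as "//" and "#".

【Comments】:

The regular expression is the first argument to re.match.
Sorry, I fixed that, but it still doesn't work @khelwood
\s matches whitespace. Perhaps you meant \w (word characters).
Also, 0x00000000 does not match \d+, because x is not a digit.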
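
The two comments above can be checked with a quick sketch. The variable names here are illustrative and do not come from the original post.

```python
import re

line = "RegName1,Regname2,0x00000000,0x100000"

# \s+ requires whitespace, but the line starts with a letter,
# so the original pattern fails at the first character.
original = re.match(r"\s+,\s+,\d+,\d+", line)

# Switching to \w+ fixes the name fields, but \d+ only consumes the
# leading "0" of "0x00000000"; the pattern then expects a comma and finds "x".
partly_fixed = re.match(r"\w+,\w+,\d+,\d+", line)

print(original)      # None
print(partly_fixed)  # None
```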

Tags: python python-3.x

【Solution 1】:

Capture your fields this way:

import re
text = "RegName1,Regname2,0x00000000,0x100000"
matches = re.match(r"(\w+),(\w+),(\w+),(\w+)", text)

Then, if you print(matches) and print(matches[1]), you get:

Python 3.6.1 (default, Dec 2015, 13:05:11)
[GCC 4.8.2] on linux

<_sre.SRE_Match object; span=(0, 37), match='RegName1,Regname2,0x00000000,0x100000'>
RegName1
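
Since \w+ also accepts values that are not hexadecimal, a stricter variant may be useful. The following is only a sketch: it assumes the last two fields are always 0x-prefixed hex numbers, and names such as first_hex and second_hex are placeholders, because the question does not say what the fields mean.

```python
import re

text = "RegName1,Regname2,0x00000000,0x100000"

# Names are word characters; the last two fields must be 0x-prefixed hex.
pattern = re.compile(r"(\w+),(\w+),(0[xX][0-9a-fA-F]+),(0[xX][0-9a-fA-F]+)")

match = pattern.match(text)
if match is not None:
    name1, name2, first_hex, second_hex = match.groups()
    # int(..., 16) accepts the "0x" prefix.
    first_value = int(first_hex, 16)
    second_value = int(second_hex, 16)
    print(name1, name2, first_value, second_value)
```

With the sample line this prints RegName1 Regname2 0 1048576.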

I'm not sure how you want to handle comment lines, but at least the regular expression will not match them. For example:

text = "#Some comment"
matches = re.match(r"(\w+),(\w+),(\w+),(\w+)", text)
print(matches)

This prints:

Python 3.6.1 (default, Dec 2015, 13:05:11)
[GCC 4.8.2] on linux

None

If you do this:

if matches is not None:
    print(matches.groups())  # handle the captured fields here

...the comment lines will be skipped.
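
Putting this together for a whole file might look like the sketch below. It goes a little beyond the answer: it assumes that "//" and "#" never appear inside a field, so everything from the first marker onward can be discarded, and it tolerates spaces around the commas. The parse_lines helper and the sample lines (including RegName3 and Regname4) are made up for illustration.

```python
import re

# Four comma-separated word fields, with optional surrounding whitespace.
FIELDS = re.compile(r"\s*(\w+)\s*,\s*(\w+)\s*,\s*(\w+)\s*,\s*(\w+)\s*$")

def parse_lines(lines):
    """Return a list of 4-tuples for lines that look like register entries."""
    entries = []
    for line in lines:
        # Drop everything from the first "//" or "#" to the end of the line.
        code = re.split(r"//|#", line, maxsplit=1)[0]
        match = FIELDS.match(code)
        if match is not None:  # comment-only and blank lines give None
            entries.append(match.groups())
    return entries

# Hypothetical input; only the third line comes from the question.
sample = [
    "# Some comment",
    "// another comment",
    "RegName1,Regname2,0x00000000,0x100000",
    "RegName3,Regname4,0x00000004,0x100000  // trailing comment",
    "",
]
entries = parse_lines(sample)
for entry in entries:
    print(entry)
```

Only the two data lines produce tuples; the comment-only and empty lines are dropped by the same "is not None" check.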

【Comments】: