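Parsing a BIND configuration with nested curly braces in Python: answers

【Question title】: python parse bind configuration with nested curly braces
【Posted】: 2014-02-17 14:38:25
【Question】:

I am trying to automatically parse an existing BIND configuration, which consists of many zone definitions like this one: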

zone "domain.com" {
        type slave;
        file "sec/domain.com";
        masters {
                11.22.33.44;
                55.66.77.88;
        };
        allow-transfer {
             "acl1";
             "acl2";
        };
};

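Note that the number of elements in masters and allow-transfer may vary. I tried splitting the text with re.split(), but that failed badly because of the nested curly braces.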
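
To make the failure concrete, here is a minimal sketch. The actual re.split() call is not shown in the question, so the pattern below is a hypothetical stand-in that treats every "};" as the end of a zone; the sample text is a shortened variant of the zone above.

```python
import re

sample = '''zone "domain.com" {
        masters {
                11.22.33.44;
        };
        type slave;
};
'''

# Hypothetical attempt (not the asker's code): split at every "};".
chunks = [c for c in re.split(r'\};', sample) if c.strip()]

# The "};" that closes masters looks exactly like the one that closes
# the zone, so a single zone falls apart into two unrelated pieces.
for chunk in chunks:
    print(repr(chunk))
```

A flat pattern has no notion of depth, which is why both answers below either restrict what they match or track nesting explicitly.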

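My goal is to build a dictionary for each entry.

Thanks in advance for your help!

【Comments】:

Possible duplicate of "Any python libs for parsing Bind zone files?"
This is not a zone file; it is a BIND configuration.
What comes between the individual "zone" definitions?

Tags: python regex parsing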

【Solution 1】:

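This should do the trick, where st is a string containing all of the zone definitions:

import re

# Split at each line that begins a zone statement. Splitting on the bare word
# 'zone' would also cut inside names such as "myzone.com".
zone_defs = re.split(r'^\s*zone\s+', st, flags=re.MULTILINE)
big_dict = {}
for zone in zone_defs:
    zone_name = re.search(r'(".*?")', zone)
    if zone_name is None:  # e.g. the empty chunk before the first zone
        continue
    entry = big_dict[zone_name.group(1)] = {}
    # Brace-delimited blocks such as masters { ... }; [\w-] keeps allow-transfer whole.
    for sub_dict in re.finditer(r'([\w-]+) (\{.*?\})', zone, re.DOTALL):
        entry[sub_dict.group(1)] = sub_dict.group(2).replace(' ', '')
    # Simple "key value;" statements such as type slave;
    for sub_type in re.finditer(r'([\w-]+) (.*?);', zone):
        entry[sub_type.group(1)] = sub_type.group(2)

big_dict then holds one entry per zone definition, keyed by the quoted domain name. Every key and value inside a zone entry is a string. Note that the lazy \{.*?\} stops at the first closing brace, so this only works while the blocks inside a zone are not nested any further.

Here is the output for the example above:

{'"domain.com"': {'masters': '{\n11.22.33.44;\n55.66.77.88;\n}', 'allow-transfer': '{\n"acl1";\n"acl2";\n}', 'type': 'slave', 'file': '"sec/domain.com"'}}

With a second, identical zone keyed "sssss.com" added after the first, the output is:

{'"domain.com"': {'masters': '{\n11.22.33.44;\n55.66.77.88;\n}', 'allow-transfer': '{\n"acl1";\n"acl2";\n}', 'type': 'slave', 'file': '"sec/domain.com"'}, '"sssss.com"': {'masters': '{\n11.22.33.44;\n55.66.77.88;\n}', 'allow-transfer': '{\n"acl1";\n"acl2";\n}', 'type': 'slave', 'file': '"sec/domain.com"'}}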

You will have to do some further stripping to make the values more readable.
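
One possible way to do that stripping, as a rough sketch: the helper below turns one of the brace-delimited strings stored in big_dict into a plain list. The name block_to_list is made up for illustration and is not part of the answer, and the helper assumes a flat block with no further nested braces.

```python
def block_to_list(block):
    # Hypothetical helper: assumes a flat block such as '{\na;\nb;\n}'
    # with no nested braces inside it.
    inner = block.strip().strip('{}')
    # Entries are separated by ';'; drop whitespace and surrounding quotes.
    items = (item.strip().strip('"') for item in inner.split(';'))
    return [item for item in items if item]


masters = block_to_list('{\n11.22.33.44;\n55.66.77.88;\n}')
acls = block_to_list('{\n"acl1";\n"acl2";\n}')
print(masters)  # ['11.22.33.44', '55.66.77.88']
print(acls)     # ['acl1', 'acl2']
```

Applying it to every value in a zone entry that starts with "{" would leave the simple values such as type and file untouched.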

【Solution 2】:

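One way is to install and use the regex module instead of the re module. The problem is that re has no support for recursive patterns, so it cannot match curly braces nested to an arbitrary depth: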

#!/usr/bin/python
import regex
data = '''zone "domain.com" {
    type slave;
    file "sec/domain.com";
    masters {
        11.22.33.44; { toto { pouet } glups };
        55.66.77.88;
    };
    allow-transfer {
        "acl1";
        "acl2";
    };
};  '''

pattern = r'''(?V1xi)
(?:
    \G(?<!^)
  |
    zone \s (?<zone> "[^"]+" ) \s* {
) \s*
(?<key> \S+ ) \s+
(?<value> (?: ({ (?> [^{}]++ | (?4) )* }) | "[^"]+" | \w+ ) ; )
'''

matches = regex.finditer(pattern, data)

for m in matches:
    # "zone" is only set on the first statement matched in each zone block
    if m.group("zone"):
        print("\n" + m.group("zone"))
    print(m.group("key") + "\t" + m.group("value"))

You can find more information about this module at: https://pypi.python.org/pypi/regex
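
If installing a third-party module is not an option, nesting can also be handled without regex recursion by tokenizing the text and parsing it recursively. The following is a minimal sketch using only the standard library, not taken from either answer. The names tokenize, parse_block and parse_zones are invented for illustration, and the code assumes the simplified syntax shown in the question (no comments, no escaped quotes inside strings).

```python
import re

# A token is a quoted string, one of { } ;, or a run of other characters.
TOKEN_RE = re.compile(r'"[^"]*"|[{};]|[^\s{};"]+')


def tokenize(text):
    return TOKEN_RE.findall(text)


def parse_block(tokens, pos=0):
    """Parse statements up to the matching '}' or the end of input.

    Returns (result, pos). A block of bare values becomes a list,
    a block of key/value statements becomes a dict.
    """
    items, mapping = [], {}
    while pos < len(tokens) and tokens[pos] != '}':
        words = []
        while pos < len(tokens) and tokens[pos] not in ('{', '}', ';'):
            words.append(tokens[pos].strip('"'))
            pos += 1
        block = None
        if pos < len(tokens) and tokens[pos] == '{':
            block, pos = parse_block(tokens, pos + 1)  # recurse one level down
            pos += 1  # step over the closing '}'
        if pos < len(tokens) and tokens[pos] == ';':
            pos += 1  # step over the statement terminator
        if block is not None:
            mapping[' '.join(words)] = block
        elif len(words) == 1:
            items.append(words[0])
        elif words:
            mapping[words[0]] = ' '.join(words[1:])
    return (mapping or items), pos


def parse_zones(text):
    top, _ = parse_block(tokenize(text))
    # Top-level keys look like 'zone domain.com'; keep only the zone name.
    return {key.split(' ', 1)[1]: body
            for key, body in top.items() if key.startswith('zone ')}


config = '''zone "domain.com" {
        type slave;
        file "sec/domain.com";
        masters { 11.22.33.44; 55.66.77.88; };
        allow-transfer { "acl1"; "acl2"; };
};
'''
zones = parse_zones(config)
print(zones)
```

Because parse_block calls itself for every opening brace, the depth of nesting does not matter, and each zone ends up as its own dictionary with lists for masters and allow-transfer.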
