【Title】: How to comment out a YAML section using ruamel.yaml?
【Posted】: 2017-05-07 16:37:26
【Question】:

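Recently I have been trying to manage my docker-compose service configuration (i.e. docker-compose.yml) with ruamel.yaml.

I need to comment out a service block, and uncomment it again, when needed. Suppose I have the following file: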

version: '2'
services:
    srv1:
        image: alpine
        container_name: srv1
        volumes:
            - some-volume:/some/path
    srv2:
        image: alpine
        container_name: srv2
        volumes_from:
            - some-volume
volumes:
    some-volume:

Is there some workaround to comment out the srv2 block, so that the output looks like the following?

version: '2'
services:
    srv1:
        image: alpine
        container_name: srv1
        volumes:
            - some-volume:/some/path
    #srv2:
    #    image: alpine
    #    container_name: srv2
    #    volumes_from:
    #        - some-volume
volumes:
    some-volume:

Also, is there a way to uncomment this block? (Assume I already hold the original srv2 block; I just need a method to delete these comment lines.)

【Comments】:

    Tags: python pyyaml ruamel.yaml


    【Answer 1】:

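    If srv2 is a key that is unique among all the mappings in your YAML, then the "easy" way is to loop over the lines, test whether the stripped version of a line starts with srv2:, note its number of leading spaces, and comment out that line and the following ones until you reach a line with an equal or smaller number of leading spaces. Apart from being simple and fast, this has the advantage that it can deal with irregular indentation (as in your example: 4 positions before srv1 and 6 before some-volume).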
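The line-based approach described above can be sketched as follows. This is a minimal illustration using only the standard library, not code from the answer: the name comment_out_block is hypothetical, and it assumes the key is unique in the document and that indentation uses spaces only.

```python
def comment_out_block(text, key):
    """Comment out the line starting with `key:` and everything nested below it.

    Assumes `key` is unique in the document and indentation uses spaces.
    """
    out = []
    block_indent = None  # column of the key while we are inside its block
    for line in text.splitlines(True):
        stripped = line.lstrip(' ')
        indent = len(line) - len(stripped)
        is_blank = not stripped.strip()
        if block_indent is None:
            if stripped.startswith(key + ':'):
                block_indent = indent  # the block starts on this line
        elif not is_blank and indent <= block_indent:
            block_indent = None  # equal or smaller indent: the block is over
        if block_indent is None or is_blank:
            out.append(line)
        else:
            # insert '#' at the column of the key, keeping the nested indentation
            out.append(line[:block_indent] + '#' + line[block_indent:])
    return ''.join(out)


doc = """\
version: '2'
services:
    srv1:
        image: alpine
    srv2:
        image: alpine
        volumes_from:
            - some-volume
volumes:
    some-volume:
"""

print(comment_out_block(doc, 'srv2'), end='')
```

Because the '#' is inserted at the column of the key, the nested lines keep their relative indentation, which matches the layout asked for in the question.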

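    Doing this with ruamel.yaml is possible as well, but less straightforward. You have to know that, when round-trip loading, ruamel.yaml normally attaches a comment to the last structure (mapping/sequence) that has been processed. As a consequence, commenting out srv1 in your example works completely differently from commenting out srv2 (i.e. the first key-value pair, if commented out, is handled differently from all the other key-value pairs).

    If you normalize your expected output to an indent of four positions, add a comment before srv1 for analysis purposes, and load the result, you can search for where the comments end up: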

    from ruamel.yaml.util import load_yaml_guess_indent
    
    yaml_str = """\
    version: '2'
    services:
        #a
        #b
        srv1:
            image: alpine
            container_name: srv1
            volumes:
              - some-volume:/some/path
        #srv2:
        #    image: alpine
        #    container_name: srv2
        #    volumes_from:
        #      - some-volume
    volumes:
        some-volume:
    """
    
    data, indent, block_seq_indent = load_yaml_guess_indent(yaml_str)
    print('indent', indent, block_seq_indent)
    
    c0 = data['services'].ca
    print('c0:', c0)
    c0_0 = c0.comment[1][0]
    print('c0_0:', repr(c0_0.value), c0_0.start_mark.column)
    
    c1 = data['services']['srv1']['volumes'].ca
    print('c1:', c1)
    c1_0 = c1.end[0]
    print('c1_0:', repr(c1_0.value), c1_0.start_mark.column)
    

    which prints:

    indent 4 2
    c0: Comment(comment=[None, [CommentToken(), CommentToken()]],
      items={})
    c0_0: '#a\n' 4
    c1: Comment(comment=[None, None],
      items={},
      end=[CommentToken(), CommentToken(), CommentToken(), CommentToken(), CommentToken()])
    c1_0: '#srv2:\n' 4
    

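    So you "only" have to create the first type of comment (c0) if you comment out the first key-value pair, and the other type (c1) if you comment out any other key-value pair. The start mark is a StreamMark() (from ruamel/yaml/error.py), and the only attribute of that instance that matters when creating comments is column.

    Fortunately this is slightly easier than shown above: it is not necessary to attach the comments to the "end" of the value of volumes; attaching them to the end of the value of srv1 has the same effect.

    In the following, comment_block expects a list of keys that is the path to the element to be commented out.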

    import sys
    from copy import deepcopy
    from ruamel.yaml import round_trip_dump
    from ruamel.yaml.util import load_yaml_guess_indent
    from ruamel.yaml.error import StreamMark
    from ruamel.yaml.tokens import CommentToken
    
    
    yaml_str = """\
    version: '2'
    services:
        srv1:
            image: alpine
            container_name: srv1
            volumes:
              - some-volume:/some/path
        srv2:
            image: alpine
            container_name: srv2  # second container
            volumes_from:
              - some-volume
    volumes:
        some-volume:
    """
    
    
    def comment_block(d, key_index_list, ind, bsi):
        parent = d
        for ki in key_index_list[:-1]:
            parent = parent[ki]
        # don't just pop the value for key_index_list[-1] that way you lose comments
        # in the original YAML, instead deepcopy and delete what is not needed
        data = deepcopy(parent)
        keys = list(data.keys())
        found = False
        previous_key = None
        for key in keys:
            if key != key_index_list[-1]:
                if not found:
                    previous_key = key
                del data[key]
            else:
                found = True
        # now delete the key and its value
        del parent[key_index_list[-1]]
        if previous_key is None:
            if parent.ca.comment is None:
                parent.ca.comment = [None, []]
            comment_list = parent.ca.comment[1]
        else:
            comment_list = parent[previous_key].ca.end = []
            parent[previous_key].ca.comment = [None, None]
        # startmark can be the same for all lines, only column attribute is used
        start_mark = StreamMark(None, None, None, ind * (len(key_index_list) - 1))
        for line in round_trip_dump(data, indent=ind, block_seq_indent=bsi).splitlines(True):
            comment_list.append(CommentToken('#' + line, start_mark, None))
    
    for srv in ['srv1', 'srv2']:
        data, indent, block_seq_indent = load_yaml_guess_indent(yaml_str)
        comment_block(data, ['services', srv], ind=indent, bsi=block_seq_indent)
        round_trip_dump(data, sys.stdout,
                        indent=indent, block_seq_indent=block_seq_indent,
                        explicit_end=True,
        )
    

    which prints:

    version: '2'
    services:
        #srv1:
        #    image: alpine
        #    container_name: srv1
        #    volumes:
        #      - some-volume:/some/path
        srv2:
            image: alpine
            container_name: srv2  # second container
            volumes_from:
              - some-volume
    volumes:
        some-volume:
    ...
    version: '2'
    services:
        srv1:
            image: alpine
            container_name: srv1
            volumes:
              - some-volume:/some/path
        #srv2:
        #    image: alpine
        #    container_name: srv2      # second container
        #    volumes_from:
        #      - some-volume
    volumes:
        some-volume:
    ...
    

    (explicit_end=True is not necessary; it is used here to get some demarcation between the two YAML dumps automatically.)

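    Removing the comments can be done in a similar way. Recursively search the comment attributes (.ca) for a commented-out candidate (possibly giving some hints on where to start). Strip the leading # from the comments and concatenate them, then round_trip_load the result. Based on the column of the comments you can determine where to attach the uncommented key-value pair.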
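The string-handling part of that recipe can be sketched without ruamel.yaml. This is only an illustration under stated assumptions: the helper names comments_to_yaml and parent_depth are hypothetical, and the input list stands in for the .value strings of the CommentToken instances shown earlier (such as '#srv2:\n'). The depth calculation simply inverts the column formula used in comment_block, so it assumes a regular indent.

```python
def comments_to_yaml(comment_values):
    """Strip the first '#' from each comment line and concatenate the result."""
    return ''.join(value.replace('#', '', 1) for value in comment_values)


def parent_depth(column, indent):
    """Number of parent keys above the commented-out key.

    Inverse of the column used in comment_block:
    column = indent * (len(key_path) - 1).
    """
    return column // indent


# stand-ins for the .value attributes of the comment tokens
values = ['#srv2:\n', '#    image: alpine\n', '#    container_name: srv2\n']
block = comments_to_yaml(values)
print(block, end='')

# a comment starting in column 4 with an indent of 4 sits one level deep,
# i.e. directly under a top-level key such as services
print(parent_depth(4, 4))
```

The resulting text is what would then be handed to round_trip_load, and the computed depth indicates into which mapping the loaded key-value pair should be inserted.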

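    【Comments】:

    • My example output is strictly indented by 4 spaces; it is strange that it shows up with 6 spaces in the browser.
    • @cherrot No, it is not: there are 6 positions of indent before some-volume:, with a dash offset of 4 (that is the block sequence indent). It depends of course on how you count, but a sequence element like - a is counted as indent 2 with offset 0. That is, the s of some-volume is 6 columns further right than the v of volumes, and that counts as an indent of 6.
    • @cherrot That is not something I came up with; it is a result of PyYAML having only one "indent" control for both mappings and sequences, which does not count the dash. I once looked into splitting this into two parameters for ruamel.yaml, but ran into multiple problems. Adding block-sequence-indent was the best I could do for now.
    • I see. Thanks for the explanation, @anthon!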
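The counting convention discussed in these comments can be made concrete with a small sketch. It uses only the standard library; the function name sequence_indent is hypothetical and it merely measures columns in two lines of text the way the comments describe (indent up to the element, dash offset relative to the parent key), without calling ruamel.yaml itself.

```python
def sequence_indent(parent_line, item_line):
    """Return (indent, dash_offset) of a block sequence item relative to its parent key.

    indent      : columns from the parent key to the item's content (dash not counted)
    dash_offset : columns from the parent key to the dash
    """
    parent_col = len(parent_line) - len(parent_line.lstrip(' '))
    dash_col = len(item_line) - len(item_line.lstrip(' '))
    after_dash = item_line[dash_col + 1:]
    # the content starts after the dash and the spaces that follow it
    content_col = dash_col + 1 + (len(after_dash) - len(after_dash.lstrip(' ')))
    return content_col - parent_col, dash_col - parent_col


# layout from the question: dash 4 columns in, content 6 columns in
print(sequence_indent('        volumes:', '            - some-volume:/some/path'))

# normalized layout from the answer: matches the printed "indent 4 2"
print(sequence_indent('        volumes:', '          - some-volume:/some/path'))

# "- a" directly under a key: indent 2, offset 0
print(sequence_indent('key:', '- a'))
```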
    【Answer 2】:

    Adding an uncomment_block function inspired by @Anthon's answer, plus some enhancements to comment_block:

    from copy import deepcopy
    from ruamel.yaml import round_trip_dump, round_trip_load
    from ruamel.yaml.error import StreamMark
    from ruamel.yaml.tokens import CommentToken
    
    
    def comment_block(root, key_hierarchy_list, indent, seq_indent):
        found = False
        comment_key = key_hierarchy_list[-1]
        parent = root
        for ki in key_hierarchy_list[:-1]:
            parent = parent[ki]
        # don't just pop the value for key_hierarchy_list[-1] that way you lose comments
        # in the original YAML, instead deepcopy and delete what is not needed
        block_2b_commented = deepcopy(parent)
        previous_key = None
        for key in parent.keys():
            if key == comment_key:
                found = True
            else:
                if not found:
                    previous_key = key
                del block_2b_commented[key]
    
        # now delete the key and its value, but preserve its preceding comments
        preceding_comments = parent.ca.items.get(comment_key, [None, None, None, None])[1]
        del parent[comment_key]
    
        if previous_key is None:
            if parent.ca.comment is None:
                parent.ca.comment = [None, []]
            comment_list = parent.ca.comment[1]
        else:
            comment_list = parent[previous_key].ca.end = []
            parent[previous_key].ca.comment = [None, None]
    
        if preceding_comments is not None:
            comment_list.extend(preceding_comments)
    
        # startmark can be the same for all lines, only column attribute is used
        start_mark = StreamMark(None, None, None, indent * (len(key_hierarchy_list) - 1))
        skip = True
        for line in round_trip_dump(block_2b_commented, indent=indent, block_seq_indent=seq_indent).splitlines(True):
            if skip:
                if not line.startswith(comment_key + ':'):
                    continue
                skip = False
            comment_list.append(CommentToken('#' + line, start_mark, None))
    
        return False
    
    
    def uncomment_block(root, key_hierarchy_list, indent, seq_indent):
        '''
        FIXME: comments may be attached to the parent's neighbour
        in a document like the following (the srv2 block is attached to volumes,
        not to services and not to srv1).

        version: '2'
        services:
            srv1: foobar
            #srv2:
            #    image: alpine
            #    container_name: srv2
            #    volumes_from:
            #        - some-volume
        volumes:
            some-volume:
        '''
        found = False
        parent = root
        commented_key = key_hierarchy_list[-1]
        comment_indent = indent * (len(key_hierarchy_list) - 1)
        for ki in key_hierarchy_list[:-1]:
            parent = parent[ki]
    
        if parent.ca.comment is not None:
            comment_list = parent.ca.comment[1]
            found, start, stop = _locate_comment_boundary(comment_list, commented_key, comment_indent)
    
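        if not found:
            for key in parent.keys():
                bro = parent[key]
                # descend to the last nested mapping; list() is needed because
                # keys() returns a non-indexable view on Python 3
                while hasattr(bro, 'keys') and bro.keys():
                    bro = bro[list(bro.keys())[-1]]
    
                if not hasattr(bro, 'ca'):
                    continue
    
                comment_list = bro.ca.end
                found, start, stop = _locate_comment_boundary(comment_list, commented_key, comment_indent)
                if found:
                    # stop here so a later sibling cannot overwrite the match
                    break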
    
        if found:
            block_str = u''
            commented = comment_list[start:stop]
            for ctoken in commented:
                block_str += ctoken.value.replace('#', '', 1)
            del(comment_list[start:stop])
    
            block = round_trip_load(block_str)
            parent.update(block)
        return found
    
    
    def _locate_comment_boundary(comment_list, commented_key, comment_indent):
        found = False
        start_idx = 0
        stop_idx = len(comment_list)
        for idx, ctoken in enumerate(comment_list):
            if not found:
                if ctoken.start_mark.column == comment_indent\
                        and ctoken.value.replace('#', '', 1).startswith(commented_key):
                    found = True
                    start_idx = idx
            elif ctoken.start_mark.column != comment_indent:
                stop_idx = idx
                break
        return found, start_idx, stop_idx
    
    
    if __name__ == "__main__":
        import sys
        from ruamel.yaml.util import load_yaml_guess_indent
    
        yaml_str = """\
    version: '2'
    services:
        # 1 indent after services
        srv1:
            image: alpine
            container_name: srv1
            volumes:
              - some-volume
            # some comments
        srv2:
            image: alpine
            container_name: srv2  # second container
            volumes_from:
              - some-volume
            # 2 indent after srv2 volume
    # 0 indent before volumes
    volumes:
        some-volume:
    """
    
        for srv in ['srv1', 'srv2']:
            # Comment a service block
            yml, indent, block_seq_indent = load_yaml_guess_indent(yaml_str)
            comment_block(yml, ['services', srv], indent=indent, seq_indent=block_seq_indent)
            commented = round_trip_dump(
                yml, indent=indent, block_seq_indent=block_seq_indent, explicit_end=True,
            )
            print(commented)
    
            # Now uncomment it
            yml, indent, block_seq_indent = load_yaml_guess_indent(commented)
            uncomment_block(yml, ['services', srv], indent=indent, seq_indent=block_seq_indent)
    
            round_trip_dump(
                yml, sys.stdout, indent=indent, block_seq_indent=block_seq_indent, explicit_end=True,
            )
    

    Output:

    version: '2'
    services:
        # 1 indent after services
        #srv1:
        #    image: alpine
        #    container_name: srv1
        #    volumes:
        #      - some-volume
        #        # some comments
        srv2:
            image: alpine
            container_name: srv2  # second container
            volumes_from:
              - some-volume
            # 2 indent after srv2 volume
    # 0 indent before volumes
    volumes:
        some-volume:
    ...
    
    version: '2'
    services:
        # 1 indent after services
        srv2:
            image: alpine
            container_name: srv2  # second container
            volumes_from:
              - some-volume
            # 2 indent after srv2 volume
    # 0 indent before volumes
        srv1:
            image: alpine
            container_name: srv1
            volumes:
              - some-volume
            # some comments
    volumes:
        some-volume:
    ...
    version: '2'
    services:
        # 1 indent after services
        srv1:
            image: alpine
            container_name: srv1
            volumes:
              - some-volume
            # some comments
        #srv2:
        #    image: alpine
        #    container_name: srv2      # second container
        #    volumes_from:
        #      - some-volume
        #        # 2 indent after srv2 volume
        ## 0 indent before volumes
    volumes:
        some-volume:
    ...
    
    version: '2'
    services:
        # 1 indent after services
        srv1:
            image: alpine
            container_name: srv1
            volumes:
              - some-volume
            # some comments
        srv2:
            image: alpine
            container_name: srv2  # second container
            volumes_from:
              - some-volume
            # 2 indent after srv2 volume
    # 0 indent before volumes
    volumes:
        some-volume:
    ...
    

    【Comments】:
