【Question title】: How to make Python Yaml library save in a human-friendly way?
【Posted】: 2018-12-13 17:25:38
【Question】:

Here is the Python code I have:

import yaml

d = {'ToGoFirst': 'aaa', 'Second': 'bbb', 'Pagargaph':
'''Lorem ipsum dolor sit amet, 
consectetur adipiscing elit, 
sed do eiusmod tempor incididunt 
ut labore et dolore magna aliqua.''',  
'Integer': 25}
with open('d.yaml', 'w') as f:
    yaml.safe_dump(d, f, default_flow_style=False)

What I keep getting:

Integer: 25
Pagargaph: "Lorem ipsum dolor sit amet, \nconsectetur adipiscing elit, \nsed do eiusmod\
  \ tempor incididunt \nut labore et dolore magna aliqua."
Second: bbb
ToGoFirst: aaa

How can I change it so that it produces:

ToGoFirst: aaa
Second: bbb
Pagargaph: 
  Lorem ipsum dolor sit amet, 
  consectetur adipiscing elit, 
  sed do eiusmod tempor incididunt 
  ut labore et dolore magna aliqua.
Integer: 25

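In other words, I would like to:

  1. Avoid quotes and escape characters in the output, so that non-technical users can read and edit these config files.

  2. Ideally preserve the order of the parameters.

This is so that I can load the YAML file, add more parameters, and still save it in a human-friendly format.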

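【Comments】:

  • Your expected/desired output example is wrong. You cannot get what you show with a plain scalar. You have multiple lines with trailing spaces, and PyYAML decides to use double quotes for that. The alternative would be a literal block scalar.
  • The order of the parameters will probably be preserved in the next release of PyYAML (when running on Python 3.6/3.7). Currently it always sorts the keys, but there is a pull request that allows disabling that. Or try ruamel.yaml.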
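
The two comments above can be combined into a PyYAML-only sketch. It assumes PyYAML 5.1 or later (the release that added the `sort_keys` argument) and a Python version whose dicts keep insertion order. `ReadableDumper` and `represent_str` are hypothetical names chosen for this illustration, not part of PyYAML. Trailing spaces are stripped first, because PyYAML falls back to double quotes when a line ends in a space.

```python
import yaml


def represent_str(dumper, data):
    # Emit multi-line strings as literal block scalars (|), others as usual.
    if "\n" in data:
        return dumper.represent_scalar("tag:yaml.org,2002:str", data, style="|")
    return dumper.represent_scalar("tag:yaml.org,2002:str", data)


class ReadableDumper(yaml.SafeDumper):
    """Hypothetical dumper subclass so the global SafeDumper stays untouched."""


ReadableDumper.add_representer(str, represent_str)

raw = ("Lorem ipsum dolor sit amet, \n"
       "consectetur adipiscing elit, \n"
       "sed do eiusmod tempor incididunt \n"
       "ut labore et dolore magna aliqua.")
# Remove trailing spaces; with them PyYAML refuses the block style.
paragraph = "\n".join(line.rstrip() for line in raw.splitlines())

d = {"ToGoFirst": "aaa", "Second": "bbb", "Pagargaph": paragraph, "Integer": 25}

# sort_keys=False keeps the insertion order of the dict.
text = yaml.dump(d, Dumper=ReadableDumper, default_flow_style=False, sort_keys=False)
print(text)
```

This only covers dumping; comments and quoting in an existing file are not kept on a load/dump cycle with PyYAML.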

Tags: python file configuration yaml human-readable


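【Solution 1】:

Your desired output cannot have newlines in the value for Pagargaph as written. For that you need a block-style literal scalar (the dash trims the final newline that you would normally get when loading such a scalar):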

Pagargaph: |-
  Lorem ipsum dolor sit amet, 
  consectetur adipiscing elit, 
  sed do eiusmod tempor incididunt 
  ut labore et dolore magna aliqua.

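You should use ruamel.yaml (disclaimer: I am the author of that package), which was developed specifically to support this kind of round-tripping. To get what you want, do for example: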

import sys
import ruamel.yaml
from ruamel.yaml.scalarstring import PreservedScalarString as L

yaml_str = """\
ToGoFirst: aaa
Second: 'bbb'  # insert after this one
Integer: 25
"""

yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
d = yaml.load(yaml_str)
# yaml.indent(mapping=4, sequence=4, offset=2)
try:
    before_integer = [k for k in d].index('Integer')
except ValueError:
    before_integer = len(d)
d.insert(before_integer, 'Pagargaph', L('''Lorem ipsum dolor sit amet, 
consectetur adipiscing elit, 
sed do eiusmod tempor incididunt 
ut labore et dolore magna aliqua.'''))  
d.insert(before_integer, 'Something', 'extra', comment='with a comment')
yaml.dump(d, sys.stdout)

which gives:

ToGoFirst: aaa
Second: 'bbb'  # insert after this one
Something: extra # with a comment
Pagargaph: |-
  Lorem ipsum dolor sit amet, 
  consectetur adipiscing elit, 
  sed do eiusmod tempor incididunt 
  ut labore et dolore magna aliqua.
Integer: 25

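Please note:

  • the order is preserved in any of the Python versions supported by ruamel.yaml (2.7, 3.4+)
  • the comment is preserved
  • the quotes that I added around bbb are only preserved if you specify yaml.preserve_quotes = True
  • since we insert twice at position 2, the latter insert bumps the former to position 3.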

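Your users will have to be somewhat disciplined to edit a YAML file without breaking it. They should also know some of the caveats, such as that plain (unquoted) scalars cannot start with certain special characters or contain special character sequences (a ':' followed by a space, a '#' preceded by a space).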
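
As a rough illustration of that caveat, here is a simplified check for whether a value should be quoted before it is typed into the file. The helper names `needs_quotes` and `single_quote` are hypothetical, the prefix list is the one from the editing notes further down, and the check is deliberately not a complete implementation of the YAML plain-scalar rules.

```python
# Characters (or pairs) that are risky at the start of a plain scalar.
SPECIAL_PREFIXES = ("%", "*", "&", "{", "[", "- ")
# Sequences that are risky anywhere inside a plain scalar.
SPECIAL_SEQUENCES = (": ", " #")


def needs_quotes(text: str) -> bool:
    """Heuristic only: True if the text is unsafe as an unquoted scalar."""
    if text.startswith(SPECIAL_PREFIXES):
        return True
    return any(seq in text for seq in SPECIAL_SEQUENCES)


def single_quote(text: str) -> str:
    """Wrap text in single quotes, doubling any embedded single quote."""
    return "'" + text.replace("'", "''") + "'"


for value in ["aaa", "*starred", "key: value", "price #1", "don't"]:
    shown = single_quote(value) if needs_quotes(value) else value
    print(shown)
```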

To help your users avoid editing errors, you could add the following comment at the beginning of the YAML document:

# please read the first few "rules" of How_to_edit at the bottom of this file

and at the end:

How_to_edit: |
 Editing a YAML document is easy, but there are some rules to keep you from 
 accidentally invoking its hidden powers. Of the following you need to at least
 read and apply the ones up to the divider separating the important from less 
 important rules. The less important ones are interesting, but you probably 
 won't need to know them.
 1) Entries in this file consist of a scalar key (before the ':') and a scalar 
    value (normally after the ':', but read rule 3). 
 2) Scalars do NOT need quotes (single: ', or double: ") around them, unless 
    you have a special character or character combination at the beginning
    ('%', '*', '&', '{', '[', '- ') or in the middle (': ', ' #') of the scalar.
    If you add quotes use a single quote before and after the scalar. If
    these are superfluous the program can remove them. So when in doubt just 
    add them.
 3) A key followed by ': |' introduces a multiline scalar. These instructions
    are in a multiline scalar. Such a scalar starts on the next line after ': |'.
    The lines need to be indented, until the end of the scalar and these 
    indentation spaces are not part of the scalar. 
    The newlines in a multiline scalar are hard (i.e. preserved, and not 
    substituted with spaces).
    If you see `: |-` that means the scalar is loaded with the trailing newline 
    stripped.
 4) Anything after a space followed by a hash (' #') is a comment, when not 
    within quotes or in a multiline string.
 --- end of the important rules ---
 5) Within single quoted scalars you can have a single quote by doubling it: 
       rule 4: 'you probably don''t ever need that'
    This is called escaping the single quote. You can double quote scalars, but 
    the rules for escaping are much more difficult, so don't try that at home.
 6) The scalars consisting solely of "True" and "False" (also all-caps and 
    all-lowercase) are loaded as booleans when unquoted, and as strings when 
    quoted. 
 7) Scalars consisting solely of number characters (0-9) are loaded as numbers.
    If there is a non-number they are usually loaded as strings, but scalars 
    starting with '0x' and '0o' and for the rest have only number characters,
    are special and need quotes if not intended as (hexadecimal resp. octal) 
    numbers.

If you include the above, you probably don't want to preserve quotes when round-tripping.
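
Rules 6 and 7 of the notes can be checked quickly from Python. The sketch below uses PyYAML's `safe_load` purely for convenience; the document and key names are made up for the illustration. The octal form '0o' is left out on purpose, since how it is loaded depends on the YAML version the library implements.

```python
import yaml

doc = """\
flag: True
flag_text: 'True'
count: 25
count_text: '25'
hex_value: 0x1F
"""

loaded = yaml.safe_load(doc)

# Unquoted values become booleans or integers, quoted ones stay strings.
for key, value in loaded.items():
    print(key, type(value).__name__, repr(value))
```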

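【Comments】:

  • Thanks. Can I get rid of the "|-" and still conform to the YAML specification?
  • Yes you can, but then you won't have any newlines in your scalar: each newline you see, together with any spaces that follow it at the start of the next line, is replaced by a single space. That is different from your "Lorem ipsum ..." string, which has three embedded newlines (of course that might not matter for your application, but I cannot know that).
  • I meant: can I somehow get rid of the "|-" and still keep the newlines? In other words, how can I make the output fully suitable for non-technical people?
  • You can't. Your non-technical people will have to learn a few things, like when to quote and when not to, and they will also need to indent, etc. If your data structure consists only of a root-level mapping (no nested mappings or sequences), you should be able to parse it with some preprocessing, but in that case you might as well parse it without using YAML.
  • I updated my answer with a suggestion to add basic editing instructions. They are not complete, though.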
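
The difference discussed in these comments can be seen by loading both forms. This is a small sketch using PyYAML's `safe_load`, with a shortened two-line paragraph as sample data:

```python
import yaml

# Plain multi-line scalar: line breaks are folded into single spaces.
plain_doc = """\
Pagargaph:
  Lorem ipsum dolor sit amet,
  consectetur adipiscing elit.
"""

# Literal block scalar: line breaks are kept, '-' drops the final one.
literal_doc = """\
Pagargaph: |-
  Lorem ipsum dolor sit amet,
  consectetur adipiscing elit.
"""

plain_value = yaml.safe_load(plain_doc)["Pagargaph"]
literal_value = yaml.safe_load(literal_doc)["Pagargaph"]

print(repr(plain_value))
print(repr(literal_value))
```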