【Question title】: How can I parse Python's triple-quote f-strings?
【Posted】: 2020-07-26 00:26:27
【Question】:

I have this code that parses and processes ordinary "f-string" template strings (see the usage section below for an example):

from string import Formatter
import sys


_conversions = {'a': ascii, 'r': repr, 's': str}

def z(template, locals_=None):
    if locals_ is None:
        previous_frame = sys._getframe(1)
        previous_frame_locals = previous_frame.f_locals
        locals_ = previous_frame_locals
        # locals_ = globals()
    result = []
    parts = Formatter().parse(template)
    for part in parts:
        literal_text, field_name, format_spec, conversion = part
        if literal_text:
            result.append(literal_text)
        if not field_name:
            continue
        value = eval(field_name, locals_) #.__format__()
        if conversion:
            value = _conversions[conversion](value)
        if format_spec:
            value = format(value, format_spec)
        else:
            value = str(value)
        result.append(value)
    res = ''.join(result)
    return res

Usage:

a = 'World'
b = 10
z('Hello {a} --- {a:^30} --- {67+b} --- {a!r}')
# "Hello World ---             World              --- 77 --- 'World'"

But it fails if the template string looks like this:

z('''
echo monkey {z("curl -s https://www.poemist.com/api/v1/randompoems | jq --raw-output '.[0].content'")} end | sed -e 's/monkey/start/'
echo --------------
''')

It gives this error:

  File "<string>", line 1
    z("curl -s https
                   ^
SyntaxError: EOL while scanning string literal

If there is no way to make this work as-is, I am even willing to copy code from Python's own source to get it working.

【Comments】:

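  • If you want to parse Python code, take a look at the ast module. It lets you parse the string just like a regular f-string: ast.parse('f"Hello, {a} --- {67+b}"'). You can then take the resulting tree and process it however you like.
  • The colon has a special meaning inside {} in a format string. You need to pull the curl part out into a separate variable instead of nesting a call to z().
  • @0x5453 No, it works with a triple-quoted f-string. I checked. (The : is inside quotes there.)
  • @ForceBru Your approach looks great. Is there a way to evaluate the nodes of the parsed ast? For example, an _ast.FormattedValue?
  • @HappyFace, you can use ast.dump(tree_node) to see which attributes each node has. Then walk the tree with an ast.NodeVisitor subclass and inspect each node's attributes. For f"{a}", you can retrieve the string a like this: FormattedValue_node.value.id. See Python's grammar for details.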
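
The approach described in these comments can be sketched roughly as follows, assuming Python 3.8 or later. The names FieldCollector and evaluate_field are hypothetical helpers introduced for illustration, and the namespace dict stands in for the caller's variables.

```python
import ast

# Parse source text that consists of a single f-string expression.
tree = ast.parse('f"Hello, {a} --- {67+b}"')

class FieldCollector(ast.NodeVisitor):
    """Collect every replacement field (ast.FormattedValue) in the tree."""

    def __init__(self):
        self.fields = []

    def visit_FormattedValue(self, node):
        self.fields.append(node)
        self.generic_visit(node)  # keep walking, e.g. into nested format specs

def evaluate_field(field, namespace):
    """Compile the expression inside one replacement field and evaluate it."""
    code = compile(ast.Expression(body=field.value), "<fstring-field>", "eval")
    return eval(code, namespace)

collector = FieldCollector()
collector.visit(tree)

print(ast.dump(collector.fields[0]))  # shows the attributes of the node
first_name = collector.fields[0].value.id  # the Name node inside {a}

namespace = {"a": "World", "b": 10}
values = [evaluate_field(field, namespace) for field in collector.fields]
print(values)
```

Wrapping the sub-expression in ast.Expression is what makes it acceptable to compile() in "eval" mode. The nodes keep the position information assigned by ast.parse, so no extra fix-up is needed in this sketch.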

Tags: python string parsing quoting f-string


【Solution 1】:

Thanks to @ForceBru's hint, I got this done. The following code parses and processes source-level triple-quoted f-strings (ignore the processing parts):

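import ast
import sys

# zstring takes self, so it is meant to live on a class; _conversions must be
# reachable as self._conversions. boolsh and self.zsh_quote belong to the
# processing code and are not shown here.
_conversions = {'a': ascii, 'r': repr, 's': str}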

def zstring(self, template, locals_=None, getframe=1):
    if locals_ is None:
        previous_frame = sys._getframe(getframe)
        previous_frame_locals = previous_frame.f_locals
        locals_ = previous_frame_locals

    def asteval(astNode):
        if astNode is not None:
            return eval(compile(ast.Expression(astNode), filename='<string>', mode='eval'), locals_)
        else:
            return None

    def eatFormat(format_spec, code):
        res = False
        if format_spec:
            flags = format_spec.split(':')
            res = code in flags
            format_spec = list(filter(lambda a: a != code,flags))
        return ':'.join(format_spec), res


    p = ast.parse(f"f'''{template}'''")
    result = []
    parts = p.body[0].value.values
    for part in parts:
        typ = type(part)
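        # On Python 3.8+ literal text is parsed as ast.Constant
        # (ast.Str is deprecated since 3.8 and was removed in 3.12).
        if typ is ast.Constant:
            result.append(part.value)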
        elif typ is ast.FormattedValue:
            # print(part.__dict__)

            value = asteval(part.value)
            conversion = part.conversion
            if conversion >= 0:
                # parser doesn't support custom conversions
                conversion = chr(conversion)
                value = self._conversions[conversion](value)

            format_spec = asteval(part.format_spec) or ''
            # print(f"orig format: {format_spec}")
            format_spec, fmt_eval = eatFormat(format_spec, 'e')
            format_spec, fmt_bool = eatFormat(format_spec, 'bool')
            # print(f"format: {format_spec}")
            if format_spec:
                value = format(value, format_spec)
            if fmt_bool:
                value = boolsh(value)

            value = str(value)
            if not fmt_eval:
                value = self.zsh_quote(value)
            result.append(value)
    cmd = ''.join(result)
    return cmd
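
For readers who only want the parsing part, here is a reduced, standalone sketch of the same idea, assuming Python 3.8 or later. It is not the code above: the shell quoting and the custom 'e' and 'bool' flags are left out, and the names render, _eval_node, and shout are hypothetical. It also assumes the template contains no ''' sequence, does not end with a single quote, and contains no backslashes.

```python
import ast

_CONVERSIONS = {"a": ascii, "r": repr, "s": str}

def _eval_node(node, namespace):
    """Compile a single expression node and evaluate it in namespace."""
    code = compile(ast.Expression(body=node), "<template>", "eval")
    return eval(code, namespace)

def render(template, namespace):
    """Evaluate template as if it were the body of a triple-quoted f-string."""
    # Assumption: template does not contain ''' (see the lead-in above).
    node = ast.parse(f"f'''{template}'''", mode="eval").body
    if isinstance(node, ast.Constant):  # defensive: plain text only
        return node.value
    out = []
    for part in node.values:  # node is an ast.JoinedStr
        if isinstance(part, ast.Constant):  # literal text between fields
            out.append(part.value)
            continue
        value = _eval_node(part.value, namespace)
        if part.conversion >= 0:  # -1 means no !a / !r / !s conversion
            value = _CONVERSIONS[chr(part.conversion)](value)
        # format_spec is itself a JoinedStr node, or None when absent.
        spec = _eval_node(part.format_spec, namespace) if part.format_spec is not None else ""
        out.append(format(value, spec))
    return "".join(out)

# shout is a stand-in for the nested z(...) call from the question.
namespace = {"a": "World", "b": 10, "shout": str.upper}
print(render("Hello {a} --- {a:^30} --- {67+b} --- {a!r}", namespace))
print(render('''
echo monkey {shout("curl -s https://example.com | jq '.[0]'")} end
''', namespace))
```

Unlike the answer's version, this sketch takes the namespace explicitly instead of reading the caller's frame with sys._getframe, which keeps it easy to test in isolation.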

【Comments】:
