Modifying Python AST while preserving comments

【Question title】: Modifying Python AST while preserving comments
【Posted】: 2015-01-16 06:22:17
【Question description】:

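I am currently working with the AST in Python. I take in a Python file, generate its AST, modify it, and then recompile it back to source code. I am using a transformer that adds getters to a class (I am using the visitor pattern with ast.NodeTransformer). My code currently works as expected, but it does not preserve comments, which is my problem. Here is my code: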

import ast
import meta.asttools

things = {}  # module-level settings shared with genTransformer

# visits nodes and generates getters or setters
def genGet(file, type, func):
    global things
    things['func'] = func
    things['type'] = type
    with open(file) as f:
        code = f.read()             # get the code
    tree = ast.parse(code)          # make the AST from the code
    genTransformer().visit(tree)    # generate getters or setters, depending on the type argument
    source = meta.asttools.dump_python_source(tree)  # turn the modified AST back into source code
    newfile = "{}{}".format(file[:-3], "_mod.py")
    print("attempting to write source code to new file: {}".format(newfile))
    with open(newfile, 'w') as outputfile:
        outputfile.write(source)    # write the new source code to a file


class genTransformer(ast.NodeTransformer):
    ...

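I have done some research into lib2to3, which apparently can preserve comments, but so far I have not found anything that helps with my problem. For example, I found the code below, but I don't really understand it. It appears to preserve comments but does not allow my modifications: I get a missing attribute error when I run it.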

import lib2to3
from lib2to3.pgen2 import driver
from lib2to3 import pygram, pytree
import ast

def main():
    filename = "exfunctions.py"
    with open(filename) as f:
        code = f.read()
    drv = driver.Driver(pygram.python_grammar, pytree.convert)
    tree = drv.parse_string(code, True)
    # the ast transformer breaks if it is placed here
    print(str(tree))

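I have been unable to find a package or strategy that preserves comments while transforming the AST. So far my research has not helped me. What can I use to modify the AST and also preserve the comments?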

【Comments】:

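Comments are not part of the AST, just as they are not part of the bytecode generated for Python source code. Like blank lines, they are discarded when the AST nodes are created.
The 2to3 library uses its own tokenizer and parser; the lib2to3.pgen2.tokenize source contains the comment: It is designed to match the working of the Python tokenizer exactly, except that it produces COMMENT tokens for comments and gives type OP for all operators
@IraBaxter: no, its custom parser preserves them.
The parse tree produced by the lib2to3 parser is not compatible with the ast module; I think you'd have to see if you can turn the fixers support into something you can reuse.
@IraBaxter: the AST uses the normal Python parser. The lib2to3 parser produces its own tree (not compatible with the ast module). It does preserve comments; a comment in lib2to3.pytree states: This is a very concrete parse tree; we need to keep every token and even the comments and whitespace between tokens.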
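
To make the point in these comments concrete, here is a minimal sketch that assumes Python 3 and uses only the standard library. It uses the stdlib tokenize module rather than lib2to3's own tokenizer (an assumption for illustration; both emit COMMENT tokens), and the SOURCE string is made up for the example. It shows that the comment never reaches the tree built by ast.parse, while the token stream still carries it.

```python
import ast
import io
import tokenize

# Hypothetical one-line input, chosen only for this illustration.
SOURCE = "x = 1  # keep me\n"

# ast.parse builds the tree from the grammar alone; no node holds the comment.
tree = ast.parse(SOURCE)
dumped = ast.dump(tree)

# The tokenizer, by contrast, reports the comment as a COMMENT token.
comments = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(SOURCE).readline)
    if tok.type == tokenize.COMMENT
]

print("keep me" in dumped)
print(comments)
```

This is why any tool that regenerates source from an ast tree alone has nothing to restore the comments from; a comment-preserving tool has to work from the tokens or from a concrete syntax tree.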

Tags: python abstract-syntax-tree transformer

【Solution 1】:

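LibCST is a Python concrete syntax tree parser and toolkit that can be used to solve your problem. It provides a syntax tree that looks like ast but preserves formatting information. It also provides a transformer pattern for modifying the tree.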

https://github.com/Instagram/LibCST/

https://libcst.readthedocs.io/en/latest/index.html

import libcst as cst

class NameTransformer(cst.CSTTransformer):
    def leave_Name(self, original_node, updated_node):
        return cst.Name(updated_node.value.upper())

With a NameTransformer like this, we can convert all names in the source code to uppercase:

>>> m = cst.parse_module("def fn(): return (value)")
>>> m.visit(NameTransformer()).code
'def FN(): return VALUE'

【Discussion】: