【Question title】：Replacing python docstrings [closed]
【Posted】：2010-03-19 14:12:03
【Question】：

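I have written an epytext to reST markup converter, and now I want to convert all the docstrings in my entire library from epytext to reST format.

Is there a smart way to read all the docstrings in a module and write back the replacements?

PS: the ast module, maybe?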

【Comments】：

Tags: python documentation documentation-generation docstring

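【Solution 1】：

Pyment is a tool that can convert Python docstrings and create missing skeletons. It can manage the Google, Epydoc (javadoc style), Numpydoc, and reStructuredText (reST, the Sphinx default) docstring formats.

It accepts a single file or a folder (sub-folders are explored too). For each file, it recognizes each docstring format and converts it to the desired one. At the end, a patch is generated that can be applied to the file.

To convert your project:

Install Pyment

Type the following (you can use a virtualenv):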

$ git clone https://github.com/dadadel/pyment.git
$ cd pyment
$ python setup.py install

Convert from Epydoc to Sphinx

You can convert your project to the Sphinx format (reST), which is the default output format, by running:

$ pyment /my/folder/project

Edit:

Install using pip:

$ pip install git+https://github.com/dadadel/pyment.git

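【Comments】：

【Solution 2】：

It might be overkill for this simple usage, but I would look into using the machinery of 2to3 to do the editing. You just need to write a custom fixer. It is not well documented, but "Developer's Guide to Python 3.0: Python 2.6 and Migrating From 2 to 3: More about 2to3" and "Implement Custom Fixers" give enough detail to get started...

Epydoc seems to contain a to_rst() method which might help you actually translate the docstrings. I don't know if it is any good, though...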

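【Comments】：

【Solution 3】：

Probably the most straightforward approach is to just do it the old-fashioned way. Here is some initial code to get you going. It could be prettier, but it should give the basic idea: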

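def is_docstr_bound(line):
    """Return True if the line contains a triple-quote delimiter."""
    return "'''" in line or '"""' in line


def is_one_line_docstr(line):
    """Return True if a triple-quoted string opens and closes on this line."""
    return line.count("'''") >= 2 or line.count('"""') >= 2


docstr_found = False
docstr = []

# XXX: output using the same name to some other folder
with open('input.py') as f, open('output.py', 'w') as output:
    for line in f:
        if docstr_found:
            if is_docstr_bound(line):
                # XXX: do conversion of the collected lines now
                # ...

                # and write to output
                output.write(''.join(docstr))
                output.write(line)

                docstr = []
                docstr_found = False
            else:
                docstr.append(line)
        else:
            # A one-line docstring opens and closes on the same line,
            # so it must not switch us into "inside docstring" mode.
            if is_docstr_bound(line) and not is_one_line_docstr(line):
                docstr_found = True

            output.write(line)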

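To make it really work, you need to hook it up with a file finder and write the output files to another directory. See the os.path module for reference.

I know the docstring boundary check is potentially really weak. It is probably a good idea to beef it up a bit (strip the line and check whether it starts or ends with a docstring delimiter).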
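
A slightly stronger boundary check along those lines could look like the following. This is only a sketch: the helper names (opens_docstr, closes_docstr, is_one_liner) are made up for illustration, and it still assumes plain triple-quoted strings without r or u prefixes.

```python
TRIPLE_QUOTES = ('"""', "'''")


def opens_docstr(line):
    """Return the delimiter if the stripped line starts with one, else None."""
    stripped = line.strip()
    for quote in TRIPLE_QUOTES:
        if stripped.startswith(quote):
            return quote
    return None


def closes_docstr(line, quote):
    """Return True if the stripped line ends with the given delimiter."""
    return line.strip().endswith(quote)


def is_one_liner(line):
    """Return True if the string opens and closes on this single line."""
    quote = opens_docstr(line)
    if quote is None:
        return False
    stripped = line.strip()
    # Both delimiters must fit, so a bare opening delimiter does not count.
    return len(stripped) >= 2 * len(quote) and stripped.endswith(quote)
```

Remembering which delimiter opened the string and only accepting the same one as the closer avoids being fooled by a ''' that appears inside a """ block. An assignment such as x = '''abc''' no longer counts as an opening line, although a triple-quoted string that happens to start a line in the middle of a function still would.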

I hope this gives you an idea of how to proceed. Perhaps there is a more elegant way to handle the problem. :)

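【Comments】：

Walking my directory structure and opening/reading/writing files is trivial. My question is: is there a smart way to read all the docstrings in a module and write back the replacements? This cannot be done naively with a mechanism like regular expressions (such as re.finditer('\"\"\"(.*)\"\"\"', source)), because I don't want to mess up the rest of the code.
I found a similar question that might interest you. See stackoverflow.com/questions/768634/….
Docstrings don't need to be triple-quoted strings, and not everything written as a triple-quoted string is a docstring, so this only works for a subset of Python docstrings.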
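
One way around both objections is to let the ast module (which the question itself hints at) decide what is a docstring and where it sits in the file. The sketch below is not from any of the answers; it assumes Python 3.8+ (for end_lineno and end_col_offset), and the names docstring_nodes, replace_docstrings, and convert are made up for illustration. convert stands in for your own epytext to reST function.

```python
import ast

DOC_OWNERS = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def docstring_nodes(source):
    """Yield the ast.Constant node of every real docstring in source."""
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, DOC_OWNERS) or not node.body:
            continue
        first = node.body[0]
        # A docstring is a bare string expression that comes first in a body,
        # whatever quoting style was used to write it.
        if (isinstance(first, ast.Expr)
                and isinstance(first.value, ast.Constant)
                and isinstance(first.value.value, str)):
            yield first.value


def replace_docstrings(source, convert):
    """Return source with each docstring replaced by convert(docstring)."""
    lines = source.splitlines(keepends=True)
    nodes = sorted(docstring_nodes(source),
                   key=lambda n: (n.lineno, n.col_offset), reverse=True)
    # Work bottom-up so earlier line numbers stay valid after each splice.
    for node in nodes:
        new_text = convert(node.value)
        # Assumption of this sketch: text that would need escaping inside
        # a """...""" literal is left untouched rather than handled.
        if '\\' in new_text or '"""' in new_text or new_text.endswith('"'):
            continue
        # ast column offsets count UTF-8 bytes, so slice the encoded lines.
        first = lines[node.lineno - 1].encode('utf-8')
        last = lines[node.end_lineno - 1].encode('utf-8')
        prefix = first[:node.col_offset].decode('utf-8')
        suffix = last[node.end_col_offset:].decode('utf-8')
        literal = '"""' + new_text + '"""'
        lines[node.lineno - 1:node.end_lineno] = [prefix + literal + suffix]
    return ''.join(lines)
```

Calling replace_docstrings(open('input.py').read(), my_converter) would then return the rewritten source, with every line outside the docstring literals left byte-for-byte as it was. Note that the converter sees the evaluated string, so escape sequences in the original literal have already been processed.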

【Solution 4】：

I wonder about a combination of introspection and source processing. Here is an untested sketch:

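import foo  # where foo is your module

with open('foo.py', 'r') as f:
    src = f.read()  # one string, so str.replace() works below

for name in dir(foo):  # probably better ways to do this...
    pything = getattr(foo, name)  # dir() yields names, not objects
    docstring = getattr(pything, '__doc__', None)
    if not docstring or docstring not in src:
        # no docstring here, or it was defined in some other module
        continue

    # modify the docstring (my_format_changer is your own converter)
    new_docstring = my_format_changer(docstring)

    # now replace it in the source
    src = src.replace(docstring, new_docstring)

# When done, write it out
with open('new_foo.py', 'w') as fout:
    fout.write(src)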

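Obviously, you would have to add some cleverness to the code that traverses the module looking for objects with docstrings, so that it recurses, but this gives you the general idea.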
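
The recursion could be sketched roughly as follows, using the standard inspect module. The function name iter_docstrings and its parameters are made up for illustration. It only follows classes and plain functions defined in the module itself, so imported names are skipped, and so are properties, classmethods, and staticmethods.

```python
import inspect


def iter_docstrings(obj, module_name, seen=None):
    """Yield (qualified_name, raw_docstring) for obj and its nested members."""
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return  # already visited, e.g. a method reached via a subclass
    seen.add(id(obj))

    doc = getattr(obj, '__doc__', None)
    if isinstance(doc, str):
        name = getattr(obj, '__qualname__', getattr(obj, '__name__', repr(obj)))
        yield name, doc

    # Only modules and classes have members worth descending into.
    if inspect.ismodule(obj) or inspect.isclass(obj):
        for _, member in inspect.getmembers(obj):
            is_code = inspect.isclass(member) or inspect.isfunction(member)
            if is_code and getattr(member, '__module__', None) == module_name:
                yield from iter_docstrings(member, module_name, seen)
```

For a module foo, iterating over iter_docstrings(foo, foo.__name__) gives the raw docstrings that the replacement step would then look up in the source text.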

【Comments】：