Force Sphinx to interpret Markdown in Python docstrings instead of reStructuredText

【Question title】: Force Sphinx to interpret Markdown in Python docstrings instead of reStructuredText
【Posted】: 2019-09-27 11:10:58
【Question】:

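I am using Sphinx to document a Python project. I would like to use Markdown in my docstrings to format them. Even though I use the recommonmark extension, it only covers the manually written .md files, not the docstrings.

I use autodoc, napoleon and recommonmark in my extensions.

How can I make Sphinx parse Markdown in my docstrings?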

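【Comments】:

A search of the docs returns this first result.
Yes, that result talks about recommonmark, which only covers the use case of writing manual documentation in Markdown. The extension does not make Sphinx parse your docstrings as Markdown. I edited my question to state this explicitly.
You are not the first to ask this question, but unfortunately there is no answer there either.

Tags: python markdown python-sphinx sphinx-napoleon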

【Solution 1】:

Sphinx's Autodoc extension emits an event named autodoc-process-docstring every time it processes a docstring. We can hook into that mechanism to convert the syntax from Markdown to reStructuredText.
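As a minimal sketch of how such a handler behaves: Autodoc passes the docstring as a list of lines and ignores the handler's return value, so the handler has to modify that list in place. The handler names below are hypothetical, and the handlers are called by hand here rather than by Sphinx.

```python
def rebinding_handler(app, what, name, obj, options, lines):
    # Rebinds the local name only; the caller's list is left untouched.
    lines = [line.upper() for line in lines]


def in_place_handler(app, what, name, obj, options, lines):
    # Slice assignment replaces the contents of the caller's list.
    lines[:] = [line.upper() for line in lines]


# Simulate what Autodoc does: pass the docstring as a list of lines.
doc = ["some *text*", "more text"]

rebinding_handler(None, "function", "pkg.func", None, None, doc)
unchanged = list(doc)  # snapshot after the handler that has no effect

in_place_handler(None, "function", "pkg.func", None, None, doc)
changed = list(doc)    # snapshot after the handler that edits in place
```

Only the second handler changes what Sphinx would go on to render.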

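Unfortunately, Recommonmark does not expose a Markdown-to-reST converter. It maps the parsed Markdown directly to a Docutils object, i.e. the same representation that Sphinx itself creates internally from reStructuredText.

Instead, I use Commonmark for the conversion in my own project, because it is fast: much faster than Pandoc, for example. Speed matters because the conversion happens on the fly and handles each docstring individually. Other than that, any Markdown-to-reST converter would do. M2R2 would be a third example. The downside of any of these is that they do not support Recommonmark's syntax extensions, such as cross-references to other parts of the documentation. They handle just the basic Markdown.

To plug in the Commonmark docstring converter, make sure the package is installed (pip install commonmark) and add the following to Sphinx's configuration file conf.py: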

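import commonmark

def docstring(app, what, name, obj, options, lines):
    # Autodoc passes the docstring as a list of lines.
    md  = '\n'.join(lines)
    ast = commonmark.Parser().parse(md)
    rst = commonmark.ReStructuredTextRenderer().render(ast)
    # Replace the contents in place so that Autodoc sees the result.
    lines.clear()
    lines += rst.splitlines()

def setup(app):
    # Run the converter for every docstring that Autodoc processes.
    app.connect('autodoc-process-docstring', docstring)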
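
Since any Markdown-to-reST converter would do, the hook can also be written independently of the converter. The following is only a sketch: `make_docstring_hook` and `toy_convert` are hypothetical names, and `toy_convert` is a stand-in used for demonstration, not a real Markdown converter.

```python
from typing import Callable, List


def make_docstring_hook(convert: Callable[[str], str]):
    """Build an autodoc-process-docstring handler around any converter.

    `convert` is assumed to take Markdown text and return reStructuredText.
    """
    def docstring(app, what, name, obj, options, lines: List[str]) -> None:
        rst = convert("\n".join(lines))
        # Autodoc expects the list to be modified in place.
        lines[:] = rst.splitlines()
    return docstring


def toy_convert(md: str) -> str:
    # Stand-in converter for demonstration only: turns Markdown inline
    # code (`x`) into reST inline literals (``x``). Not a real converter.
    return md.replace("`", "``")


hook = make_docstring_hook(toy_convert)
doc = ["Return `None` on failure."]
hook(None, "function", "pkg.func", None, None, doc)
```

In conf.py, one would pass a real converter instead, for example a small function wrapping the Commonmark calls shown above, and register the resulting hook with app.connect('autodoc-process-docstring', hook).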

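In the meantime, Recommonmark was deprecated in May 2021. The Sphinx extension MyST, a more feature-rich Markdown parser, is the replacement recommended by Sphinx and by Read-the-Docs. MyST does not yet support Markdown in docstrings either, but the same hook as above can be used to get limited support via Commonmark.

A possible alternative to the approach outlined here is using MkDocs with the MkDocStrings plugin, which would eliminate Sphinx and reStructuredText from the process entirely.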

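【Comments】:

Great solution, thanks! (I would use lines[:] = rst.splitlines())
To my surprise, commonmark parses rst-formatted comments fairly well, but it still gets some things wrong. So for projects with a mix of rst and Markdown comments, I made a simple change to this: it only parses a docstring as Markdown if the docstring starts with a special marker line. (I have a large project and really wanted to write some long things in Markdown, but did not want to convert all docstrings to Markdown.) gist.github.com/…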
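
The opt-in idea from the last comment could look roughly like the sketch below. It is not the code from the linked gist: the marker string, the function names, and the stand-in converter are all assumptions made for illustration.

```python
MARKER = "<!-- markdown -->"  # hypothetical marker, not taken from the gist


def convert_if_marked(lines, convert, marker=MARKER):
    """Convert a docstring in place only if its first non-blank line is the marker."""
    first = next((i for i, line in enumerate(lines) if line.strip()), None)
    if first is None or lines[first].strip() != marker:
        return False  # leave reStructuredText docstrings untouched
    md = "\n".join(lines[first + 1:])
    lines[:] = convert(md).splitlines()
    return True


def toy_convert(md):
    # Stand-in for a real Markdown-to-reST converter (demonstration only).
    return md.replace("`", "``")


marked = ["<!-- markdown -->", "Uses `foo`."]
plain = ["Uses ``foo``."]
was_marked = convert_if_marked(marked, toy_convert)
was_plain = convert_if_marked(plain, toy_convert)
```

Docstrings without the marker pass through unchanged, so existing reStructuredText docstrings keep working while new ones can opt in to Markdown.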

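【Solution 2】:

Building on @john-hennig's answer, the following preserves reStructuredText roles such as :py:attr:, :py:class:, etc. This allows you to reference other classes and so on.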

import re
import commonmark

py_attr_re = re.compile(r"\:py\:\w+\:(``[^:`]+``)")

def docstring(app, what, name, obj, options, lines):
    md  = '\n'.join(lines)
    ast = commonmark.Parser().parse(md)
    rst = commonmark.ReStructuredTextRenderer().render(ast)
    lines.clear()
    lines += rst.splitlines()

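    # Commonmark renders `target` as ``target``; inside a :py: role, strip
    # one backtick from each side so that Sphinx can resolve the reference.
    for i, line in enumerate(lines):
        while True:
            match = py_attr_re.search(line)
            if match is None:
                break

            start, end = match.span(1)
            line_start = line[:start]
            line_end = line[end:]
            line_modify = line[start:end]
            line = line_start + line_modify[1:-1] + line_end
        lines[i] = line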

def setup(app):
    app.connect('autodoc-process-docstring', docstring)
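
The backtick fix-up on its own can also be expressed with a single substitution, which makes it easy to try outside of Sphinx. This is a sketch under the assumption that the converter has turned the role target into a double-backtick literal, as described above; `ROLE_RE` and `restore_roles` are hypothetical names.

```python
import re

# Same pattern as above, with the role prefix and the target captured separately.
ROLE_RE = re.compile(r"(:py:\w+:)``([^:`]+)``")


def restore_roles(line: str) -> str:
    # Turn :py:class:``Foo`` back into :py:class:`Foo`.
    return ROLE_RE.sub(r"\1`\2`", line)


fixed = restore_roles("See :py:class:``Foo`` and :py:attr:``Foo.bar``.")
untouched = restore_roles("A plain ``literal`` stays as it is.")
```

Literals that do not follow a :py: role are left alone, so ordinary inline code keeps its double backticks.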

【Solution 3】:

I had to extend the accepted answer by john-hen so that multi-line descriptions of Args: entries are treated as a single parameter:

import commonmark

def docstring(app, what, name, obj, options, lines):
  wrapped = []
  literal = False
  for line in lines:
    if line.strip().startswith(r'```'):
      literal = not literal
    if not literal:
      line = ' '.join(x.rstrip() for x in line.split('\n'))
    indent = len(line) - len(line.lstrip())
    if indent and not literal:
      wrapped.append(' ' + line.lstrip())
    else:
      wrapped.append('\n' + line.strip())
  ast = commonmark.Parser().parse(''.join(wrapped))
  rst = commonmark.ReStructuredTextRenderer().render(ast)
  lines.clear()
  lines += rst.splitlines()

def setup(app):
  app.connect('autodoc-process-docstring', docstring)

【Solution 4】:

The current answer by @john-hennig is great, but it seems to fail for Python-style multi-line Args: entries. Here is my fix:


import commonmark


def docstring(app, what, name, obj, options, lines):
    md = "\n".join(lines)
    ast = commonmark.Parser().parse(md)
    rst = commonmark.ReStructuredTextRenderer().render(ast)

    lines.clear()
    lines += _normalize_docstring_lines(rst.splitlines())


def _normalize_docstring_lines(lines: list[str]) -> list[str]:
    """Fix an issue with multi-line args which are incorrectly parsed.

    ```
    Args:
        x: My multi-line description which fit on multiple lines
          and continue in this line.
    ```

    Is parsed as (missing indentation):

    ```
    :param x: My multi-line description which fit on multiple lines
    and continue in this line.
    ```

    Instead of:

    ```
    :param x: My multi-line description which fit on multiple lines
        and continue in this line.
    ```

    """
    is_param_field = False

    new_lines = []
    for line in lines:
        if line.lstrip().startswith(":param"):
            is_param_field = True
        elif is_param_field:
            if not line.strip():  # Blank line ends the param field
                is_param_field = False
            else:  # Restore indentation of the continuation line
                line = "    " + line.lstrip()
        new_lines.append(line)
    return new_lines


def setup(app):
    app.connect("autodoc-process-docstring", docstring)

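【Comments】:

I understand the desire to have Google-style docstrings. Clearly a lot of people have gotten used to that style. (Yours is the second answer on this page that focuses on it.) But it should be noted that this is not Markdown. What you are doing there is not a fix, it is a custom extension of the Markdown syntax.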