Extract comment groups from a bibtex file (in Python?)

【Question title】: extract comment groups from bibtex file (in Python?)
【Posted】: 2018-04-10 18:34:12
【Question】:

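I am trying to extend the set of features available for managing groups in BibDesk, and I would like to programmatically manipulate the bibtex comments in which BibDesk writes the information for static groups.

To do that, I need a systematic and reliable way to get at everything inside this comment section of a bibtex file: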

@comment{BibDesk Static Groups{
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<array>
    <dict>
        <key>group name</key>
        <string>MyGroupName</string>
        <key>keys</key>
        <string>BitexRefId1,BitexRefId2</string>
    </dict>
</array>
</plist>
}}

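Once I get to the XML array, I think I know how to handle it, but the first part, getting at the contents of @comment{BibDesk Static Groups{, is a bit tricky for me. I would know how to do it with sed, using sed -e '/@comment{BibDesk Static Groups{/,/}/!d' test.bib, but what is the Pythonic way? The best I have come up with is essentially a homemade parser: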

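xml_groups = []
static_groups_group = False  # True while we are inside the static groups block
with open(file_name, "r") as file:
    for line in file:
        if static_groups_group:
            # the first line containing a closing brace ends the block
            if "}" in line:
                static_groups_group = False
                print("ending static group block")
        if static_groups_group:
            xml_groups.append(line)
        if "@comment{BibDesk Static Groups{" in line:
            print(line, " found")
            static_groups_group = True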

【Question comments】:

This library might do it: bibtexparser.readthedocs.io

Tags: python parsing sed bibtex

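【Solution 1】:

Here is a quick and dirty translation of your sed command. I would not necessarily recommend this approach, though, because it is not particularly robust: the non-greedy match stops at the first }} it encounters, so it assumes that sequence never appears inside the XML itself.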

import re

with open(file_name) as fp:
    text = fp.read()

groups = re.findall(r'\@comment\{BibDesk Static Groups\{(.*?)\}\}', text, re.DOTALL)
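Once the block has been captured, the XML inside it is an ordinary Apple property list, so the standard-library plistlib module can probably take it from there. The following is only a minimal sketch under a few assumptions: the sample text mirrors the block from the question, the function name read_static_groups is made up for illustration, and the "keys" string is assumed to be a plain comma-separated list of cite keys.

```python
import plistlib
import re

# Hypothetical sample that mirrors the block shown in the question.
SAMPLE_BIB = """@article{BitexRefId1, title = {Example}}

@comment{BibDesk Static Groups{
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<array>
    <dict>
        <key>group name</key>
        <string>MyGroupName</string>
        <key>keys</key>
        <string>BitexRefId1,BitexRefId2</string>
    </dict>
</array>
</plist>
}}
"""

# Same pattern as above, compiled once. It still assumes that "}}" never
# occurs inside the plist itself.
GROUP_BLOCK = re.compile(r"@comment\{BibDesk Static Groups\{(.*?)\}\}", re.DOTALL)


def read_static_groups(bib_text):
    """Return a dict mapping each static group name to its list of cite keys."""
    match = GROUP_BLOCK.search(bib_text)
    if match is None:
        return {}
    # The XML declaration must be the very first thing in the document,
    # so strip the newline that follows the opening brace.
    plist_xml = match.group(1).strip()
    entries = plistlib.loads(plist_xml.encode("utf-8"), fmt=plistlib.FMT_XML)
    # Assumption: "keys" holds a comma-separated list of cite keys.
    return {
        entry["group name"]: [key for key in entry["keys"].split(",") if key]
        for entry in entries
    }


if __name__ == "__main__":
    print(read_static_groups(SAMPLE_BIB))
```

For a real file you would pass in the text read from file_name instead of SAMPLE_BIB. Writing modified groups back should be possible with plistlib.dumps, though I have not checked whether BibDesk accepts the result unchanged.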

【Comments】: