Removing LaTeX characters from string in python [duplicate]: answers

【Question title】: Removing LaTeX characters from string in python [duplicate]
【Posted】: 2018-05-02 06:55:24
【Question】:

I have a TeX document like this:

s = '\textbf{1 + 1} \begin{center} \textbf{some text in here:} \end{center} and \textbf{2} etc'

I want to remove each \textbf{ and its closing brace },

so that the final text looks like this:

1 + 1 \begin{center} some text in here: \end{center} and 2 etc

This is what I have tried so far:

import re 

re.sub(r'\textbf{(.*)}', '\\1', s)

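【Comments】:

Show us how you tried to solve this. Did it work? If not, what went wrong?
One option is github.com/alvinwan/texsoup. It is overkill for the current task, but if your task gets slightly more complex (for example, replacing bold text only inside \begin{...}...\end{...} environments), TexSoup will be useful: soup = TexSoup(r"\textbf{1 + 1} \begin{center} \textbf{some text in here:} \end{center} and \textbf{2} etc"); [sub.replace(sub.args[0]) for sub in soup.find_all('textbf')]; print(soup) Disclaimer: I wrote this library. Also, using a list comprehension when you do not use its return value is bad practice.
@alvinwan This is awesome. Thanks!!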

【Solution 1】:

You can use the following regex:

\\textbf{([^}]*)}

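Explanation:

You were really close to a working regex. You only need to escape the first \ (otherwise \t is interpreted as a tab) and restrict the group to accept every character inside the curly braces except }, which is what [^}] does. With .* the match is greedy and runs to the last } in the string instead of stopping at the first one.

Output: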

1 + 1 \begin{center} some text in here: \end{center} and 2 etc
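For completeness, here is a minimal sketch of how the regex above might be applied with re.sub. It assumes the input is written as a raw string (r'...') so that the backslashes are literal, and it escapes the braces in the pattern for readability, which is a stylistic choice rather than part of the original answer. The variable names are illustrative.

```python
import re

# Assumption: the input is a raw string, so \t and \b stay as literal
# backslash sequences instead of becoming tab and backspace characters.
s = r'\textbf{1 + 1} \begin{center} \textbf{some text in here:} \end{center} and \textbf{2} etc'

# \\textbf matches a literal backslash followed by "textbf".
# [^}]* captures everything up to (not including) the first closing brace.
pattern = re.compile(r'\\textbf\{([^}]*)\}')

# Replace each \textbf{...} with just its captured contents.
result = pattern.sub(r'\1', s)
print(result)

# For comparison: the greedy .* version matches from the first \textbf{
# to the LAST closing brace in the string, so only one replacement happens.
greedy = re.sub(r'\\textbf\{(.*)\}', r'\1', s)
print(greedy)
```

Note that [^}]* stops at the first closing brace, so this sketch would not handle nested braces such as \textbf{a \emph{b} c}. For input like that, a parser such as the TexSoup library mentioned in the comments is probably the safer choice.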

Further reading:

http://www.rexegg.com/regex-quickstart.html

【Comments】: