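How to find and replace with sed, except when between curly braces?

【Question title】: How to find and replace with sed, except when between curly braces?
【Posted】: 2022-01-22 06:28:38
【Question】:

I have a command like this, which marks up occurrences of a word in a document for the index: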

sed -i "s/\b$line\b/\\\keywordis\{$line\}\{$wordis\}\{$definitionis\}/g" file.txt

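The problem is that it finds matches inside existing matches. For example, "hello" is replaced with \keywordis{hello}{a common greeting}, but then "greeting" might be searched for too, giving \keywordis{hello}{a common \keywordis{greeting}{a phrase used when meeting someone}}...

How can I make sed perform the replacement but ignore text inside curly braces?

In this case, the curly braces always appear on the same line.

【Question comments】: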

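Why sed? Why not use an actual programming language? Regarding "but then "greeting" might be searched too, an": creating a state machine in sed is very hard. It is "possible", but there is no point in doing it in sed except for academic purposes. Write a real parser in Perl or Python.
What are the contents of $line? You seem to be asking an XY question: you are asking about sed. Don't you really want to ask how to apply a specific format to your LaTeX document?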

Tags: sed replace

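【Solution 1】:

How can I make sed perform the replacement but ignore text inside curly braces?

First, tokenize the input. Put something unique, such as | or the byte \x01, around each existing \keywordis{hello}{a common greeting} entry and store the result in hold space, with something along the lines of s/\\the regex to match{hello}{a common greeting}/\x01&\x01/g.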
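The marking step might look like the following minimal sketch. It assumes GNU sed (for the \x01 escape) and that every existing entry has exactly the two-argument form \keywordis{...}{...} with no nested braces; the function name mark_entries is made up for illustration. The tr call is only there to make the markers visible.

```shell
# Hypothetical helper: wrap every existing \keywordis{...}{...} entry
# in \x01 bytes so that later steps can tell entries from plain text.
# Assumes GNU sed and entries without nested braces.
mark_entries() {
    sed 's/\\keywordis{[^{}]*}{[^{}]*}/\x01&\x01/g'
}

# Show the markers as | for readability.
printf '%s\n' 'say \keywordis{hello}{a common greeting} twice' |
    mark_entries | tr '\001' '|'
# prints: say |\keywordis{hello}{a common greeting}| twice
```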

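Then iterate over the elements in hold space. Use \n to separate the elements that have already been parsed from those that have not, that is, output from input. If an element matches the format \keywordis{hello}{a common greeting}, just move it to the front, before the newline in hold space; if it does not, perform the replacement. Here is an example: Identify and replace selective space inside given text file, which uses a double newline \n\n as the input/output separator.

Because, as you noted, a replacement can contain words that overlap with the patterns you are searching for, I believe the simplest approach is to shuffle the pattern space after every replacement, that is, move the finished part to the output side and start the whole process over for the current line.

Then, at the end, shuffle the hold space to remove the \x01 bytes, the newline and any leftovers, and output the result.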
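The same idea can be sketched without the hold space, as a pipeline rather than a single sed script. This is a simplified variant and not the procedure described above: it puts each existing entry on a line of its own, replaces only on the other lines, and glues everything back together. It assumes GNU sed and tr, assumes the byte \x02 never occurs in the input, and protects only complete \keywordis{...}{...} entries, not other {...} groups. The function name replace_outside_entries is hypothetical.

```shell
# Hypothetical helper, hard-coded for the word "hello".
# Step 1: mark each original line end with \x02, then put every existing
#         \keywordis{...}{...} entry on a line of its own.
# Step 2: replace "hello" only on lines that are not an existing entry.
# Step 3: join the pieces again and turn \x02 back into newlines.
replace_outside_entries() {
    sed 's/$/\x02/; s/\\keywordis{[^{}]*}{[^{}]*}/\n&\n/g' |
    sed '/^\\keywordis{/!s/\bhello\b/\\keywordis{hello}{a common greeting}/g' |
    tr -d '\n' | tr '\002' '\n'
}

printf '%s\n' '\keywordis{hello}{hello} hello and hello' | replace_outside_entries
# prints: \keywordis{hello}{hello} \keywordis{hello}{a common greeting} and \keywordis{hello}{a common greeting}
```

Because the g flag never rescans text it has just inserted, the "hello" inside a freshly written entry is not matched again within the same pass.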

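Overall, this is LaTeX. I believe doing it by hand would be simpler.

I simplified the process by "eating" the string from the back and placing each piece in front of the input/output separator inside pattern space. The following program: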

sed '
    # add our input/output separator - just a newline
    s/^/\n/

    : loop
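    # Debugging aid: uncomment the next line to dump the pattern space.
    # l 1000
    # Skip a trailing "\command" or "{stuff}": move it unchanged to the output side.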
    /^\([^\n]*\)\n\(.*\)\(\\[^{}]*\|{[^{}]*}\)$/{
        s//\3\1\n\2/
        b loop
    }
    # Replace hello followed by anything other than { or }.
    # We match up to the end of the input; the leading .* is greedy,
    # so it eats everything up to the last such hello.
    /^\([^\n]*\)\n\(.*\)hello\([^{}]*\)$/{
        s//\\keywordis{hello}{a common greeting}\3\1\n\2/
        b loop
    }
    # hello was not matched: move one ordinary character to the output side.
    # Note: this has to match at least one character after the newline.
    /^\([^\n]*\)\n\(.*\)\([^{}]\+\)$/{
        s//\3\1\n\2/
        b loop
    }

    s/\n//
' <<<'
\keywordis{hello}{hello} hello {some other hello} another hello yet
'

Output:

\keywordis{hello}{hello} \keywordis{hello}{a common greeting} {some other hello} another \keywordis{hello}{a common greeting} yet
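The program above hard-codes hello and its definition. To plug in shell variables like the $line in the question, the same script could be wrapped in a function, roughly as sketched here. The name mark_word is hypothetical, GNU sed is assumed, and the word and the definition are assumed to contain no characters that are special to sed (such as /, &, \, ., * or [). Like the program above, it matches the word anywhere, including inside longer words.

```shell
# Hypothetical wrapper: mark_word WORD DEFINITION, reading text on stdin.
# Assumes GNU sed and that WORD and DEFINITION contain no sed metacharacters.
mark_word() {
    sed '
        # Pattern space layout: <output>\n<remaining input>
        s/^/\n/
        : loop
        # Move a trailing "\command" or "{stuff}" to the output unchanged.
        /^\([^\n]*\)\n\(.*\)\(\\[^{}]*\|{[^{}]*}\)$/{
            s//\3\1\n\2/
            b loop
        }
        # Replace the last WORD that is followed only by brace-free text.
        /^\([^\n]*\)\n\(.*\)'"$1"'\([^{}]*\)$/{
            s//\\keywordis{'"$1"'}{'"$2"'}\3\1\n\2/
            b loop
        }
        # Otherwise move one ordinary character to the output.
        /^\([^\n]*\)\n\(.*\)\([^{}]\+\)$/{
            s//\3\1\n\2/
            b loop
        }
        s/\n//
    '
}

# The second pass leaves "greeting" alone because it now sits inside braces.
printf '%s\n' 'hello' |
    mark_word hello 'a common greeting' |
    mark_word greeting 'a phrase used when meeting someone'
# prints: \keywordis{hello}{a common greeting}
```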

【Comments】: