bash: filter away consecutive lines from text file

【Question title】: bash: filter away consecutive lines from text file
【Posted】: 2011-01-26 01:35:35
【Question】:

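I want to delete every instance of a paragraph from many files. By "paragraph" I mean a sequence of lines.

For example:

my first line
my second line
my third line
the fourth
5th and last

The problem is that I only want to delete these lines when they appear together as a group. For example, if

my first line

appears on its own, I do not want to delete it.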

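【Comments】:

Tags: perl bash string sed text-processing

【Solution 1】:

@OP, I see you accepted an answer in which the lines of your paragraph are "hard coded", so I assume the paragraph is always the same? If so, you can use grep. Store the paragraph you want removed in a file, e.g. "filter", then use grep's -f and -v options to do the job: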

grep -v -f filter file
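
A minimal sketch of the grep approach, assuming a pattern file named filter and an input file named file as in the command above (the scratch directory and sample contents are illustrative). The -F and -x flags are an addition here, not part of the answer: they make grep treat each pattern as a literal string that must match a whole line. Note that grep tests every line independently, so this also drops a line such as "my first line" when it appears on its own, which the question wanted to keep.

```shell
# Work in a scratch directory so no real files are touched.
tmp=$(mktemp -d)

# filter: the paragraph to remove, one pattern per line.
printf '%s\n' 'my first line' 'my second line' 'my third line' \
    'the fourth' '5th and last' > "$tmp/filter"

# file: the paragraph, an unrelated line, then a lone "my first line".
printf '%s\n' 'my first line' 'my second line' 'my third line' \
    'the fourth' '5th and last' 'hey' 'my first line' > "$tmp/file"

# -v inverts the match, -f reads patterns from a file,
# -F treats patterns as fixed strings, -x requires whole-line matches.
result=$(grep -v -F -x -f "$tmp/filter" "$tmp/file")
printf '%s\n' "$result"

rm -rf "$tmp"
```

With this sample only "hey" survives: the trailing, stand-alone "my first line" is filtered out along with the full paragraph.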

【Comments】:

【Solution 2】:

If you can use Perl, you can do it with a one-liner like this:

perl -0777 -pe 's/my first line\nmy second line\nmy third line\nthe fourth\n5th and last\n//g' paragraph_file

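The explanation is in perlrun:

The special value 00 will cause Perl to slurp files in paragraph mode. The value 0777 will cause Perl to slurp files whole because there is no legal byte with that value.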

Sample input:

my first line
my second line
my third line
the fourth
5th and last
hey
my first line
my second line
my third line
the fourth
5th and last

hello
my first line

Output:

$ perl -0777 -pe 's/my first line\nmy second line\nmy third line\nthe fourth\n5th and last\n//g' paragraph_file
hey

hello
my first line

【Comments】:

【Solution 3】:

You can do this with sed:

sed '$!N; /^\(.*\)\n\1$/!P; D' file_to_filter

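【Comments】:

How is this used? Where is the filter specified?
If your file is named "file_to_filter", the command in the answer will output your file with consecutive duplicate lines removed.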
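
A short sketch of what the sed command in this answer does, using a made-up four-line input. It takes no filter argument: it compares each line with the next one and prints a line only when the following line differs, so adjacent repeats collapse to one copy, much like uniq.

```shell
# Four input lines with one adjacent repeat ("a", "a").
# $!N  appends the next line to the pattern space (except on the last line)
# /^\(.*\)\n\1$/!P  prints the first line unless both lines are identical
# D    deletes the first line and restarts with what is left
result=$(printf '%s\n' a a b a | sed '$!N; /^\(.*\)\n\1$/!P; D')
printf '%s\n' "$result"
```

The output is a, b, a: the adjacent pair of "a" lines is reduced to one, while the later, non-adjacent "a" is kept. This removes repeated lines rather than a specific multi-line paragraph, which explains the confusion in the comment above.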