Perl find-and-replace (with numbered replacement) enters infinite loop: answers

【Question title】: Perl find-and-replace (with numbered replacement) enters infinite loop
【Posted】: 2016-05-11 07:22:43
【Question】:

I posted a similar question earlier today, and its solution led to a new problem. -,-

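The story is that I want Perl to capture the comments in a text, store them in an array, and then replace them with newly numbered comments. For example, given the original $txt: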

//first comment
this is a statement //second comment
//third comment
more statements //fourth comment

I want to push the 4 comments into an array and end up with a new $txt like this:

//foo_0
this is a statement //foo_1
//foo_2
more statements //foo_3

I tried the following Perl:

$i=0;
$j=0;
#while ($txt =~ s/(\/\/.*?\n)/\/\/foo_$i\n/gs) {
#while ($txt =~ s/(\/\/.*?\n)/\/\/foo_$i\n/s) {
#foreach ($txt =~ s/(\/\/.*?\n)/\/\/foo_$i\n/gs) {
foreach ($txt =~ s/(\/\/.*?\n)/\/\/foo_$i\n/s) {
        if(defined $1) {
                push (@comments, $1);
                print " \$i=$i\n";
                $i++
                }
        print " \$j=$j\n";
        $j++;
        }

print "after search & replace, we have \$txt:\n";
print $txt;

foreach (0..$#comments) {
        print "\@comments[$_]= @comments[$_]";
        }

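In it I tried four flavors of "while/foreach (... s///gs)", but none of them really does what I want.

The "foreach" statement only acts on the text once; worse, the "while" statement goes into an infinite loop. It seems the new "//foo_xx" content is put back into the string and searched again, so the iteration never ends. It is strange that such a seemingly simple find-and-replace mechanism can get stuck in an endless loop. Or is there some obvious trick I don't know about?

By the way, I have already read the post by highsciguy. For him, "just replacing while with foreach in the code above" did the job; for me, foreach doesn't work, and I don't know why.

Does anyone have an idea that could help me solve this? Thanks~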

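【Comments】:

You should always use use strict; use warnings;, because knowing about your mistakes is better than not knowing.
Yes, thanks for the advice; I just liked Perl's loose, flexible syntax and never realized how important strict and warnings are. I'll give it a try.
Once you start using them, you'll begin to understand what you're doing, and why things sometimes don't work. Honestly, working without them turned on is like working blindfolded.
Right. I'll try to stick with it. Thanks~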

Tags: regex perl replace find infinite-loop

【Solution 1】:

I would approach it differently: a while loop that reads the filehandle line by line and "grabs" every comment from each line.

Something like this:

#!/usr/bin/perl

use warnings;
use strict;

my @comments; 

#iterate stdin or filename specified on command line
while ( <> ) { 
   #replace anything starting with // with foo_nn
   #where nn is current number of comments. 
   s,//(.*),"//foo_".@comments,e && push (@comments, $1 );
   #$1 is the contents of that bracket - the comment text we replaced
   #stuff it into @comments (only when the substitution succeeded)

   #print the current line (altered by the above)
   print;
}
#print the comments. 
print "Comments:\n", join "\n", @comments;

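It doesn't handle duplicates, and it will break if you have // inside quotes or something similar, but it works for your example. The while loop iterates line by line over the filehandle. If your text blob is already in a scalar, you can do the same thing with foreach ( split ( "\n", $text ) ) {

Output: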

//foo_0
this is a statement //foo_1
//foo_2
more statements //foo_3
Comments:
first comment
second comment
third comment
fourth comment
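
The replacement in the code above leans on two Perl behaviors: an array used with the concatenation operator is evaluated in scalar context (giving its element count), and the /e modifier makes the replacement side run as Perl code. A minimal sketch of both, using made-up sample values ($label and $line are illustrative names, not part of the original answer):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @comments = ('first comment', 'second comment');

# Concatenation (.) puts @comments in scalar context,
# so the array yields its element count (2 here).
my $label = "//foo_" . @comments;
print "$label\n";

# With /e the replacement side is evaluated as Perl code,
# so the same expression can be used inside s///.
my $line = "this is a statement //third comment";
if ( $line =~ s{//(.*)}{"//foo_" . @comments}e ) {
    # Only trust $1 after a successful match: on a failed match
    # it still holds the capture from an earlier success.
    push @comments, $1;
}
print "$line\n";
print scalar(@comments), "\n";
```

Guarding the push with the result of s/// (here via if, in the answer via &&) is what keeps a stale $1 from being stored for lines that contain no comment.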

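【Comments】:

What does .@comments,e; do?
The e regex modifier means eval: the replacement is evaluated as a Perl expression. The . concatenates and forces scalar context on @comments, so it "returns" a number equal to the element count. If @comments has 4 elements, the replacement will be //foo_4.
@syck - it works with the sample data, but yes, it does seem to do that if there are blank lines (presumably because there is no match but $1 is still defined). Amended.
It can't even be unset.
Nice, searching and replacing one line at a time really does break the infinite loop! But why is that? Using s/// on the whole text (with newlines) gets stuck in an endless search loop, while using it on a single line does not. Why?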
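
On the "why" raised in the last comment: the loop in the question never ends because the replacement text "//foo_N\n" itself matches the pattern. Without /g, each s/// call starts again at the beginning of the string and rewrites the first comment over and over; with /g, each pass of the while loop repeats the whole global substitution, which always succeeds. The line-by-line versions terminate because each line gets exactly one s/// call. The whole-string case can also be handled with a single s///ge call, since one /g pass walks the string left to right and does not rescan what it has already replaced. A sketch under those assumptions (not taken from either answer):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $txt = <<'END';
//first comment
this is a statement //second comment
//third comment
more statements //fourth comment
END

my @comments;

# A single s///g call, not wrapped in a loop. /e runs the replacement
# block as code for every match: it saves the original comment text,
# then yields "//foo_" plus the index of the element just pushed.
$txt =~ s{//([^\n]*)}{
    push @comments, $1;
    "//foo_" . $#comments;
}ge;

print $txt;
print "$_: $comments[$_]\n" for 0 .. $#comments;
```

The character class [^\n]* stops each match at the end of its line, so the /s modifier from the question is not needed.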

【Solution 2】:

Iterate over each line of the text, and store the comment if the substitution succeeds:

#!/usr/bin/perl

use strict;
use warnings;

my $txt = <<END;                        # define text
//first comment
this is a statement //second comment
//third comment
more statements //fourth comment
END

my @comments = ();
my $i = 0;
foreach (split qq(\n), $txt) {          # iterate over input lines
        if (s&(//.*)&//foo_$i&) {       # do we match?
                push @comments, $1;     # then push comment
                $i++;                   # and increase counter
                }
        print;                          # print modified row
        print qq(\n);                   # print newline
        }

print qq(\nComments:\n);
foreach (@comments) {
        print;                          # print the comment
        print qq(\n);                   # print newline
        }

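【Comments】:

Thanks~ :-) I just don't understand why splitting the input into separate lines keeps s/// from falling into an infinite loop. And besides that, what is the advantage of qq() over ""?
As @sobrique said above, using while will re-match the already modified comments. To be safe, use @matches = $txt =~ /<regex>/g; @matches can then be used for iteration or whatever else. qq() lets you include quotes without escaping them (not applicable here); the rest is just personal preference. Sorry for my late reply, I haven't been around for a few days.
Got it, thanks for the careful explanation! Sorry for the late reply; the Spring Festival brought everything here to a standstill for quite a while.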
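
The @matches suggestion from the comments can be sketched as follows: collect every comment first with a /g match in list context, which does not modify the string, then number the comments in a separate single pass. The names $numbered and $n are illustrative, and the two-line sample text is shortened from the question:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $txt = "//first comment\nthis is a statement //second comment\n";

# In list context, a /g match returns every capture at once and
# leaves $txt untouched, so nothing can be re-matched later.
my @matches = $txt =~ m{//([^\n]*)}g;

# Number the comments in a copy of the text; $n counts replacements.
my $n = 0;
( my $numbered = $txt ) =~ s{//[^\n]*}{ "//foo_" . $n++ }ge;

print "$_\n" for @matches;
print $numbered;
```

Keeping the capture step and the replace step separate costs a second scan of the text, but the original $txt stays available for comparison.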