Perl search and replace enters endless loop: answers

【Question title】: Perl search and replace enters endless loop
【Posted】: 2013-05-19 00:29:58
【Question】:

I am trying to match and replace certain strings in multiple files using:

local $/;
open(FILE, "<error.c");
$document=<FILE>;
close(FILE);
$found=0;
while($document=~s/([a-z_]+)\.h/$1_new\.h/gs){
    $found=$found+1;
};
open(FILE, ">error.c");
print FILE "$document";
close(FILE);

It enters an endless loop, because the result of the substitution is matched again by the search regex. But shouldn't the s///g construct avoid this?
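To make the behaviour concrete, here is a small bounded sketch. The sample string and the cap of 3 passes are made up for illustration. Each evaluation of the while condition runs a complete s///g pass from the start of the string, so the pattern matches the output of the previous pass again.

```perl
use strict;
use warnings;

# Hypothetical sample input, standing in for the contents of error.c.
my $document = qq{#include "error.h"\n#include "my_util.h"\n};

my $passes = 0;
# Each evaluation of the condition is a full s///g pass over the whole string.
# The cap of 3 passes exists only so that this demo terminates.
while ($passes < 3 && $document =~ s/([a-z_]+)\.h/${1}_new.h/g) {
    $passes++;
}

print "passes: $passes\n";
print $document;
```

Within one pass, /g never rescans text it has just inserted; the repetition comes only from the surrounding while loop starting a fresh pass.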

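Edit:

I found that a foreach loop does not do exactly what I want either (it replaces all occurrences but prints only one of them). The reason seems to be that Perl's search and replace behaves quite differently inside foreach() and while() constructs. To have a solution that replaces across multiple files while also printing every individual replacement, I came up with the following body: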

# mandatory user inputs
my @files;
my $subs;
my $regex;

# additional user inputs
my $fileregex = '.*';
my $retval = 0;
my $opt_testonly=0;

foreach my $file (@files){

    print "FILE: $file\n";
    if(not($file =~ /$fileregex/)){
        print "filename does not match regular expression for filenames\n";
        next;
    }

    # read file
    local $/; 
    if(not(open(FILE, "<$file"))){ 
        print STDERR "ERROR: could not open file\n"; 
        $retval = 1; 
        next; 
    };
    my $string=<FILE>; 
    close(FILE); 

    my @locations_orig;
    my @matches_orig;
    my @results_orig;

    # find matches
    while ($string =~ /$regex/g) {
        push @locations_orig, [ $-[0], $+[0] ];
        push @matches_orig, $&;
        my $result = eval("\"$subs\"");
        push @results_orig, $result;
        print "MATCH: ".$&." --> ".$result." @[".$-[0].",".$+[0]."]\n";
    }

    # reverse order
    my @locations = reverse(@locations_orig);
    my @matches = reverse(@matches_orig);
    my @results = reverse(@results_orig);

    # number of matches
    my $length=$#matches+1;
    my $count;

    # replace matches
    for($count=0;$count<$length;$count=$count+1){
        substr($string, $locations[$count][0], $locations[$count][1]-$locations[$count][0]) = $results[$count];
    }

    # write file
    if(not($opt_testonly) and $length>0){
        open(FILE, ">$file"); print FILE $string; close(FILE);
    }

}

exit $retval;

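First, it reads each file and builds lists of the matches, their positions, and the replacement texts (printing each match). Second, it applies the replacements starting from the end of the string, so that the positions of earlier matches do not change. Finally, if any matches were found, it writes the string back to the file. It could certainly be more elegant, but it does what I need.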
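For comparison, the same report-and-replace idea can be sketched with a single s///ge pass, where the replacement side is Perl code that can log each match before returning the new text. This is only a sketch under assumptions: the helper name `replace_and_report`, the sample string, and the pattern are made up for illustration, the replacement is passed as a code reference instead of a string that is eval'ed, and match offsets are not reported.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the code above: replaces every match of
# $regex in $string in a single pass and reports each individual replacement.
# $make_result is a code ref that receives the text of capture group 1.
sub replace_and_report {
    my ($string, $regex, $make_result) = @_;
    my @report;
    # /e evaluates the replacement side as Perl code once per match;
    # /g continues after each match, so replaced text is never rescanned.
    my $count = $string =~ s{$regex}{
        my $match  = $&;
        my $result = $make_result->($1);
        push @report, "MATCH: $match --> $result";
        $result;
    }ge;
    return ($string, $count || 0, @report);
}

my ($new, $count, @report) = replace_and_report(
    qq{#include "error.h"\n#include "my_util.h"\n},   # made-up input
    qr/([a-z_]+)\.h/,
    sub { $_[0] . "_new.h" },
);

print "$_\n" for @report;
print "replacements: $count\n";
print $new;
```

Because everything happens in one pass, there is no need to collect positions and apply the replacements in reverse order.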

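【Question comments】:

... If I'm not mistaken, the s///g construct performs all of your replacements in one go rather than one at a time. You don't need a loop at all. In other news: why not sed?
Yes, you are absolutely right. The reason for the loop is that I wanted to count the number of matches as shown (and possibly output the matched text in the future). I am not using sed because I want exact Perl syntax. The code will become part of a shell script.
To get the number of matches, you can use $num_matches = ($data =~ s/([a-z_]+)\.h/$1_new\.h/g)
Nice one, @AleksG. I was going to point him to stackoverflow.com/questions/1849329/…
Thanks! The matches themselves cannot be output that way, can they? Mostly I was confused because I had always thought that while(s///g){} steps forward during the substitution, i.e. does not replace earlier matches again. Or is that only true for while(m//g){} or foreach?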

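Tags: regex perl

【Solution 1】:

The replacement text $1_new still matches ([a-z_]+). It enters an endless loop because you use while there. With the s///g construct, a single iteration replaces every occurrence in the string.

To count the replacements, use: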

$replacements = ($document =~ s/([a-z_]+)\.h/$1_new\.h/gs);

$replacements will contain the number of matches that were replaced.

If you basically just want the matches rather than the replacements:

@matches = $document =~ /([a-z_]+)\.h/gs;

You can then get their count with $replacement = scalar @matches.
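The different ways of counting can be put side by side in a short sketch (the sample string is made up for illustration). Note that the `= () =` counting idiom applies to a match in list context, whereas s///g already returns the number of substitutions it made.

```perl
use strict;
use warnings;

my $document = qq{#include "error.h"\n#include "my_util.h"\n};  # made-up input

# List context: m//g returns every capture of group 1.
my @matches = $document =~ /([a-z_]+)\.h/g;

# The "= () =" idiom counts the matches without keeping them.
my $num_matches = () = $document =~ /([a-z_]+)\.h/g;

# s///g returns the number of substitutions it made (a false value if none).
my $replacements = ($document =~ s/([a-z_]+)\.h/${1}_new.h/g) || 0;

print "matches: @matches\n";
print "num_matches: $num_matches, replacements: $replacements\n";
```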

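【Comments】:

Please see my comment above!
I have expanded my answer to reflect your needs.

【Solution 2】:

I'd say you are over-engineering this. I have done it like this in the past: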

perl -i -p -e 's/([a-z_]+)\.h/$1_new\.h/g' error.c

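This works correctly even when the replacement text itself matches the pattern, because s///g makes a single pass over each line and does not rescan text it has already replaced.

【Comments】:

Please see my comment above!

【Solution 3】: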

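The /g option itself acts like a loop. I think you want this:

while($document=~s/([a-z_]+)(?<!_new)\.h/$1_new\.h/s){
    $found=$found+1;
};

Because you are replacing the match with itself plus more, you need a negative lookbehind assertion, so that a name already ending in _new is not matched again. A lookahead placed before \.h would not help here, since the next character at that point is always the dot.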
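As a check that a loop without /g can terminate, here is a small sketch that guards the pattern with a negative lookbehind, `(?<!_new)`, which inspects the text immediately before `\.h`. The sample string and the safety limit of 100 iterations are made up for illustration.

```perl
use strict;
use warnings;

my $document = qq{#include "error.h"\n#include "my_util.h"\n};  # made-up input

my $found = 0;
# No /g here: each iteration replaces only the first remaining match.
# (?<!_new) is a negative lookbehind: it rejects a match when the four
# characters right before ".h" are "_new", so renamed headers are skipped.
while ($document =~ s/([a-z_]+)(?<!_new)\.h/${1}_new.h/) {
    $found++;
    die "runaway loop" if $found > 100;   # safety net for the demo only
}

print "found: $found\n";
print $document;
```

One trade-off of this guard is that a header whose original name already ends in _new is left untouched.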

【Comments】: