Perl variable becoming undefined in loop

【Question title】: Perl variable becoming undefined in loop
【Posted】: 2015-12-20 22:26:12
【Question】:

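Edit: modified the code and output to make them clearer

Edit 2: added sample input for reproduction

I have a JSON file and a CSV file, and I am running a comparison between the two. The problem is that $asset_ip is correctly defined in the outer foreach loop, but becomes undefined inside the nested loop.

Why does $asset_ip become undefined?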

#!/usr/bin/perl
# perl -e'use CPAN; install "Text::CSV"'
use strict;
use warnings;

use JSON::XS;
use File::Slurp;
use Text::CSV;
my $csv = Text::CSV->new( { sep_char => ',' } );

my $csv_source  = "servers.csv";
my $json_source = "assets.json";
my $dest = "servers_for_upload.csv";

# defined these here as I need to use them in foreach loop and if statement:
my $csv_ip;
my @fields;
open( my $csv_fh, '<', $csv_source ) or die "$! error trying to read";
open( my $dest_fh, '>', $dest ) or die "$! error trying to read";

my $json = read_file($json_source);
my $json_array = decode_json $json;

foreach my $item (@$json_array) {
    my $id = $item->{id};
    my $asset_ip = $item->{interfaces}->[0]->{ip_addresses}->[0]->{value};

    # test the data is there:
    if ( defined $asset_ip ) {
        print "id: " . $id . "\nip: " . $asset_ip . "\n";
    }

    while (my $line = <$csv_fh>) {
        chomp $line;
        if ( $csv->parse($line) ) {
            @fields = $csv->fields();
            $csv_ip = $fields[0];
        }
        else {
            warn "Line could not be parsed: $line\n";
        }

            if ( $csv_ip eq $asset_ip ) {
                # preppend id to csv array and write these lines to new file
                unshift( @fields, $id );
                print $dest_fh join( ", ", @fields );    
        }
    }
}
close $csv_fh;

Output:

Use of uninitialized value $asset_ip in string eq at script.pl line 43, <$csv_fh> line 1.
Use of uninitialized value $asset_ip in string eq at script.pl line 43, <$csv_fh> line 2.
Use of uninitialized value $asset_ip in string eq at script.pl line 43, <$csv_fh> line 3. 
id: 1003
ip: 192.168.0.2
id: 1004
ip: 192.168.0.3
id: 1005
ip: 192.168.0.4

assets.json:

[{"id":1001,"interfaces":[]},{"id":1003,"interfaces":[{"ip_addresses":[{"value":"192.168.0.2"}]}]},{"id":1004,"interfaces":[{"ip_addresses":[{"value":"192.168.0.3"}]}]},{"id":1005,"interfaces":[{"ip_addresses":[{"value":"192.168.0.4"}]}]}]

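Note that for the first iteration, $asset_ip will be undefined. I will therefore change the code to run the eq comparison only if $asset_ip is defined. For this example, however, I am not doing that check, because it is undefined for all iterations.

servers.csv: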

192.168.0.3,Brian,Germany
192.168.0.4,Billy,UK
192.168.0.5,Ben,UK

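【Question comments】:

Unrelated to your problem: die '$! error trying to read' should have double quotes rather than single quotes. As written, if those errors occur you will get the literal text $! instead of the variable's value.
Sorry, I forgot to mention: I get an error saying $asset_ip is undefined when I run the script.
Quoting the error message (and line number) helps with diagnosis. So does some sample source data. Presumably this is the JSON you're working with? stackoverflow.com/questions/32737301/… Do you have some sample CSV data?
I don't think it is necessary to include sample data. Put simply, $asset_ip is defined as an IP address in the outer loop, but once inside the inner loop it is undefined. The second loop iterates the correct number of times for the number of lines in the CSV file.
Without sample data we can't reproduce the problem, so we can only guess which line is causing it, and how. That's why we usually ask for a minimal reproducible example: there is no obvious reason why it becomes undefined, or where it becomes undefined. There is, however, an obvious logic error in reading/re-reading $csv_fh.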
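
As a small illustration of the quoting point raised in the first comment above, here is a minimal sketch of how Perl treats a variable inside single-quoted and double-quoted strings. The variable $path is a hypothetical stand-in for $! (chosen so the output does not depend on the operating system's error text); the same interpolation rule applies to $!.

```perl
use strict;
use warnings;

# $path is a hypothetical stand-in for $! so the output is deterministic.
my $path = 'servers.csv';

my $single = '$path could not be read';    # single quotes: no interpolation
my $double = "$path could not be read";    # double quotes: variable is interpolated

print "$single\n";    # prints: $path could not be read
print "$double\n";    # prints: servers.csv could not be read
```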

Tags: json perl csv

【Answer 1】:

I think your problem is this:

foreach my $line (<$csv_fh>) {

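You are doing this inside your outer loop. But once you have done it, your $csv_fh is at the end of the file.

After that, subsequent iterations of the outer loop will not execute the inner loop at all, because there is nothing left to read from $csv_fh.

If this is your problem, a simple test is to add a seek, e.g. seek ( $csv_fh, 0, 0 );

But that isn't an efficient thing to do, because you will then be looping over the file multiple times. You should read it into a data structure and use that instead.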
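
The end-of-file behaviour described above can be sketched in isolation. This is only an illustration under stated assumptions: it reads from an in-memory filehandle holding the sample rows from the question rather than from servers.csv, and count_lines is a hypothetical helper introduced for the demonstration.

```perl
use strict;
use warnings;

# In-memory stand-in for servers.csv (sample rows from the question).
my $data = "192.168.0.3,Brian,Germany\n192.168.0.4,Billy,UK\n192.168.0.5,Ben,UK\n";
open( my $fh, '<', \$data ) or die "Cannot open in-memory file: $!";

# Hypothetical helper: count the lines between the current position and EOF.
sub count_lines {
    my ($handle) = @_;
    my $n = 0;
    while ( defined( my $line = <$handle> ) ) {
        $n++;
    }
    return $n;
}

my $first  = count_lines($fh);    # reads everything up to EOF
my $second = count_lines($fh);    # handle is already at EOF, nothing to read
seek( $fh, 0, 0 ) or die "seek failed: $!";
my $third  = count_lines($fh);    # rewound, so the lines are available again

# prints: first pass: 3, second pass: 0, after seek: 3
print "first pass: $first, second pass: $second, after seek: $third\n";
close $fh;
```

The second pass sees zero lines for the same reason the inner loop in the question stops running after the first JSON item.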

Edit: this is your problem:

[{"id":1001,"interfaces":[]},{"id":1003,"interfaces":[{"ip_addresses":[{"value":"192.168.0.2"}]}]},{"id":1004,"interfaces":[{"ip_addresses":[{"value":"192.168.0.3"}]}]},{"id":1005,"interfaces":[{"ip_addresses":[{"value":"192.168.0.4"}]}]}]

Specifically:

[{"id":1001,"interfaces":[]}

The first element in that array does not have a defined $asset_ip.
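
To see this without the CSV side involved, here is a minimal sketch that decodes the first two items of the sample JSON and runs the same lookup. It assumes JSON::PP (a core module) in place of JSON::XS; both export decode_json. The %ip_for hash is introduced only for this illustration.

```perl
use strict;
use warnings;
use JSON::PP;    # assumption: core stand-in for the JSON::XS used in the question

# First two items of the sample assets.json from the question.
my $json  = '[{"id":1001,"interfaces":[]},'
          . '{"id":1003,"interfaces":[{"ip_addresses":[{"value":"192.168.0.2"}]}]}]';
my $items = decode_json($json);

my %ip_for;    # illustration only: id => IP address (or undef)
foreach my $item (@$items) {
    # Same lookup as in the question; an empty "interfaces" array yields undef.
    my $asset_ip = $item->{interfaces}->[0]->{ip_addresses}->[0]->{value};
    $ip_for{ $item->{id} } = $asset_ip;

    printf "id %s: %s\n", $item->{id},
        defined($asset_ip) ? $asset_ip : '(undefined)';
}
# prints:
# id 1001: (undefined)
# id 1003: 192.168.0.2
```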

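This means that on your first pass $asset_ip is undefined, which is what generates the warnings. (No id/ip lines are printed for it because of your if defined test.)

The code then goes on to iterate over $csv_fh, reading to the end of the file and looking for matches (it fails 3 times, generating 3 warning messages).

On the second iteration, for id 1003, the IP isn't in the file anyway, but $csv_fh has already been read to end of file (EOF), so the inner loop doesn't execute at all.

This can be made workable by:

Adding an else next; after the if defined.
Adding a seek after the while loop.

But really, a rewrite is in order so that you don't re-read the file over and over.

Very roughly: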

#!/usr/bin/perl
# perl -e'use CPAN; install "Text::CSV"'
use strict;
use warnings;

use JSON::XS;
use File::Slurp;
use Text::CSV;
my $csv = Text::CSV->new( { sep_char => ',' } );

my $csv_source  = "servers.csv";
my $json_source = "assets.json";
my $dest        = "servers_for_upload.csv";

# defined these here as I need to use them in foreach loop and if statement:
my $csv_ip;
my @fields;
open( my $csv_fh,  '<', $csv_source ) or die "$! error trying to read";
open( my $dest_fh, '>', $dest )       or die "$! error trying to write";

my $json       = read_file($json_source);
my $json_array = decode_json $json;

foreach my $item (@$json_array) {
    my $id       = $item->{id};
    my $asset_ip = $item->{interfaces}->[0]->{ip_addresses}->[0]->{value};

    # test the data is there:
    if ( defined $asset_ip ) {
        print "id: " . $id . "\nip: " . $asset_ip . "\n";
    }
    else {
        print "asset_ip undefined for id $id\n";
        next;
    }

    while ( my $line = <$csv_fh> ) {
        chomp $line;
        if ( $csv->parse($line) ) {
            @fields = $csv->fields();
            $csv_ip = $fields[0];
        }
        else {
            warn "Line could not be parsed: $line\n";
            next;    # skip the comparison: $csv_ip and @fields would be stale
        }

        if ( $csv_ip eq $asset_ip ) {

            # prepend id to csv array and write these lines to new file
            unshift( @fields, $id );
            print {$dest_fh} join( ", ", @fields ),"\n";
        }
    }
    seek( $csv_fh, 0, 0 );
}
close $csv_fh;

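I'd suggest this also needs:

A change to the while, so that you aren't re-reading the file each time.
You're using Text::CSV, so using print join ( ","... doesn't seem a consistent choice. If your data warrants Text::CSV, it's worth keeping it for the output too.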
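
Both suggestions can be sketched together. This is one possible shape rather than a definitive rewrite, under these assumptions: the sample data from the question is inlined instead of read from servers.csv and assets.json, JSON::PP (core) stands in for JSON::XS, %row_for_ip and @output are names introduced here, and if the same IP appeared on several CSV rows only the last row would be kept.

```perl
use strict;
use warnings;
use JSON::PP;     # assumption: core stand-in for JSON::XS
use Text::CSV;

my $csv = Text::CSV->new( { sep_char => ',' } );

# Sample data from the question, inlined so the sketch is self-contained.
my $csv_text  = "192.168.0.3,Brian,Germany\n192.168.0.4,Billy,UK\n192.168.0.5,Ben,UK\n";
my $json_text = '[{"id":1001,"interfaces":[]},'
    . '{"id":1003,"interfaces":[{"ip_addresses":[{"value":"192.168.0.2"}]}]},'
    . '{"id":1004,"interfaces":[{"ip_addresses":[{"value":"192.168.0.3"}]}]},'
    . '{"id":1005,"interfaces":[{"ip_addresses":[{"value":"192.168.0.4"}]}]}]';

# Read the CSV exactly once and index its rows by IP address.
my %row_for_ip;
open( my $csv_fh, '<', \$csv_text ) or die "Cannot open CSV data: $!";
while ( my $line = <$csv_fh> ) {
    chomp $line;
    if ( $csv->parse($line) ) {
        my @fields = $csv->fields();
        $row_for_ip{ $fields[0] } = \@fields;
    }
    else {
        warn "Line could not be parsed: $line\n";
    }
}
close $csv_fh;

# Walk the JSON items once; each IP is a hash lookup, not a file scan.
my @output;
foreach my $item ( @{ decode_json($json_text) } ) {
    my $asset_ip = $item->{interfaces}->[0]->{ip_addresses}->[0]->{value};
    next unless defined $asset_ip;               # e.g. id 1001
    my $row = $row_for_ip{$asset_ip} or next;    # IP not present in the CSV

    # Let Text::CSV build the output line so any quoting is handled consistently.
    $csv->combine( $item->{id}, @$row ) or die "Could not combine fields";
    push @output, $csv->string;
}

print "$_\n" for @output;
# prints:
# 1004,192.168.0.3,Brian,Germany
# 1005,192.168.0.4,Billy,UK
```

In a real script the print would go to $dest_fh instead of STDOUT.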

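【Comments】:

I just wrote a witty remark in the edit summary... :(
I'd imagine it was something funny and insightful.
In general, if you have to iterate over a file's contents multiple times, re-reading the file may not be such a bad idea, since the OS will cache the data anyway. In this case, though, a more efficient single-pass algorithm certainly exists.
Excellent description
@Ilmari Karonen: The OS may cache the contents, and Perl's own I/O layer may do some buffering, but with enough data in the file you will see repeated read() system calls, causing unnecessary kernel context switches... Sorry for nitpicking. ;-)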