两个数组perl的比较答案

【问题标题】：Comparison of two arrays perl两个数组perl的比较
【发布时间】：2013-01-09 12:17:04
【问题描述】：

我是 Perl 世界的新手，我有一个比较两个数组的脚本。

我使用List::MoreUtils (each_arrayref) 进行比较。

我有两个问题：

1) 有没有一种方法可以比较两个数组块（例如 natatime，但对于两个 arrayrefs），而不是像 each_arrayref 那样一次比较单个元素？

元素应该来自每个数组的相同索引。

数据结构是这样的：

{
  atr => [qw/ a b c d /],
  ats => [qw/ a b c d /],
  att => [qw/ a b c d /],
}

这是我到目前为止所得到的。

my @lists = keys %{$hash};

for (my $i = 0; $i <= @lists; $i++) {

  my $list_one = $lists[$i];
  my $one = $hash->{$list_one};

  for (my $j = 0 ; $j <= @lists ; $j++) {

    my $list_two = $lists[$j];
    my $two = $hash->{$list_two};

    my ($overlapping, $mismatch, $identity);
    my $match          = 0;
    my $non_match      = 0;
    my $count_ac_calls = 0;
    my $each_array     = each_arrayref($one, $two);

    while (my ($call_one, $call_two) = $each_array->()) {

      if ((defined $call_one) && (defined $call_two)) {
        if ($call_one eq $call_two) {
          $match++;
        }
        if ($call_one ne $call_two) {
          $non_match++;
        }
      }
    }    #end of while loop $each_array->()

    print "$list_one,$list_two,$match,$non_match";

  }    #end of for j loop
}    #end of for i loop

我想比较 atr->ats、atr->att、ats->att。但是使用我当前的代码，我会重复进行比较，例如 ats->atr att->atr,att->ats。

2) 我怎样才能避免这些？

【问题讨论】：

看看stackoverflow.com/q/2591747、stackoverflow.com/q/1609467和perlmonks.org/?node_id=153402
我不太明白这个问题...

标签： arrays perl

【解决方案1】：

我不清楚你的第一个问题是什么意思。你想要一个迭代器，比如说，返回(('a','b','c'),('a','b','c')) 代替 ('a','a')?如果是这样，那么库中没有可用的库，但编写自己的库并不难。

至于第二个，避免项目与自身进行比较的通常方法是更改内部循环，使其在第一个的当前值之后开始。像这样

for my $i (0..$#lists) {

  for my $j ($i+1..$#lists) {

  }

}

这是可行的，因为A eq B 通常与B eq A 相同，因此没有必要将条目与列表中较早的条目进行比较，因为已经进行了反向比较。

请注意，以这种方式编写for 循环的 Perl 比杂乱的 C 样式语法要好得多。你也有几个错误

for (my $i = 0 ; $i <= @lists ; $i++) { ... }

因为@lists 的最大索引比@lists 的标量值小1 - 通常编码为$#lists。 $j 的循环中存在同样的问题。

更新

这是对您的程序的重构，它包含我所描述的想法，并且更加Perlish。希望对你有用。

use strict;
use warnings;

use List::MoreUtils 'each_arrayref';

my $hash = {
  atr => [qw/ a b c d /],
  ats => [qw/ a b c d /],
  att => [qw/ a b c d /],
};

my @keys = keys %{$hash};

for my $i (0 .. $#keys) {

  my $key1 = $keys[$i];
  my $list1 = $hash->{$key1};

  for my $j ($i+1 .. $#keys) {

    my $key2 = $keys[$j];
    my $list2 = $hash->{$key2};

    my ($match, $non_match) = (0, 0);
    my $iter = each_arrayref($list1, $list2);

    while (my ($call1, $call2) = $iter->()) {
      if (defined $call1 and defined $call2) {
        ($call1 eq $call2 ? $match : $non_match)++;
      }
    }

    print "$key1, $key2, $match, $non_match\n";
  }
}

【讨论】：

<= $#lists 或 < @lists
谢谢鲍罗丁。这看起来比我得到的更整洁。关于我的第一个问题，我想将 atr 的 abcd 与 ats 的 abcd 进行比较，而不是 a->a, b->b,c->c,d->d。

【解决方案2】：

一种选择是使用Array::Compare 返回不同数组元素的数量。此外，Math::Combinatorics 仅用于获取唯一比较。

use strict;
use warnings;
use Array::Compare;
use Math::Combinatorics;

my %hash = (
    'atr' => [ 'a', 'b', 'c', 'd' ],
    'ats' => [ 'a', 'b', 'c', 'd' ],
    'att' => [ 'a', 'c', 'c', 'd' ],
);

my $comp = Array::Compare->new( DefFull => 1 );
my $combinat = Math::Combinatorics->new(
    count => 2,
    data  => [ keys %hash ],
);

while ( my ($key1, $key2) = $combinat->next_combination ) {
    my $diff = $comp->compare( \@{ $hash{$key1} }, \@{ $hash{$key2} } );
    print "$key1,$key2," . ( @{ $hash{$key1} } - $diff ) . ",$diff\n";
}

输出：

ats,att,3,1
ats,atr,4,0
att,atr,3,1

【讨论】：

这仅在真实数据都是单字符串的极少数情况下才有效。
@Borodin - 是的，很好。我可能太从字面上理解了 OP 的“数据结构是这样的”。最初使用 Array::Compare;将使用完整比较来恢复它。谢谢。

【解决方案3】：

您并没有真正利用 Perl 提供的功能。与其使用容易出错的 C 风格循环，不如使用for my $var (LIST)。您也可以通过跳过自检来跳过冗余列表检查。我拿走了你的剧本，做了一些改动，我相信你会发现它更容易阅读。

use v5.16;
use warnings;
use List::MoreUtils qw{each_arrayref};

my $hash = {
  'atr' => [
    'a',
    'b',
    'c',
    'd'
   ],
  'ats'=>[
    'a',
    'b',
    'c',
    'd'
   ],
  'att' => [
    'a',
    'c',
    'c',
    'd'
   ],
};

for my $list_one (keys $hash) {
    my $one = $hash->{$list_one};

    for my $list_two (keys $hash) {
        next if $list_one ~~ $list_two;

        my $two = $hash->{$list_two};

        my ($match, $non_match);
        $match = $non_match = 0;

        my $each_array = each_arrayref($one, $two);
        while (my ($call_one, $call_two) = $each_array->()) {
            if($call_one && $call_two) {
                if($call_one eq $call_two) {
                    $match++;
                }
                else {
                    $non_match++;
                }
            }
        }

        print "$list_one,$list_two,$match,$non_match\n";
    }
}

无论如何，您都希望一次评估一个，以便您可以添加一些额外的位，例如索引位置。（是的，您可以使用 C 样式的循环，但这样会更难阅读。）

【讨论】：

while (my ($list_one, $one) = each %$hash){...} 更好。
@amon：这对我不起作用。脚本陷入无限循环。
啊，对不起，是的，我忘了每个数据结构只能有一个迭代器：/
是的，我正要说我知道为什么感谢友好的手册：perldoc.perl.org/functions/each.html
这段代码将atr与ats以及ats与atr等进行比较。不需要use v5.16，因为程序中没有v5.8赢得的任何内容不处理。没有理由使用$list_one ~~ $list_two 而不是$list_one eq $list_two。如果任何数据可以是'0' 或''，则$call_one && $call_two 不是defined $call_one && defined $call_two 的有效替代品。