Perl + 递归子程序 + 访问子程序外定义的变量答案

【问题标题】：Perl + recursive subroutine + accessing variable defined outside of subroutinePerl + 递归子程序 + 访问子程序外定义的变量
【发布时间】：2015-05-02 13:32:19
【问题描述】：

我正在使用 Perl 提取 bitbucket 存储库列表。来自 bitbucket 的响应将仅包含 10 个存储库和一个标记，用于下一页将有另外 10 个存储库等等......（他们称之为分页响应）

所以，我编写了一个递归子程序，如果存在下一页标记，它会调用自己。这将持续到最后一页。

这是我的代码：

#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
use LWP::UserAgent;
use JSON;

my @array;

recursive("my_bitbucket_url");
foreach ( @array ) { print $_."\n"; }

sub recursive
{
    my $url    = $_[0];

    ### here goes my LWP::UserAgent code which connects to bitbucket and pulls back the response in a JSON as $response->decoded_content 
    ### hence, removing this code for brevity

    my $hash = decode_json $response->decoded_content;
    #print Dumper ($hash);

    foreach my $a ( @{$hash->{values}} )
    {
        push @array, $a->{links}->{self}->{href};
    }

    if ( defined $hash->{next})
    {
        print "Next page Exists \n";
        print "Recursing with $hash->{next} \n";
        recursive( $hash->{next} );
    }
    else
    {
        print "Last page reached. No more recursion \n"
    }
}

现在，我的代码工作正常，它列出了所有的 repos。

问题： 我不确定我使用上面的变量 my @array; 的方式。我已经在子程序之外定义了它，但是，我直接从子程序访问它。不知怎的，我觉得这是不对的。

那么，在这种情况下，如何使用递归子例程追加到数组。我的代码是遵守 Perl 道德规范还是真的很荒谬（但因为它有效而正确）？

更新

在遵循@ikegami、@Sobrique 和@Hynek -Pichi- Vychodil 的建议后，我提供了以下代码，它使用while 循环并避免recurssion。

这是我的思考过程：

定义一个数组@array。
使用初始位桶 URL 调用子例程 call_url 并将响应保存在 $hash
检查$hash 中的下一页 页面标记
- 如果存在 next 页面标记，则将元素推送到 @array 并使用新标记调用 call_url。这将通过while 循环完成。
- 如果 next 页面标记确实不存在，则将元素推送到 @array。期间。
打印@array内容。

这是我的代码：

my @array;
my $hash = call_url("my_bitbucket_url ");

if (defined $hash->{next})
{
    while (defined $hash->{next})
    {
        foreach my $a ( @{$hash->{values}} )
        {
            push @array, $a->{links}->{self}->{href};
        }
        $hash = call_url($hash->{next});
    }
}

foreach my $a ( @{$hash->{values}} )
{
    push @array, $a->{links}->{self}->{href};
}

foreach (@array) { print $_."\n"; }

sub call_url
{
    ### here goes my LWP::UserAgent code which connects to bitbucket and pulls back the response in a JSON as $response->decoded_content 
    ### hence, removing this code for brevity    

    my $hash = decode_json $response->decoded_content;
    #print Dumper ($hash);

    return $hash;
}

肯定想知道这看起来是否还可以，或者还有改进的余地。

【问题讨论】：

if defined 后跟 while defined 是多余的 - 如果条件相同，如果条件不成立，while 将绕过。
呃。 “哈希”和“数组”是变量的可怕名称。 “repo”或者更好的“current_repo”怎么样？

标签： perl recursion

【解决方案1】：

使用全局变量返回值演示了high coupling，这是要避免的。

您是在询问以下内容是否可以接受：

my $sum;
sum(4, 5);
print("$sum\n");
sub sum {
   my ($x, $y) = @_;
   $sum = $x + $y;
}

sub 是递归的这一事实是完全不相关的；它只会使您的示例更大。

问题已解决：

sub recursive
{
    my $url = $_[0];

    my @array;

    my $hash = ...;

    foreach my $a ( @{$hash->{values}} )
    {
        push @array, $a->{links}->{self}->{href};
    }

    if ( defined $hash->{next})
    {
        print "Next page Exists \n";
        print "Recursing with $hash->{next} \n";
        push @array, recursive( $hash->{next} );
    }
    else
    {
        print "Last page reached. No more recursion \n"
    }

    return @array;
}

{
    my @array = recursive("my_bitbucket_url");
    foreach ( @array ) { print $_."\n"; }
}

递归删除：

sub recursive
{
    my $url = $_[0];

    my @array;
    while (defined($url)) {    
        my $hash = ...;

        foreach my $a ( @{$hash->{values}} )
        {
            push @array, $a->{links}->{self}->{href};
        }

        $url = $hash->{next};

        if ( defined $url)
        {
            print "Next page Exists \n";
            print "Recursing with $url\n";
        }
        else
        {
            print "Last page reached. No more recursion \n"
        }
    }

    return @array;
}

{
    my @array = recursive("my_bitbucket_url");
    foreach ( @array ) { print $_."\n"; }
}

清理您发布的最新代码：

my $url = "my_bitbucket_url";

my @array;
while ($url) {
    my $hash = call_url($url);

    for my $value ( @{ $hash->{values} } ) {
       push @array, $value->{links}{self}{href};
    }

    $url = $hash->{next};
}

print("$_\n") for @array;

【讨论】：

同意你的最后一句话。感谢您提供解释我的问题的虚拟代码。因此，根据您的回复，这可行，但应该避免。那么，如何将每次调用子例程的值附加到数组中。不确定我的问题是否有点令人困惑，但是，正如您所指出的，应该避免使用此代码。那么，有没有更好的方法呢？
对sum 做同样的事情：将 var 移到 sub 中并使用 return 返回其内容
可能需要在 else 块中取消定义 $url。
@slayedbylucifer，if (defined $hash->{next}) 是不必要的，我会尝试删除重复的 foreach。但是有一个用于代码审查的姊妹网站。
@Sobrique，确实如此。固定。

【解决方案2】：

在任何闭包之外定义的变量可用于整个程序。它工作正常，没有什么可担心的。在某些情况下（主要是围绕节目长度和远距离动作），有些人可能会称其为“糟糕的风格”，但这并不是一个硬性限制。

我不确定我是否一定会在这里看到递归的优势 - 您的问题似乎不值得。这本身不是问题，但对于未来的维护程序员来说可能会有点混乱；）。

我会按照（非递归）的思路思考一些事情：

my $url = "my_bitbucket_url";

while ( defined $url ) {
    ##LWP Stuff;

    my $hash = decode_json $response->decoded_content;

    foreach my $element ( @{ $hash->{values} } ) {
        print join( "\n", @{ $element->{links}->{self}->{href} } ), "\n";
    }

    $url = $hash->{next}; #undef if it doesn't exist, so loop breaks. 
}

【讨论】：

我会考虑递归 subs 诸如树遍历之类的东西 - 例如跟踪网页中的每个链接。获取页面直到您停止获取“下一个”对我来说更像是一个while 循环。但如前所述 - 如果你想这样做，这不是一个大问题。
@slayedbylucifer，在满足条件之前做某事正是 while 循环所做的事情。 Sobrique 是对的；此处不应使用递归。
@ikegami：我看不出有理由不使用递归来完成这项工作。但它必须正确地完成。想象一下，有些语言根本没有 while 和其他循环，你也可以做任何事情。缺乏尾调用优化是 perl 的问题。
@ikegami：你确定性能下降吗？您是否将我的ricky_tail_recursion 与while 循环进行了比较？我强烈怀疑。
@Hynek -Pichi- Vychodil，是的，goto & 实际上让事情比正常通话慢，而且比没有通话要慢。最重要的是，子调用是较慢的 Perl 操作之一。

【解决方案3】：

是的，使用全局变量是一个坏习惯，即使它是词法范围的变量。

每个递归代码都可以重写为其命令式循环版本，反之亦然。这是因为所有这些都是在完全不了解递归的 CPU 上实现的。只有跳跃。所有调用和返回都只是带有一些堆栈操作的跳转，因此您可以将递归算法重写为循环。如果它不像在这种情况下那样明显和简单，您甚至可以模拟堆栈和行为，因为它在您最喜欢的语言解释器或编译器中完成。在这种情况下，它非常简单：

my @array = with_loop("my_bitbucket_url");
foreach ( @array ) { print $_."\n"; }

sub with_loop
{
    my $url    = $_[0];
    my @array;
    while(1) 
    {

    ### here goes my LWP::UserAgent code which connects to bitbucket and 
    ### pulls back the response in a JSON as $response->decoded_content 
    ### hence, removing this code for brevity

        my $hash = decode_json $response->decoded_content;
        #print Dumper ($hash);

        foreach my $a ( @{$hash->{values}} )
        {
            push @array, $a->{links}->{self}->{href};
        }

        unless ( defined $hash->{next})
        {
            print "Last page reached. No more recursion \n";
            last
        };

        print "Next page Exists \n";
        print "Recursing with $hash->{next} \n";
        $url = $hash->{next};
    };
    return @array;
}

但是当您想坚持使用递归时，您可以使用递归，但这有点棘手。首先，没有尾调用优化，因此您不必像原始版本那样尝试编写尾调用代码。所以你可以这样做：

my @array = recursion("my_bitbucket_url");
foreach ( @array ) { print $_."\n"; }

sub recursion
{
    my $url    = $_[0];

    ### here goes my LWP::UserAgent code which connects to bitbucket and 
    ### pulls back the response in a JSON as $response->decoded_content 
    ### hence, removing this code for brevity

    my $hash = decode_json $response->decoded_content;
    #print Dumper ($hash);

    # this map version is same as foreach with push but more perlish
    my @array = map $_->{links}->{self}->{href}, @{$hash->{values}};

    if (defined $hash->{next})
    {
        print "Next page Exists \n";
        print "Recursing with $hash->{next} \n";
        push @array, recursive( $hash->{next} );
    }
    else
    {
        print "Last page reached. No more recursion \n"
    }
    return @array;
}

但是这个版本效率不高，所以有办法在 perl 中编写尾调用递归版本，这有点棘手。

my @array = tail_recursive("my_bitbucket_url");
foreach ( @array ) { print $_."\n"; }

sub tail_recursive
{
    my $url    = $_[0];
    my @array;
    return tail_recursive_inner($url, \@array); 
    # url is mutable parameter
}

sub tail_recursive_inner
{
    my $url = $_[0];
    my $array = $_[1]; 
    # $array is reference to accumulator @array 
    # from tail_recursive function

   ### here goes my LWP::UserAgent code which connects to bitbucket and 
   ### pulls back the response in a JSON as $response->decoded_content 
   ### hence, removing this code for brevity

    my $hash = decode_json $response->decoded_content;
    #print Dumper ($hash);

    foreach my $a ( @{$hash->{values}} )
    {
        push @$array, $a->{links}->{self}->{href};
    }

    if (defined $hash->{next})
    {
        print "Next page Exists \n";
        print "Recursing with $hash->{next} \n";

        # first parameter is mutable so its OK to assign
        $_[0] = $hash->{next};
        goto &tail_recursive_inner;
    }
    else
    {
        print "Last page reached. No more recursion \n"
    }
    return @$array;
}

如果你对一些真正的 perl 技巧感兴趣

print $_."\n" for tricky_tail_recursion("my_bitbucket_url");

sub tricky_tail_recursion {
    my $url = shift;

   ### here goes my LWP::UserAgent code which connects to bitbucket and 
   ### pulls back the response in a JSON as $response->decoded_content 
   ### hence, removing this code for brevity

    my $hash = decode_json $response->decoded_content;
    #print Dumper ($hash);

    push @_, $_->{links}->{self}->{href} for @{$hash->{values}}; 

    if (defined $hash->{next}) {
        print "Next page Exists \n";
        print "Recursing with $hash->{next} \n";
        unshift @_, $hash->{next};
        goto &tricky_tail_recursion;
    } else {
        print "Last page reached. No more recursion \n"
    };
    return @_;
}

另请参阅：LWP::UserAgent docs。

【讨论】：

@ikegami：是的，我每秒可以进行 370 万次goto &code; 调用，即 270ns 与 32ns while 循环。使用LWP 真的让我很困扰。对不起，你让我笑了。是的，在这种特殊情况下，它的可读性不如 while 循环，但这主要是 Perl 问题。没有任何理由一开始应该很慢，然后再复杂。
你是提高效率的人。效率不是我给出的解决方案不好的原因。您添加的额外复杂性如何使您的代码变慢？这会影响可读性、可维护性等。您提供的建议很糟糕。
我认为关于递归的讨论可能很有趣，但可能不在 cmets 中:)。我敢肯定有一个问题可以提出......
@ikegami：“这是一个很大的性能汇。”是的，是我提高了效率:-)
@ikegami: "这里绝对不能使用递归。" 原始代码是用递归编写的。也许是因为对于代码的原始作者来说，这是处理它的自然方式。所以重写命令式while循环是强制解决问题的另一种不自然的方式。因此，向我展示如何在递归中更好地编写它，但又不会在 Perl 中增加堆栈。当您坚持“您提供的建议不好”时，这对您来说应该不是问题。