Erlang supervisor: best way to handle ibrowse:send_req conn_failed (answers)

[Question title]: Erlang supervisor: best way to handle ibrowse:send_req conn_failed
[Posted]: 2012-06-22 02:46:20
[Question]:

New to Erlang, and just having a bit of trouble getting my head around the new paradigm!

OK, so I have this internal function within an OTP gen_server:

my_func() ->
    Result = ibrowse:send_req(?ROOTPAGE, [{"User-Agent", ?USERAGENT}], get),
    case Result of
        {ok, "200", _, Xml} ->
            %% <<do some stuff that won't interest you>>
            ok;
        {error, {conn_failed, {error, nxdomain}}} ->
            <<what the heck do I do here?>>
    end.

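If I leave out the clause that handles the failed connection, I get an exit signal propagated to the supervisor, and it gets shut down along with the server.

What I want to happen (at least I think this is what I want to happen) is that on a connection failure I pause and then retry send_req, say, 10 times, and only at that point may the supervisor fail.

If I do something ugly like this...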

{error,{conn_failed,{error,nxdomain}}} -> stop()

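it shuts down the server process and, yes, I get to use my restart strategy (try 10 times within 10 seconds) until it fails, which is also the desired result. However, the return value from the server to the supervisor is 'ok' when I would really like to return {error,error_but_please_dont_fall_over_mr_supervisor}.

I strongly suspect that in this scenario I am supposed to handle all the business concerns, such as retrying failed connections, inside 'my_func', rather than trying to get the process to stop and then having the supervisor restart it in order to try again.

Question: what is the 'Erlang way' in this scenario?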

[Comments]:

Tags: erlang erlang-otp erlang-supervisor

[Answer 1]:

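I'm new to Erlang too.. but how about something like this?

The code is long only because of the comments. My solution (I hope I have understood your question correctly) receives the maximum number of attempts and then makes a tail-recursive call, which stops when the attempt counter pattern-matches the maximum number of attempts. It uses timer:sleep() for the pause to keep things simple.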

%% @doc Instead of having my_func/0, you have
%% my_func/1, so we can "inject" the max number of
%% attempts. This one will call your tail-recursive
%% one
my_func(MaxAttempts) ->
    my_func(MaxAttempts, 0).

%% @doc This one will match when the maximum number
%% of attempts have been reached, terminates the
%% tail recursion.
my_func(MaxAttempts, MaxAttempts) ->
    {error, too_many_retries};

%% @doc Here's where we do the work, by having
%% an accumulator that is incremented with each
%% failed attempt.
my_func(MaxAttempts, Counter) ->
    io:format("Attempt #~B~n", [Counter]),
    % Simulating the error here.
    Result = {error,{conn_failed,{error,nxdomain}}},
    case Result of
        {ok, "200", _, Xml} -> ok;
        {error,{conn_failed,{error,nxdomain}}} ->
            % Wait, then tail-recursive call.
            timer:sleep(1000),
            my_func(MaxAttempts, Counter + 1)
    end.

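EDIT: If this code runs in a supervised process, I think it is better to have a simple_one_for_one supervisor, where you can dynamically add whatever workers you need. This avoids delaying initialization because of the timeouts (under one_for_one the workers are started in sequence, and going to sleep at that point would hold up the initialization of the other processes).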
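A minimal sketch of what such a supervisor could look like, combining the simple_one_for_one strategy with the asker's "10 restarts within 10 seconds" limit. The module name fetch_sup_sketch, the worker module fetch_worker and the helper start_worker/1 are all hypothetical names chosen for illustration; nothing here comes from the original post.

```erlang
-module(fetch_sup_sketch).
-behaviour(supervisor).

-export([start_link/0, start_worker/1]).
-export([init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Adds one worker at runtime. Under simple_one_for_one the extra
%% argument list is appended to the args in the child spec, so this
%% ends up calling fetch_worker:start_link(MaxAttempts).
%% fetch_worker is a hypothetical module, not part of this sketch.
start_worker(MaxAttempts) ->
    supervisor:start_child(?MODULE, [MaxAttempts]).

init([]) ->
    %% The supervisor itself gives up if more than 10 restarts
    %% occur within 10 seconds.
    SupFlags = {simple_one_for_one, 10, 10},
    %% transient: restart the worker only when it exits abnormally.
    ChildSpec = {fetch_worker,
                 {fetch_worker, start_link, []},
                 transient, 5000, worker, [fetch_worker]},
    {ok, {SupFlags, [ChildSpec]}}.
```

Because no children are started from init/1, the supervisor comes up immediately and workers are added later with start_worker/1, so a worker that pauses between attempts does not hold up anything else in the tree.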

EDIT2: Added an example shell session:

1> c(my_func).
my_func.erl:26: Warning: variable 'Xml' is unused
{ok,my_func}
2> my_func:my_func(5).
Attempt #0
Attempt #1
Attempt #2
Attempt #3
Attempt #4
{error,too_many_retries}

There is a 1-second delay between each printed message.
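If the process should crash once the attempts are used up, rather than hand back a tuple, the terminating clause can raise an error instead. The following is only a sketch of that variant: the module name retry_sketch, the function with_retries/3 and the idea of passing the request in as a fun are assumptions made for illustration, so that the example runs without ibrowse.

```erlang
-module(retry_sketch).
-export([with_retries/3]).

%% Fun is a zero-arity fun that performs the request, for example
%% fun() -> ibrowse:send_req(Url, Headers, get) end.
with_retries(Fun, MaxAttempts, DelayMs) ->
    with_retries(Fun, MaxAttempts, DelayMs, 0).

%% All attempts used up: raise instead of returning an error tuple.
with_retries(_Fun, MaxAttempts, _DelayMs, MaxAttempts) ->
    erlang:error(too_many_attempts);
with_retries(Fun, MaxAttempts, DelayMs, Counter) ->
    case Fun() of
        {error, {conn_failed, _Reason}} ->
            %% Wait, then make the tail-recursive call.
            timer:sleep(DelayMs),
            with_retries(Fun, MaxAttempts, DelayMs, Counter + 1);
        Other ->
            %% Any other result is handed back to the caller untouched.
            Other
    end.
```

When this runs inside a gen_server callback and the error is not caught, the server terminates with a non-normal reason, so a supervisor with a permanent or transient restart type restarts it and counts that restart against its limit.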

[Comments]:

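Yes, I tried something similar (slightly less elegant than your solution): passing in the number of attempts, then decrementing and testing for zero. The only difference is that I called stop() instead of returning a tuple. stop() shuts down the server (which is then restarted) but does not return a useful message, whereas {error,too_many_retries} returns a useful message but does not shut down the server.
I was hoping to find a solution that combines the best of both worlds: when the send_req error occurs I retry 10 times, and if it still fails I catch it and throw it again, but in a way that avoids bringing down the supervisor. However, I also think that is not the "Erlang way". I think your solution is the way to go, because any other exception that occurs should be covered by the supervisor's restart strategy. If anyone else has a different approach or opinion, then by all means pass it on. Many thanks.
@unclejimbob As I understand it, you want the supervisor to fail only once the maximum number of attempts has been reached, right? And then have the supervisor restart the worker so the whole operation is retried from the top. That is what I tried to do with the code. You might want to complement it with erlang:error(too_many_attempts). That way you only "let it crash" when the maximum number of attempts is reached or an unknown (uncaught) error is thrown. What I don't like about timer:sleep() is that your worker does not honor sys messages while it sleeps, so another solution would be a receive with a timeout.
Or maybe use a gen_server with timeouts... anyone? :)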
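One way the gen_server idea from the last comment could be sketched: instead of sleeping inside a callback, the server schedules the next attempt with erlang:send_after/3 and returns to its loop, so it keeps handling other messages (including system messages) between attempts. Once the attempts are used up it stops with the reason too_many_attempts. The module name fetch_worker_sketch, the state record and the injected FetchFun are hypothetical; the request is passed in as a fun only so that the sketch runs without ibrowse.

```erlang
-module(fetch_worker_sketch).
-behaviour(gen_server).

-export([start_link/3]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

-record(state, {fetch, max, count = 0, delay}).

%% FetchFun is a zero-arity fun, for example
%% fun() -> ibrowse:send_req(Url, Headers, get) end.
start_link(FetchFun, MaxAttempts, DelayMs) ->
    gen_server:start_link(?MODULE, {FetchFun, MaxAttempts, DelayMs}, []).

init({FetchFun, MaxAttempts, DelayMs}) ->
    %% Trigger the first attempt after init/1 has returned, so the
    %% supervisor is not blocked while the request is being made.
    self() ! attempt,
    {ok, #state{fetch = FetchFun, max = MaxAttempts, delay = DelayMs}}.

%% Attempts used up: stop with a non-normal reason.
handle_info(attempt, #state{max = Max, count = Max} = State) ->
    {stop, too_many_attempts, State};
handle_info(attempt, #state{fetch = Fetch, count = Count,
                            delay = Delay} = State) ->
    case Fetch() of
        {error, {conn_failed, _Reason}} ->
            %% Schedule the retry instead of sleeping.
            erlang:send_after(Delay, self(), attempt),
            {noreply, State#state{count = Count + 1}};
        _Other ->
            %% Success, or a result this sketch does not inspect.
            {noreply, State#state{count = 0}}
    end;
handle_info(_Msg, State) ->
    {noreply, State}.

handle_call(_Request, _From, State) -> {reply, ok, State}.
handle_cast(_Msg, State) -> {noreply, State}.
terminate(_Reason, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.
```

Because the stop reason is not normal, a supervisor treats the exit as a failure and applies its restart strategy, which is the "retry, then let it crash" behavior discussed in the comments above.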