Asynchronous Calls and Recursion with Node.js: Answers

【Question Title】: Asynchronous Calls and Recursion with Node.js
【Posted】: 2014-12-18 09:18:50
【Question】:

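I want to run a callback once a recursive function has completely finished; the recursion can go on for an indeterminate amount of time. I am struggling with the asynchronous side of this and was hoping for some help here. The code, which uses the request module, is as follows: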

var start = function(callback) {
  request.get({
    url: 'aaa.com'
  }, function (error, response, body) {
    var startingPlace = JSON.parse(body).id;
    recurse(startingPlace, callback);
  });
};

var recurse = function(startingPlace, callback) {
    request.get({
        url: 'bbb'
    }, function(error, response, body) {
        // store body somewhere outside these functions
        // make second request
        request.get({
            url: 'ccc'
        }, function(error, response, body) {
            var anArray = JSON.parse(body).stuff;
            if (anArray) {
                anArray.forEach(function(thing) {
                    request.get({
                        url: 'ddd'
                    }, function(error, response, body) {
                        var nextPlace = JSON.parse(body).place;
                        recurse(nextPlace);
                    });
                })
            }
        });
    });
    callback();
};

start(function() {
    // calls final function to print out results from storage that gets updated each recursive call
    finalFunction();
});

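It seems that once my code gets past the for loop inside the nested request, it carries on out of the requests and finishes the initial function call while the recursive calls are still running. I want it not to finish the top-level iteration until all the nested recursive calls have completed (and I have no way of knowing how many there are).

Any help is greatly appreciated!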

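【Question Comments】:

You need to give things names, and you need an outer wrapper function around the part shown, in which you can collect results and monitor progress.
You have a number of things to fix. (1) Change recurse(nextPlace); to recurse(nextPlace, callback);. (2) If your test finds that it is not an array, the recursion stops, so write callback(); after the if (anArray) {....}. (3) Remove the callback(); at the very bottom entirely. (4) In the parts marked with comments, if you start anything asynchronous or execution stops there, you should pass along or call the callback appropriately (this time using return callback();). Then you are all set. In any case, do (1) to (3) and tell us what happens.
I only need the final function to be called once all the recursive loops are done. This would call it after each loop finishes. My second, and bigger, problem is how to make the HTTP requests block: right now, once they are called, the loop carries on instead of waiting for them to finish. That is really what is causing the problem.
You are right, I was wrong. In that case, when you iterate over the array you should use async, for example async.parallel. See here for an example: stackoverflow.com/q/26431257/1355058, where I also wanted to pass different arguments to all the calls happening in parallel. Now you are all set. :)
I will look into the async module in more detail. Is there a specific function that works well with recursion? I need the top-level for loop to finish last, so that is when I would run the callback (I could pass it a condition to test whether it equals startingPlace). Am I on the right track?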

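Tags: javascript node.js asynchronous recursion

【Solution 1】:

In your example you have no recursive call. If I understand correctly, you mean that recurse(point, otherFunc); is the start of the recursive calls.

Then go back to the definition of the recursive function (not shown in your post) and do this (add a third parameter for the callback to be called when the recursion ends; the caller passes it as an argument):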

function recurse(startingPlace, otherFunc, callback_one) {
    // code you may have ...
    if (your_terminating_criterion === true) {
         return callback_one(val); // where val is potentially some value you want to return (or a json object with results)
    }
    // more code you may have
}

Then, in the original code you posted, make this call instead (in the innermost part):

recurse(startingPlace, otherFunc, function (results) {
    // results is now a variable with the data returned at the end of recursion
    console.log ("Recursion finished with results " + results);
    callback();   // the callback that you wanted to call right from the beginning
});

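Take some time to try to understand my explanation. Once you understand it, you will understand Node. This is the Node philosophy in a single post. I hope it is clear. Your first example should look like this: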

var start = function(callback) {
  request.get({
    url: 'aaa.com'
  }, function (error, response, body) {
    var startingPlace = JSON.parse(body).id;
    recurse(startingPlace, otherFunc, function (results) {
        console.log ("Recursion finished with results " + results);
        callback();
    });
  });
};

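What follows is only additional information in case you are interested. Otherwise, you are all set with the above.

Usually in Node.js, people also return an error value so that the caller knows whether the called function completed successfully. There is no great mystery here. Instead of returning only results, people make calls of the form

return callback_one(null, val);

Then, in the other function, you can have: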

recurse(startingPlace, otherFunc, function (recErr, results) {
    if (recErr) {
         // treat the error from recursion
         return callback(); // important: use return, otherwise you will keep on executing whatever is there after the if part when the callback ends ;)
    }

    // No problems/errors
    console.log ("Recursion finished with results " + results);
    callback();   // writing down `return callback();` is not a bad habit when you want to stop execution there and actually call the callback()
});

Update: my suggestion

Here is my suggestion for the recursive function, but before that, it seems you need to define your own get:

function myGet (a, callback) {
    request.get(a, function (error, response, body) {
        if (error) {
            return callback(error); // pass request failures on to async
        }
        var nextPlace = JSON.parse(body).place;
        return callback(null, nextPlace); // null for no errors, and return the nextPlace to async
    });
}

var recurse = function(startingPlace, callback2) {
    request.get({
        url: 'bbb'
    }, function(error1, response1, body1) {
        // store body somewhere outside these functions
        // make second request
        request.get({
            url: 'ccc'
        }, function(error2, response2, body2) {
            var anArray = JSON.parse(body2).stuff;
            if (anArray) {
                // The function that you want to call for each element of the array is `myGet`.
                // So, prepare these calls, but you also need to pass different arguments
                // and this is where `bind` comes into the picture and the link that I gave earlier.
                var theParallelCalls = [];
                for (var i = 0; i < anArray.length; i++) {
                    // During execution, async.parallel passes its own callback as the second
                    // argument of `myGet`; this is why the code has both callback and callback2.
                    theParallelCalls.push(myGet.bind(null, {url: 'ddd'}));
                }
                // Now perform the parallel calls:
                async.parallel(theParallelCalls, function (error3, results) {
                    // All the parallel calls have returned
                    for (var i = 0; i < results.length; i++) {
                        var nextPlace = results[i];
                        recurse(nextPlace, callback2);
                    }
                });
            } else {
                return callback2(null);
            }
        });
    });
};

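Note that I assume the get request for 'bbb' is always followed by the get request for 'ccc'. In other words, you have not hidden any return points for the recursive calls where you have comments.

【Comments】:

Thanks, this makes things clearer, but it still doesn't fully solve the problem. It isn't a single if/else: even when an array doesn't exist, it has to go through all of the arrays before finishing. For example, if startingPlace has three places branching off it, then even if the first and second places are empty, it still has to go through the third. This calls the callback the first time it reaches an empty place.
OK, now I see why you had the callback where you had it at the start. At the very beginning of the recursion you will need another async call to handle the three get requests. Each of them will return a result (possibly empty), but all three will be executed. Only then do you want to call the callback, not where I have it now. I assume you want to call these functions in sequence. In that case, try an async variant such as waterfall, which also lets you pass the result of one function to the next.
Actually, the more I think about your comment, the less I understand what exactly you want to do. recurse performs two simple gets and, in the third step, possibly many. If one of the places in the results array is empty, don't make the recursive call for it. The last option (the most likely one) is to use the results array as the starting point for another async.parallel call, but this time for recurse (not myGet). Then, when all of those have returned, you have the statement 'return callback2();' and eliminate the else entirely. Does that make sense?
It is very much like a tree. Each place has x number of next places. I have to go through each place, make a GET request for each next place, and so on. At the end, once all the GET requests have completed, I have to make the final function call. The callback should execute only after the last response has been received, but there is no information about where that stopping point is. I followed your code using async.parallel but still had no luck. I feel like I am missing something very basic here. I don't usually have trouble with callbacks and asynchrony.
I was finally able to solve this using a different async method. Thanks for recommending the module!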
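
The thread does not say which async method ended up working, so the following is only one possible shape for the tree traversal described in these comments, written with plain callbacks and no library. Each call to recurse keeps a count of the child subtrees it is still waiting on and reports completion only when that count reaches zero, so the top-level callback fires exactly once, after the last response. PLACES, fetchNextPlaces and crawl are hypothetical names introduced for illustration; the simulated fetch stands in for the request.get calls.

```javascript
// Hypothetical stand-in for the HTTP API: each place lists its next places.
const PLACES = { a: ['b', 'c'], b: ['d'], c: [], d: [] };

// Simulates an asynchronous GET; real code would call request.get here.
function fetchNextPlaces(place, cb) {
  setImmediate(() => cb(null, PLACES[place] || []));
}

// Calls done(err) exactly once, after `place` and its whole subtree are stored.
function recurse(place, store, done) {
  fetchNextPlaces(place, (err, nextPlaces) => {
    if (err) return done(err);
    store.push(place);
    let pending = nextPlaces.length;
    if (pending === 0) return done(null); // leaf: nothing left to wait for
    let finished = false;
    nextPlaces.forEach((next) => {
      recurse(next, store, (childErr) => {
        if (finished) return; // an earlier child already reported an error
        if (childErr) {
          finished = true;
          return done(childErr);
        }
        pending -= 1;
        if (pending === 0) {
          finished = true;
          done(null); // the last child subtree has completed
        }
      });
    });
  });
}

// Entry point: cb(err, store) fires once, after every request has returned.
function crawl(startingPlace, cb) {
  const store = [];
  recurse(startingPlace, store, (err) => cb(err, store));
}

crawl('a', (err, store) => {
  if (err) throw err;
  console.log('visited:', store.join(', '));
});
```

No global stopping criterion is needed with this shape: every level only has to know how many children it started, which is the information the asker said was missing.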

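【Solution 2】:

Typically, when you write a recursive function, it does something and then either calls itself or returns.

You need to define callback in the scope of the recursive function (i.e. recurse rather than start), and you need to call it at the point where you would normally return.

So, a hypothetical example would look like this: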

function get_all_pages(callback, page) {
    page = page || 1;
    // Hypothetical request options (data/success), kept from the original sketch
    request.get({
        url: "http://example.com/getPage.php",
        data: { page_number: page },
        success: function (data) {
           if (data.is_last_page) {
               // We are at the end so we call the callback
               callback(page);
           } else {
               // We are not at the end so we recurse
               get_all_pages(callback, page + 1);
           }
        }
    });
}

function show_page_count(data) {
    alert(data);
}

get_all_pages(show_page_count);
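
The example above depends on a hypothetical endpoint and option names, so it cannot be run as it stands. A minimal self-contained variant of the same idea, assuming a simulated three-page source, might look like the sketch below. fetchPage, LAST_PAGE and getAllPages are illustrative names, and the error-first callback signature is an addition that follows the usual Node.js convention.

```javascript
// Hypothetical stand-in for the paged endpoint: three pages, the third is last.
const LAST_PAGE = 3;

// Simulates an asynchronous request for one page.
function fetchPage(pageNumber, cb) {
  setImmediate(() => cb(null, { is_last_page: pageNumber >= LAST_PAGE }));
}

// Recurses through the pages and calls callback(err, count) from the final one.
function getAllPages(callback, page) {
  page = page || 1;
  fetchPage(page, (err, data) => {
    if (err) return callback(err);
    if (data.is_last_page) {
      return callback(null, page); // where a synchronous version would return
    }
    getAllPages(callback, page + 1); // not at the end yet, so recurse
  });
}

getAllPages((err, count) => {
  if (err) throw err;
  console.log('page count:', count);
});
```

The callback is threaded through every recursive call unchanged, and only the call that reaches the terminating condition invokes it.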

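【Comments】:

This is the best and most concise answer for me. Thanks, @Quentin.

【Solution 3】:

I think you may find caolan/async useful. Look at async.waterfall in particular. It lets you pass results from one callback to the next and do something with the result once everything is done.

Example: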

async.waterfall([
    function(cb) {
        request.get({
            url: 'aaa.com'
        }, function(err, res, body) {
            if(err) {
                return cb(err);
            }

            cb(null, JSON.parse(body).id);
        });
    },
    function(id, cb) {
        // do that otherFunc now
        // ...
        cb(); // remember to pass result here
    }
], function (err, result) {
   // do something with possible error and result now
});
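
To make the waterfall idea concrete without installing anything, here is a hand-rolled sketch of the pattern. The waterfall function below is a simplified stand-in written for illustration, not the caolan/async implementation, and the two tasks simulate asynchronous work with setImmediate rather than making real requests.

```javascript
// Simplified stand-in for the waterfall pattern: runs tasks in order, feeds
// each task's results to the next one, and stops at the first error.
function waterfall(tasks, finalCb) {
  let index = 0;
  function next(err, ...results) {
    if (err || index === tasks.length) return finalCb(err, ...results);
    const task = tasks[index++];
    task(...results, next);
  }
  next(null);
}

waterfall([
  // Hypothetical first step: produce a starting id.
  (cb) => setImmediate(() => cb(null, 42)),
  // Hypothetical second step: use the id handed over by the first step.
  (id, cb) => setImmediate(() => cb(null, 'place-' + id)),
], (err, result) => {
  if (err) throw err;
  console.log('final result:', result);
});
```

A waterfall runs its steps strictly in sequence, so on its own it suits a fixed chain of requests; fanning out over a tree of unknown size still needs some way to wait for several branches at once.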

【Comments】:

【Solution 4】:

If your recursive function is synchronous, just call the callback on the next line:

var start = function(callback) {
  request.get({
    url: 'aaa.com'
  }, function (error, response, body) {
    var startingPlace = JSON.parse(body).id;
    recurse(startingPlace, otherFunc);
    // Call output function AFTER recursion has completed
    callback();
  });
};

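Otherwise, you need to keep a reference to the callback in your recursive function.

Pass the callback to the function as a parameter and call it when it finishes.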

var start = function(callback) {
  request.get({
    url: 'aaa.com'
  }, function (error, response, body) {
    var startingPlace = JSON.parse(body).id;
    recurse(startingPlace, otherFunc, callback);
  });
};

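【Comments】:

This gets me closer, but I still can't determine exactly when the recursive calls end. I have no set criterion for ending (I am traversing something like a tree with an unspecified set of nodes), so I am not sure when to call the callback inside the recursive function. Still, thank you for this step in the right direction.
Here is the full set of code. I am still having the recursion problem: I can't get the program to wait for all the for loops and recursion to finish before calling the final function. With setTimeout everything works, so this is definitely an asynchronous problem, but obviously I can't go that route. I have updated the code above to get more help.

【Solution 5】:

Build your code from this example: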

var update = function (callback){
    //Do stuff
    callback(null);
}

function doUpdate() {
    update(updateDone)
}

function updateDone(err) {
    if (err)
        throw err;
    else
        doUpdate()
}

doUpdate();
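
The pattern above restarts the work from its own completion callback and has no stopping condition. A bounded variant, assuming the unit of work can report whether more rounds are needed, might look like the sketch below. The remaining counter, the `more` flag and the `done` parameter are assumptions added for illustration and are not part of the original answer.

```javascript
// Hypothetical unit of work: completes asynchronously and reports through
// `more` whether another round is needed.
let remaining = 3;
function update(callback) {
  setImmediate(() => {
    remaining -= 1;
    callback(null, remaining > 0);
  });
}

// Repeats update until it reports no more work, then calls done exactly once.
function doUpdate(done) {
  update((err, more) => {
    if (err) return done(err);
    if (more) return doUpdate(done);
    done(null);
  });
}

doUpdate((err) => {
  if (err) throw err;
  console.log('all updates finished');
});
```

Because each round is scheduled with setImmediate, the repeated calls do not pile up on the call stack.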

【Comments】:

【Solution 6】:

Using ES6, 'es6-deferred' and 'q', you can try the following:

var Q = require('q');
var Deferred = require('es6-deferred');

const process = (id) => {
    var request = new Deferred();

    const ids = []; // placeholder: do something and get the data
    const subPromises = ids.map(id => process(id));

    Q.all(subPromises).then(function () {
        request.resolve();
    })
    .catch(error => {
        console.log(error);
    });

    return request.promise
}

process("testId").then(() => {
    console.log("done");
});
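
The same recursive shape can also be written with native promises, without either library. The sketch below assumes a Node.js version with async/await support and uses a simulated lookup; CHILDREN, fetchChildIds and processTree are hypothetical names introduced for illustration. Promise.all settles only after every nested call has resolved, so the outer promise completes after the whole tree has been processed.

```javascript
// Hypothetical stand-in for the data source: each id maps to its child ids.
const CHILDREN = { testId: ['x', 'y'], x: ['z'], y: [], z: [] };

// Simulates an asynchronous lookup that resolves with the child ids.
function fetchChildIds(id) {
  return new Promise((resolve) => {
    setImmediate(() => resolve(CHILDREN[id] || []));
  });
}

// Resolves with every id in the subtree once all nested lookups have settled.
async function processTree(id) {
  const childIds = await fetchChildIds(id);
  const subtrees = await Promise.all(childIds.map((childId) => processTree(childId)));
  return [id].concat(...subtrees);
}

processTree('testId')
  .then((ids) => {
    console.log('done:', ids.join(', '));
  })
  .catch((error) => {
    console.error(error);
  });
```

A rejection anywhere in the tree propagates to the single catch at the top, which avoids the risk of a deferred that is never resolved after an error.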

【Comments】: