【Question title】: How to do something when all the async functions called in a for loop have completed?
【Posted】: 2021-02-24 15:42:14
【Question】:

I am using puppeteer to scrape pages in parallel across multiple tabs. To open several tabs with the same URL, I use a for loop like this:

const startScraping = async (url) => {

    for (let i of MyArray) {
        const page = await browser.newPage();

        page.goto(url).then(() => {
            scrapePage(page); // This is the function that scrapes the page.
                              // It is also an async function.
        });
    }
 
    return new Promise((resolve, reject) => {

        resolve("Done");
        reject("Error");

    });

}

startScraping(url).then((data) => {

  console.log(data);

})

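The problem is that the promise is returned immediately after the loop, but I want it to be returned only after all the pages have been scraped.

Can anyone help me?

PS: scrapePage() is also an async function.

Thanks in advance.

Just to explain the scenario: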

async function func() {
    setTimeout(() => {
        return "Done";
    }, 3000);
}

async function scrapeSingle(url) {
    return [url, await func()];
}

let myArray = [1, 2, 3, 4, 5];

const parallelScrapes = myArray.map((url) => scrapeSingle(url));
Promise.all(parallelScrapes).then((data) => {
    console.log(data);
});

Here I want it to print [[1, "Done"], [2, "Done"], [3, "Done"], [4, "Done"], [5, "Done"]] after 3 seconds, but it immediately prints [[ 1, undefined ], [ 2, undefined ], [ 3, undefined ], [ 4, undefined ], [ 5, undefined ]].

【Comments】:

    Tags: javascript promise async-await puppeteer es6-promise


    【Solution 1】:

    You are mixing and matching async, then, and even new Promise().

    The serial solution is:

    const startScraping = async (url) => {
      const data = [];
      for (let i of MyArray) {
        const page = await browser.newPage();
        await page.goto(url);
        const result = await scrapePage(page);
        data.push([i, result]);
      }
      return data;
    };
    
    startScraping(url).then((data) => {
      console.log(data);
    });
    

    To process all the URLs in myArray in parallel, you need to use Promise.all():

    async function scrapeSingle(browser, url) {
      const page = await browser.newPage();
      await page.goto(url);
      return [url, await scrapePage(page)];
    }
    
    const parallelScrapes = myArray.map((url) =>
      scrapeSingle(browser, url),
    );
    Promise.all(parallelScrapes).then((data) => {
      console.log(data);
    });
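For anyone who wants to try the parallel pattern without launching a browser, here is a minimal sketch. The names fakeScrape and scrapeAll are hypothetical, and the timer only stands in for browser.newPage(), page.goto() and scrapePage(), so treat it as an illustration rather than real puppeteer code. It shows that Promise.all() resolves only once every task has finished, and that the results come back in input order even when later tasks finish first.

```javascript
// Hypothetical stand-in for "open a page and scrape it":
// resolves with a string after `ms` milliseconds.
function fakeScrape(url, ms) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(`scraped ${url}`), ms);
  });
}

async function scrapeSingle(url, ms) {
  return [url, await fakeScrape(url, ms)];
}

// Starts every scrape right away, then waits for all of them together.
async function scrapeAll(urls) {
  // Later URLs get shorter delays, so they finish first;
  // the result array still follows the order of `urls`.
  const tasks = urls.map((url, index) =>
    scrapeSingle(url, 30 * (urls.length - index)),
  );
  return Promise.all(tasks);
}

scrapeAll(["a", "b", "c"]).then((data) => {
  console.log(data);
});
```

Note that Promise.all() rejects as soon as any one task rejects, so a single failing page makes the whole call fail unless each task handles its own errors.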
    

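    【Comments】:

    • Thanks for the help, but this is not what I want. Here console.log(data) is called as soon as all the tabs have been opened. What I want is for it to be called only after all the scraping of all the pages has finished.
    • My bad, hold on... OK, added the missing await :)
    • Even this doesn't work. Please consider the "Just to explain the scenario" part in the question above; that is what is happening.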
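A note on the "Just to explain the scenario" snippet from the question: its func() is async but never awaits anything, and the return "Done" belongs to the setTimeout callback, whose return value is discarded. The promise returned by func() therefore resolves to undefined right away. One possible fix, sketched below, is to wrap setTimeout in a promise and await it. The delay helper, the runAll wrapper and the ms parameter are my own additions for illustration, not part of the original code.

```javascript
// Wraps setTimeout in a promise that resolves after `ms` milliseconds.
function delay(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

async function func(ms) {
  await delay(ms); // actually wait for the timer
  return "Done";   // this return belongs to func(), not to a callback
}

async function scrapeSingle(url, ms) {
  return [url, await func(ms)];
}

const myArray = [1, 2, 3, 4, 5];

// Runs all the "scrapes" in parallel and resolves when every one is done.
function runAll(ms) {
  return Promise.all(myArray.map((url) => scrapeSingle(url, ms)));
}

runAll(3000).then((data) => {
  console.log(data);
});
```

With this change the log appears after about 3 seconds and every entry holds "Done" instead of undefined. The same reasoning applies to scrapePage(): Promise.all() can only wait for work that the returned promise itself waits for.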
    【Solution 2】:

    This works.

    const startScraping = async (url) => {
    
        let tasks = [];
    
        for (let i of MyArray) {
            const page = await browser.newPage();
    
            await page.goto(url);
    
            tasks.push(scrapePage(page))
        }
    
        await Promise.all(tasks);
     
        // An async function already returns a promise, so the value can be
        // returned directly. The original reject("Error") after resolve() was a no-op.
        return "Done";
    
    }
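As a side note on the return value: an async function always returns a promise, so a plain return resolves it and a thrown error (including a rejected await) rejects it. The sketch below illustrates this under stated assumptions: fakeScrapePage is a hypothetical stand-in for scrapePage(), and the pages argument with its fail flag exists only to trigger a rejection on demand.

```javascript
// Hypothetical stand-in for scrapePage(): resolves with the page name,
// or rejects when `fail` is true.
function fakeScrapePage(name, fail = false) {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      if (fail) reject(new Error(`could not scrape ${name}`));
      else resolve(name);
    }, 20);
  });
}

async function startScraping(pages) {
  const tasks = pages.map((p) => fakeScrapePage(p.name, p.fail));
  // If any task rejects, this await throws and startScraping() rejects too.
  await Promise.all(tasks);
  return "Done"; // resolves the promise returned by startScraping()
}

startScraping([{ name: "a" }, { name: "b" }])
  .then((data) => console.log(data))
  .catch((err) => console.error(err.message));
```

The caller can then use .then() for the success case and .catch() for the failure case, without any explicit new Promise() inside the function.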
    

    【Comments】:
