Continue after an exception in C#

【Question title】: Continue after an exception in C#
【Posted】: 2024-01-20 16:34:01
【Question】:

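I am trying to get the document type of approx. 3k links, but I always get an exception somewhere around the 700-900 mark.

How can I continue from the point where the exception occurred, so that I don't have to start again from zero? Is that even possible?

This is the code I am using: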

    try
    {
        Parallel.ForEach(linkList, link =>
        {
            stopwatch.Restart();
            Console.Write($"Downloading page {index++} of {linkList.Count}...");
            documents.Add(LoadPage(link));
            Console.Write($" in {stopwatch.Elapsed.TotalMilliseconds} ms");
            Console.WriteLine();
        });

        return documents;
    }
    catch (Exception e)
    {
        ???
    }

【Comments】:

Have you considered moving the try-catch inside the parallel loop?

Tags: c# exception web-scraping try-catch parallel.foreach

【Solution 1】:

Try wrapping the code inside the loop body in a try-catch, so that one failing link does not abort the whole loop:

        Parallel.ForEach(linkList, link => 
        {
            try
            {
                stopwatch.Restart();
                Console.Write($"Downloading page {index++} of {linkList.Count}...");
                documents.Add(LoadPage(link));
                Console.Write($" in {stopwatch.Elapsed.TotalMilliseconds} ms");
                Console.WriteLine();
            }
            catch (Exception e)
            {
                ???
            }
        });

        return documents;

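Edit:

You may also want to look at the thread-safe collections that C# offers (the System.Collections.Concurrent namespace), because the regular collections such as List<T> are not thread-safe.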
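
A minimal sketch of that advice, under some assumptions: the pages are represented as plain strings instead of the question's document type, and the loader is passed in as a hypothetical `loadPage` delegate that stands in for the question's `LoadPage`. `ConcurrentBag<T>` replaces the plain list, and `Interlocked.Increment` replaces the racy `index++`.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class SafeParallelDownload
{
    // loadPage is a hypothetical stand-in for the question's LoadPage method.
    public static (List<string> Documents, List<string> Failed) DownloadAll(
        IReadOnlyList<string> linkList, Func<string, string> loadPage)
    {
        var documents = new ConcurrentBag<string>(); // safe for concurrent Add
        var failed = new ConcurrentBag<string>();    // links that threw
        int index = 0;

        Parallel.ForEach(linkList, link =>
        {
            // Atomic increment, so every iteration sees a unique number.
            int current = Interlocked.Increment(ref index);
            try
            {
                documents.Add(loadPage(link));
                Console.WriteLine($"Downloaded page {current} of {linkList.Count}");
            }
            catch (Exception e)
            {
                // Record the failure and let the other iterations continue.
                failed.Add(link);
                Console.WriteLine($"Page {current} ({link}) failed: {e.Message}");
            }
        });

        return (new List<string>(documents), new List<string>(failed));
    }
}
```

The returned `Failed` list tells the caller which links to try again, so nothing has to restart from zero. Note that `ConcurrentBag<T>` does not preserve the order of the input links.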

【Discussion】:

【Solution 2】:

You just need to handle the exceptions inside the ForEach:

Parallel.ForEach(linkList, link =>
{
    try
    {
        ...
    }
    catch (Exception ex)
    {
        // log
    }
});

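However, you have more problems:

This looks like an IO-bound workload, which is not a good fit for Parallel.ForEach.
documents.Add does not look thread-safe.
Your index will be wrong, because index++ is not atomic across threads.

Honestly, this really looks like a job for TPL Dataflow, which lets you use async and await cleanly with IO-bound workloads. Using async and await stops you from thrashing the task scheduler and lets the IO completion ports do their work, which frees up the thread pool.

It also lets you build more complex pipelines and feed failed jobs back into the pipeline when needed, among many other advantages.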
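
As a rough sketch of what the TPL Dataflow approach might look like: the example below assumes a hypothetical asynchronous loader `loadPageAsync` (not part of the question), represents pages as strings, and picks an arbitrary `maxParallel` of 8. It uses an `ActionBlock<T>` from `System.Threading.Tasks.Dataflow` to limit how many downloads run at once.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class DataflowDownload
{
    // loadPageAsync is a hypothetical async replacement for LoadPage.
    public static async Task<(List<string> Documents, List<string> Failed)> DownloadAllAsync(
        IEnumerable<string> links,
        Func<string, Task<string>> loadPageAsync,
        int maxParallel = 8)
    {
        var documents = new ConcurrentBag<string>();
        var failed = new ConcurrentBag<string>();

        // The block processes posted links, at most maxParallel at a time.
        var block = new ActionBlock<string>(async link =>
        {
            try
            {
                documents.Add(await loadPageAsync(link));
            }
            catch (Exception)
            {
                // Keep the link so the caller can post it again later.
                failed.Add(link);
            }
        },
        new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = maxParallel });

        foreach (var link in links)
        {
            block.Post(link);
        }

        block.Complete();       // no more input
        await block.Completion; // wait until every posted link is handled

        return (new List<string>(documents), new List<string>(failed));
    }
}
```

Because the delegate awaits the download instead of blocking, no thread-pool thread is held while a request is in flight. The `Failed` list can be passed to another call of `DownloadAllAsync` to retry those links.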

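【Discussion】:

Actually, I tried using async/await, but it did not work. After that I tried splitting the link list into smaller lists and tried again with Thread.Sleep and async + await... and it failed again.
@Sayit Yes, it requires a change of strategy and has a small learning curve, but it will work a lot better. However, that deserves a question of its own.

【Solution 3】:

OK guys, this is the solution that got me to my goal.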

        var index = 1;
        Parallel.ForEach(linkList,  link => { GetDocuments(stopwatch, index++, linkList, documents, link); });

        if (FailedDownloads.Count > 0)
        {
            linkList = new List<string>(FailedDownloads);
            FailedDownloads.Clear();
            Parallel.ForEach(linkList,
                link => { GetDocuments(stopwatch, index++, linkList, documents, link); });
        }
        return documents;
    }

    private void GetDocuments(Stopwatch stopwatch, int index, List<string> linkList, List<HtmlDocument> documents, string link)
    {
        stopwatch.Restart();
        Console.Write($"Downloading page {index} of {linkList.Count}...");
        try
        {
            documents.Add(LoadPage(link));
            Console.Write($" in {stopwatch.Elapsed.TotalMilliseconds} ms");
        }
        catch (AggregateException e)
        {
            // Queue HTTP failures for the retry pass; rethrow anything else.
            if (e.InnerExceptions[0] is HttpRequestException)
            {
                FailedDownloads.Add(link);
                Console.WriteLine(e);
            }
            else
            {
                throw;
            }
        }
    }
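
The solution above retries the failed links exactly once. One possible generalization, sketched here under assumptions, repeats the retry pass for a bounded number of rounds. The `loadPage` delegate and the `maxRounds` parameter are hypothetical, pages are represented as strings, and thread-safe collections are used in place of the shared lists.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

public static class RetryingDownload
{
    // Mirrors the answer's check: an HttpRequestException, either thrown
    // directly or as the first inner exception of an AggregateException.
    private static bool IsHttpFailure(Exception e) =>
        e is HttpRequestException ||
        (e is AggregateException agg &&
         agg.InnerExceptions.Count > 0 &&
         agg.InnerExceptions[0] is HttpRequestException);

    public static (List<string> Documents, List<string> StillFailed) DownloadWithRetries(
        IReadOnlyList<string> links, Func<string, string> loadPage, int maxRounds = 3)
    {
        var documents = new ConcurrentBag<string>();
        IReadOnlyList<string> pending = links;

        // Each round processes only the links that failed in the previous one.
        for (int round = 1; round <= maxRounds && pending.Count > 0; round++)
        {
            var failed = new ConcurrentBag<string>();
            Parallel.ForEach(pending, link =>
            {
                try
                {
                    documents.Add(loadPage(link));
                }
                catch (Exception e) when (IsHttpFailure(e))
                {
                    failed.Add(link); // retry in the next round
                }
            });
            pending = failed.ToArray();
        }

        return (documents.ToList(), pending.ToList());
    }
}
```

Exceptions that are not HTTP failures still propagate out of `Parallel.ForEach` (wrapped in an `AggregateException`), which matches the `throw;` branch in the answer. Links that keep failing after `maxRounds` are returned in `StillFailed` instead of being retried forever.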

【Discussion】: