【Question title】: HttpClient throws exception on redirect
【Posted】: 2017-12-17 01:38:12
【Question description】:

I'm trying to download a web page using HttpClient. This is my code:

private async Task<string> _doRequest(string url)
{
  string result = string.Empty;

  var client = HttpClient;
  using(var request = new HttpRequestMessage()
  {
    RequestUri = new Uri(url),
    Method = HttpMethod.Get
  }){
    using (HttpResponseMessage response = client.SendAsync(request).Result)
      if (response.Headers.Location == null)
      {
        using (HttpContent content = response.Content)
        {
          result = await content.ReadAsStringAsync();
        }
      }
      else
      {
        result = await _doRequest(response.Headers.Location.ToString());
      }
  };

  return result;
}

HttpClient is a static variable, initialized as follows:

  var handler = new HttpClientHandler();
  handler.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
  handler.AllowAutoRedirect = false;
  HttpClient = new HttpClient(handler);
  HttpClient.DefaultRequestHeaders.Add("User-Agent", @"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36");

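When I run the code with url = "https://www.gls-italy.com/?option=com_gls&view=track_e_trace&mode=search&numero_spedizione=TE170187747&tipo_codice=nazionale"

I get the following:

And this is what I get when I try it with curl:

I'm lost here. To me it looks like a valid 302 Location, but for some reason HttpClient doesn't think so and just throws an exception.

To be clear, I initially relied on the AllowAutoRedirect default and let HttpClient follow the redirect, but that didn't work. I got the same exception, which led me to try handling the redirect myself, without success.

Does anyone know what is going on, and how I can make this work?

Thanks in advance.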

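【Comments】:

  • It's not HttpClient. I tried your link with HttpWebRequest and got a WebException.SendFailure. It's a bad 302 response: the connection is closed. The Location header gives the new address as https://wwwdr.(...). If you change www to wwwdr, the server responds correctly.
  • Not an answer to your question, but a tip: avoid using .Result inside an async method, since it can cause a deadlock. If you await client.SendAsync(request) instead, you get the unwrapped result, avoid the deadlock scenario, and probably make better use of your threads.
  • @Jimi I know it works if I request wwwdr, but they could change that. I need a way to follow the redirect. curl does follow it, so there must be a way to handle it from C#.
  • You may not have explicitly set the protocol to TLS 1.2. This server only uses that. So set ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12 before creating the HttpRequest.
  • @Jimi I tried that earlier and it doesn't change the result, which makes sense because I'm on .NET 4.6.2. I think the problem is the Connection: close response. There must be a way to read the response with HttpClient even with Connection: close...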
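
To illustrate the two suggestions above (awaiting SendAsync instead of calling .Result, and following the redirect by hand), here is a minimal sketch. It is not code from the thread: the class name RedirectFollower, the helpers ResolveLocation and IsRedirect, and the limit of 10 hops are all assumptions made for illustration. It assumes the HttpClient was built on a handler with AllowAutoRedirect = false. It only shows the control flow; it does not by itself work around the closed-connection 302 described in the first comment.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class RedirectFollower
{
    // Hypothetical safety limit so a redirect loop cannot run forever.
    private const int MaxRedirects = 10;

    // A Location header may be relative; resolve it against the current URI.
    public static Uri ResolveLocation(Uri current, Uri location)
    {
        return location.IsAbsoluteUri ? location : new Uri(current, location);
    }

    // Compare numeric codes so 308 works even where the enum lacks a name for it.
    public static bool IsRedirect(HttpStatusCode status)
    {
        int code = (int)status;
        return code == 301 || code == 302 || code == 303 || code == 307 || code == 308;
    }

    // Assumes 'client' uses a handler with AllowAutoRedirect = false.
    public static async Task<string> GetFollowingRedirectsAsync(HttpClient client, string url)
    {
        Uri current = new Uri(url);
        for (int hop = 0; hop <= MaxRedirects; hop++)
        {
            // await instead of .Result: no blocking, no deadlock risk.
            using (HttpResponseMessage response = await client.GetAsync(current))
            {
                Uri location = response.Headers.Location;
                if (!IsRedirect(response.StatusCode) || location == null)
                {
                    return await response.Content.ReadAsStringAsync();
                }
                current = ResolveLocation(current, location);
            }
        }
        throw new HttpRequestException("Too many redirects.");
    }
}
```

A loop is used rather than recursion so that each response is disposed before the next request is sent, and so the hop count is easy to cap.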

Tags: c# redirect web-scraping dotnet-httpclient


【Solution 1】:

Class-level objects:

HttpClientHandler Http_Handler = new HttpClientHandler();
HttpClient Http_Client = new HttpClient();
CookieContainer HttpClCookieJar = new CookieContainer();

HttpClient setup stub:

private void HttpClient_Setup()
{
   Http_Handler.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
   Http_Handler.AllowAutoRedirect = false;
   Http_Handler.CookieContainer = HttpClCookieJar;
   Http_Handler.UseCookies = true;
   Http_Handler.UseDefaultCredentials = true;
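   Http_Client = new HttpClient(Http_Handler);
   // Set the timeout on the client that is actually used. new TimeSpan(30000) is 30000 ticks (3 ms);
   // 30 seconds is assumed to be the intended value.
   Http_Client.Timeout = TimeSpan.FromSeconds(30);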
   Http_Client.DefaultRequestHeaders.Add("User-Agent", @"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:56.0) Gecko/20100101 Firefox/56.0");
   Http_Client.DefaultRequestHeaders.Add("Accept-Language", "it-IT,it;q=0.8,en-US;q=0.5,en;q=0.3");
   Http_Client.DefaultRequestHeaders.Add("Accept", "*/*");
   Http_Client.DefaultRequestHeaders.Add("Cache-Control", "no-cache");
}

Asynchronous HttpClient request:

public async Task<string> HttpClient_Request(string RequestURL)
{
   string _responseHtml = string.Empty;
   ServicePointManager.SecurityProtocol = SecurityProtocolType.Ssl3 | 
                                          SecurityProtocolType.Tls11 | 
                                          SecurityProtocolType.Tls12;
   try
   {
      using (HttpRequestMessage _requestMsg = new HttpRequestMessage())
      {
         _requestMsg.Method = HttpMethod.Get;
         _requestMsg.RequestUri = new Uri(RequestURL);

         using (HttpResponseMessage _response = await Http_Client.SendAsync(_requestMsg))
         {
            using (HttpContent _content = _response.Content)
            {
               _responseHtml = await _content.ReadAsStringAsync();
            };
         };
      };
   }
   catch (HttpRequestException eW)
   {
      Console.WriteLine("Message: {0}  Source: {1}", eW.Message, eW.Source);
   }
   catch (Exception eX)
   {
      Console.WriteLine("Message: {0}  Source: {1}", eX.Message, eX.Source);
   }
   return _responseHtml;
}
