[Posted]: 2019-12-10 00:34:29
[Question]:
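I can't use the Google API in this project, but I need to run a simple Google query. I do this with a WebClient: I set SSL3 and TLS 1.2, set the headers manually (not sure whether that helps), and simply send a GET request. For some reason the Google request takes 10 seconds, while the Stack Overflow request takes only 3 seconds. In Chrome, however, both pages load instantly. What is the bottleneck when using WebClient, and how can I make an SSL GET request as fast as Chrome does?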
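Second question: if the page contains JavaScript, how can I execute that JS on the retrieved "document" without rendering the whole thing in a web browser?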
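Any help is appreciated.
Edit: Removing the header-modification code speeds things up, but Google is still very slow. I assume they do this on purpose? Is there a way around it?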
// in main
WebCrawler wc = new WebCrawler();
string page = wc.load("https://stackoverflow.com/questions/20064505/requesting-html-over-https-with-c-sharp-webclient");
page = wc.load("https://www.google.com/maps?q=computer+shops+near+me&rlz=1C1GCEA_enZA855ZA855&um=1&ie=UTF-8&sa=X&ved=0ahUKEwi1lY-c4eDjAhUtWhUIHf8DDKUQ_AUIEigB");
...
// WebCrawler class
using System;
using System.IO;
using System.Net;
using System.Net.Security;
using System.Security.Cryptography.X509Certificates;

class WebCrawler
{
    WebClient webClient;

    public WebCrawler()
    {
        webClient = new WebClient();
        ServicePointManager.ServerCertificateValidationCallback += ValidateRemoteCertificate;
        // Note: this assignment is overwritten two lines below, so only TLS 1.2 ends up enabled.
        ServicePointManager.SecurityProtocol = SecurityProtocolType.Ssl3;
        ServicePointManager.Expect100Continue = true;
        ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;
    }

    // Sends a GET request to the given URI and returns the response body as a string.
    public string load(string uri)
    {
        Uri address = new Uri(uri);
        webClient.Headers.Set(HttpRequestHeader.UserAgent, "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36");
        webClient.Headers.Set(HttpRequestHeader.Referer, "https://www.google.com/");
        // webClient.Headers.Set(HttpRequestHeader.Cookie,
        var stream = webClient.OpenRead(address);
        // Disposing the reader also disposes the underlying response stream.
        using (StreamReader sr = new StreamReader(stream))
        {
            var page = sr.ReadToEnd();
            return page;
        }
    }

    // Accepts the certificate only when validation reported no policy errors; logs any error otherwise.
    private static bool ValidateRemoteCertificate(object sender, X509Certificate cert, X509Chain chain, SslPolicyErrors error)
    {
        if (error == System.Net.Security.SslPolicyErrors.None)
        {
            return true;
        }
        Console.WriteLine("X509Certificate [{0}] Policy Error: '{1}'",
            cert.Subject,
            error.ToString());
        return false;
    }
}
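To narrow down where the time goes, it may help to time each request on its own. The following is only a rough sketch, not a fix: the `Timing.Measure` helper and the `TimedFetch` class are hypothetical names that are not part of the code above, and the two URLs are placeholders. Setting `WebClient.Proxy` to `null` skips automatic proxy detection; whether that is actually a factor here is an assumption, so the line can be removed to compare.

```csharp
using System;
using System.Diagnostics;
using System.Net;

static class Timing
{
    // Hypothetical helper: runs the action once, returns its result,
    // and reports the wall-clock time it took through 'elapsed'.
    public static T Measure<T>(Func<T> action, out TimeSpan elapsed)
    {
        var sw = Stopwatch.StartNew();
        T result = action();
        sw.Stop();
        elapsed = sw.Elapsed;
        return result;
    }
}

static class TimedFetch
{
    static void Main()
    {
        ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;

        using (var client = new WebClient())
        {
            // Assumption: automatic proxy detection may add delay.
            // Remove this line to compare timings with detection enabled.
            client.Proxy = null;

            // Placeholder URLs; substitute the pages being compared.
            foreach (var url in new[] { "https://stackoverflow.com/", "https://www.google.com/" })
            {
                TimeSpan elapsed;
                string page = Timing.Measure(() => client.DownloadString(url), out elapsed);
                Console.WriteLine("{0}: {1} chars in {2:F0} ms", url, page.Length, elapsed.TotalMilliseconds);
            }
        }
    }
}
```

Running each URL more than once and comparing the first timing with the later ones would show whether the delay is a one-time setup cost or is paid on every request.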
[Comments]:
Tags: c# google-maps https webclient