【Question title】: Trying to get authentication cookie(s) using HttpWebRequest
【Posted】: 2012-08-01 15:15:56
【Question description】:

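I have to scrape a table from a secure site, but I can't log in to the page and retrieve the authentication token and any other associated cookies. Am I doing something wrong here?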

public NameValueCollection LoginToDatrose()
{
    var loginUriBuilder = new UriBuilder();
    loginUriBuilder.Host = DatroseHostName;
    loginUriBuilder.Path = BuildURIPath(DatroseBasePath, LOGIN_PAGE);
    loginUriBuilder.Scheme = "https";

    var boundary = Guid.NewGuid().ToString();
    var postData = new NameValueCollection();
    postData.Add("LoginName", DatroseUserName);
    postData.Add("Password", DatrosePassword);

    var data = Encoding.ASCII.GetBytes(postData.ToQueryString(false));
    var request = WebRequest.Create(loginUriBuilder.Uri) as HttpWebRequest;
    request.Method = "POST";
    request.ContentType = "application/x-www-form-urlencoded";
    request.ContentLength = data.Length;
    using (var d = request.GetRequestStream())
    {
        d.Write(data, 0, data.Length);
    }

    var response = request.GetResponse() as HttpWebResponse;
    var responseCookies = new NameValueCollection();
    foreach (var nvp in response.Cookies.OfType<Cookie>())
    {
        responseCookies.Add(nvp.Name, nvp.Value);
    }

    //using (var responseData = response.GetResponseStream())
    //using (var responseReader = new StreamReader(responseData))
    //{
    //    var theResponse = responseReader.ReadToEnd();
    //    Debug.WriteLine(theResponse);
    //}

    return responseCookies;

}

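I get no values in the returned object, yet nothing fails. The value of theResponse (when that block is not commented out) appears to be the HTML of the login page.

Any help would be greatly appreciated.

【Question discussion】:

    Tags: c# httpwebrequest screen-scraping webclient httpwebresponse


    【Solution 1】: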

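    OK, the problem here seems to be related to the 302 redirect that occurs after the credentials are accepted. HttpWebRequest follows the 302 automatically, so the response I was inspecting was the page after the redirect rather than the response that set the cookies.

    In the end I did things differently. First, I subclassed the WebClient class as follows: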

    public class CookiesAwareWebClient : WebClient
    {
        private CookieContainer outboundCookies = new CookieContainer();
        private CookieCollection inboundCookies = new CookieCollection();
    
        public CookieContainer OutboundCookies
        {
            get
            {
                return outboundCookies;
            }
        }
        public CookieCollection InboundCookies
        {
            get
            { 
                return inboundCookies; 
            }
        }
    
        public bool IgnoreRedirects { get; set; }
    
        protected override WebRequest GetWebRequest(Uri address)
        {
            WebRequest request = base.GetWebRequest(address);
            var httpRequest = request as HttpWebRequest;
            if (httpRequest != null)
            {
                // Attach the shared container so cookies are captured and resent.
                httpRequest.CookieContainer = outboundCookies;
                // Optionally stop at the 302 instead of following it.
                httpRequest.AllowAutoRedirect = !IgnoreRedirects;
            }
            return request;
        }
    
        protected override WebResponse GetWebResponse(WebRequest request)
        {
            WebResponse response = base.GetWebResponse(request);
            var httpResponse = response as HttpWebResponse;
            if (httpResponse != null)
            {
                // Remember the cookies set by the most recent response.
                inboundCookies = httpResponse.Cookies ?? inboundCookies;
            }
            return response;
        }
    }
    

    This gives me a WebClient that is cookie-aware and lets me control redirects. I then rewrote my login code as follows:

    public NameValueCollection LoginToDatrose()
    {
        var loginUriBuilder = new UriBuilder();
        loginUriBuilder.Host = DatroseHostName;
        loginUriBuilder.Path = BuildURIPath(DatroseBasePath, LOGIN_PAGE);
        loginUriBuilder.Scheme = "https";
    
        var postData = new NameValueCollection();
        postData.Add("LoginName", DatroseUserName);
        postData.Add("Password", DatrosePassword);
    
        var responseCookies = new NameValueCollection();
    
        using (var client = new CookiesAwareWebClient())
        {
            client.IgnoreRedirects = true;
            var clientResponse = client.UploadValues(loginUriBuilder.Uri, "POST", postData);
            foreach (var nvp in client.InboundCookies.OfType<Cookie>())
            {
                responseCookies.Add(nvp.Name, nvp.Value);
            }
        }
    
        return responseCookies;
    }
    

    ...and all is well.
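To scrape the table afterwards, the cookies returned by the login have to be sent with the next request. Below is a minimal sketch of one way to do that, not part of the original answer. It assumes every cookie applies to the whole host (path "/") and that the cookie values contain no characters that `Cookie` rejects, such as commas or semicolons. `CookieReuseSketch` and `ToCookieContainer` are hypothetical names introduced here.

```csharp
using System;
using System.Collections.Specialized;
using System.Net;

public static class CookieReuseSketch
{
    // Copies name/value pairs into a CookieContainer scoped to siteUri's host.
    // Assumption: path "/" is acceptable for every cookie; the original
    // domain, path and secure flags are not preserved by a NameValueCollection.
    public static CookieContainer ToCookieContainer(NameValueCollection cookies, Uri siteUri)
    {
        var container = new CookieContainer();
        foreach (string name in cookies.AllKeys)
        {
            // The cookie's domain is taken from siteUri when it is added.
            container.Add(siteUri, new Cookie(name, cookies[name], "/"));
        }
        return container;
    }
}
```

The resulting container can be assigned to the `CookieContainer` property of a later `HttpWebRequest`. If the same `CookiesAwareWebClient` instance is kept alive instead, its `OutboundCookies` container already carries the cookies forward, which avoids the conversion altogether.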

    【Discussion】:

    • You can use HttpWebRequest directly; it has an AllowAutoRedirect property that you can set to false.
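The suggestion in the comment could look roughly like the sketch below. It is an illustration, not tested against the site in the question. Two things differ from the code in the question: `AllowAutoRedirect` is false, so `GetResponse` returns the 302 itself, and a `CookieContainer` is assigned, which `HttpWebRequest` needs before `HttpWebResponse.Cookies` is populated. `LoginSketch`, `CreateLoginRequest`, `ToFormBody` and `Login` are hypothetical names; on recent .NET versions `WebRequest.Create` compiles with an obsolescence warning.

```csharp
using System;
using System.Collections.Specialized;
using System.Linq;
using System.Net;
using System.Text;

public static class LoginSketch
{
    // Builds a form POST that records cookies and does not follow the 302.
    public static HttpWebRequest CreateLoginRequest(Uri loginUri)
    {
        var request = (HttpWebRequest)WebRequest.Create(loginUri);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.CookieContainer = new CookieContainer();
        request.AllowAutoRedirect = false;
        return request;
    }

    // URL-encodes the fields as key=value pairs joined by '&'.
    public static string ToFormBody(NameValueCollection fields)
    {
        return string.Join("&", fields.AllKeys.Select(key =>
            Uri.EscapeDataString(key) + "=" +
            Uri.EscapeDataString(fields[key] ?? string.Empty)));
    }

    // Posts the form and returns the cookies set by the immediate response.
    public static CookieCollection Login(Uri loginUri, NameValueCollection fields)
    {
        var request = CreateLoginRequest(loginUri);
        var body = Encoding.UTF8.GetBytes(ToFormBody(fields));
        request.ContentLength = body.Length;
        using (var stream = request.GetRequestStream())
        {
            stream.Write(body, 0, body.Length);
        }
        using (var response = (HttpWebResponse)request.GetResponse())
        {
            // Read the collection before the response is disposed.
            return response.Cookies;
        }
    }
}
```

Calling `LoginSketch.Login(loginUriBuilder.Uri, postData)` with the values from the question would then return the cookies from the login response, assuming the server sets them on the redirect.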