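【Posted】:2019-11-28 16:48:25
【Question】:
I need to visit many different pages on a website and collect information from them. I'm not sure how to handle cookies. If I watch the network activity in the Chrome DevTools console (F12), I can see the request properties and cookies being sent. If I add the cookie specifically for one of the pages (see the commented-out con.setRequestProperty("Cookie", ...) line), the information is retrieved successfully.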
URL url = new URL(urlStr);
HttpURLConnection con = (HttpURLConnection) url.openConnection();
con.setRequestMethod("GET");
con.setRequestProperty("Host", county +"." +referer +".com");
con.setRequestProperty("Connection", "keep-alive");
con.setRequestProperty("Accept", "application/json, text/javascript, */*; q=0.01");
con.setRequestProperty("X-Requested-With", "XMLHttpRequest");
con.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36");
con.setRequestProperty("Origin", "http://evil.com/");
con.setRequestProperty("Referer", "https://" +county +"." +referer +".com/index.cfm?zaction=AUCTION&Zmethod=PREVIEW&AUCTIONDATE=" +df.format(date));
con.setRequestProperty("Accept-Language", "en-US,en;q=0.9");
//con.setRequestProperty("Cookie", "cfid=9ed9c083-4696-4712-950d-1c0ad0727883; cftoken=0; AWSELB=CF13C5A70AE16731FBD093515EF0DDB58935BEB4D69838721C70C3BED039F919AF343D891D9A2001BD1070AC4C076AA72DF0A7EA6AEED1091BCD24CC7203622E75C0DE5C92; _gcl_au=1.1.1696117075.1563489288; __utmc=119398810; __utmz=119398810.1563489288.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); CF_CLIENT_" +county.toUpperCase() +"_" +referer.toUpperCase() +"_TC=1563505029291; __utma=119398810.1711105058.1563489288.1563498837.1563505090.3; __utmt_UA-51657054-1=1; __utmb=119398810.10.10.1563505090; testcookiesenabled=disabled; CF_CLIENT_" +county.toUpperCase() +"_" +referer.toUpperCase() +"_LV=1563508162268; CF_CLIENT_" +county.toUpperCase() +"_" +referer.toUpperCase() +"_HC=221");
//handle cookies
String cookiesHeader = con.getHeaderField("Set-Cookie");
List<HttpCookie> cookies = HttpCookie.parse(cookiesHeader);
CookieManager cookieManager = new CookieManager();
cookies.forEach(cookie -> cookieManager.getCookieStore().add(null, cookie));
con.disconnect();
con = (HttpURLConnection) url.openConnection(); //create new connection with cookies
con.setRequestProperty("Cookie", StringUtils.join(cookieManager.getCookieStore().getCookies(), ";"));
BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream()));
StringBuilder stringBuilder = new StringBuilder();
while ((str = in.readLine()) != null) {
    stringBuilder.append(str);
}
in.close();
con.disconnect();
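However, if I use the code in the "//handle cookies" section instead (taken from the tutorial https://www.baeldung.com/java-http-request), an empty data set is returned. Can anyone spot what I'm doing wrong?
【Discussion】: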
-
Do you want to read cookies from the response, or send cookies as part of the request?
-
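I want to send them as part of the request. But how do I get them in the first place? The //handle cookies section is supposed to fetch them, disconnect, and then reconnect with the cookies in place, but it doesn't seem to work: the output is {"retHTML":"", "rlist":""}, an empty set.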
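One thing worth checking in the snippet above: con.getHeaderField("Set-Cookie") returns only a single header value, so if the server sets several cookies, most of them are lost before the second request. The following is a minimal sketch, not a confirmed fix for this site, of gathering every Set-Cookie value from getHeaderFields() and joining them into one Cookie header. The class name CookieHeaderSketch, the method buildCookieHeader, and the sample header values are assumptions made for illustration.

```java
import java.net.HttpCookie;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; the names here are not from the original post.
class CookieHeaderSketch {

    // Takes a response header map (the shape returned by
    // HttpURLConnection.getHeaderFields()) and builds a "Cookie" request
    // header value from every Set-Cookie entry it contains.
    static String buildCookieHeader(Map<String, List<String>> headerFields) {
        List<String> pairs = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : headerFields.entrySet()) {
            // The HTTP status line is stored under a null key, so skip it.
            if (entry.getKey() == null || !entry.getKey().equalsIgnoreCase("Set-Cookie")) {
                continue;
            }
            for (String headerValue : entry.getValue()) {
                for (HttpCookie cookie : HttpCookie.parse(headerValue)) {
                    // Send only name=value; attributes such as Path stay behind.
                    pairs.add(cookie.getName() + "=" + cookie.getValue());
                }
            }
        }
        return String.join("; ", pairs);
    }

    public static void main(String[] args) {
        // Made-up sample headers standing in for a real response.
        Map<String, List<String>> sample = new LinkedHashMap<>();
        sample.put(null, Arrays.asList("HTTP/1.1 200 OK"));
        sample.put("Set-Cookie", Arrays.asList("cfid=abc123; Path=/", "cftoken=0; Path=/; HttpOnly"));
        System.out.println(buildCookieHeader(sample)); // prints cfid=abc123; cftoken=0
    }
}
```

With live connections this would be used roughly as second.setRequestProperty("Cookie", CookieHeaderSketch.buildCookieHeader(first.getHeaderFields())). A connection returned by a fresh url.openConnection() starts with none of the request properties set on the earlier one, so headers such as X-Requested-With and Referer would need to be set again. Cookies that the browser creates through JavaScript (the __utm* ones, for example) do not arrive in Set-Cookie response headers, so they would not be captured this way.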
Tags: java cookies request httpurlconnection