【Posted】: 2021-07-29 06:29:19
【Question】:
I am using JSoup, and this is my code:
import java.io.IOException;
import org.jsoup.Connection.Response;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class ClassOLX {
    public static final String URL = "https://www.olx.com.pe/item/nuevo-nissan-march-autoland-iid-1103776672";

    public static void main(String[] args) throws IOException {
        // Request the status once and reuse it instead of connecting twice.
        int status = getStatusConnectionCode(URL);
        if (status == 200) {
            Document document = getHtmlDocument(URL);
            if (document != null) {
                String model = document.select(".rui-2CYS9").select(".itemPrice").text();
                System.out.println("Model: " + model);
            }
        } else {
            System.out.println(status);
        }
    }

    // Returns the HTTP status code, or -1 if the request itself failed.
    public static int getStatusConnectionCode(String url) {
        try {
            Response response = Jsoup.connect(url).userAgent("Mozilla/5.0").timeout(100000).ignoreHttpErrors(true).execute();
            return response.statusCode();
        } catch (IOException ex) {
            System.out.println(ex.getMessage());
            return -1;
        }
    }

    // Returns the parsed page, or null if it could not be fetched.
    public static Document getHtmlDocument(String url) {
        Document doc = null;
        try {
            doc = Jsoup.connect(url).userAgent("Mozilla/5.0").timeout(100000).get();
        } catch (IOException ex) {
            System.out.println(ex.getMessage());
        }
        return doc;
    }
}
This is the page:
I want to get the values of the following elements: itemPrice, _18gRm, itemTitle, _2FRXm
Thanks, everyone.
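For reference, here is a minimal sketch of how those four names could be targeted with Jsoup selectors. It assumes, without confirmation from the real page source, that `itemPrice` and `itemTitle` are values of a `data-aut-id` attribute and that `_18gRm` and `_2FRXm` are class names. The sample markup, the class name `SelectorSketch`, and the helper `extract` are all hypothetical. If that assumption holds, `.itemPrice` would match nothing, because a leading dot selects by class.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class SelectorSketch {
    // Hypothetical markup for illustration only; NOT copied from the real page.
    static final String SAMPLE_HTML =
            "<div class=\"rui-2CYS9\">"
            + "<span data-aut-id=\"itemPrice\" class=\"_18gRm\">S/ 100</span>"
            + "<h1 data-aut-id=\"itemTitle\" class=\"_2FRXm\">Sample title</h1>"
            + "</div>";

    // Parses the HTML and returns the text of the first match, or "" if none.
    static String extract(String html, String cssQuery) {
        Document doc = Jsoup.parse(html);
        Element el = doc.selectFirst(cssQuery);
        return el == null ? "" : el.text();
    }

    public static void main(String[] args) {
        // Attribute selectors: match on the (assumed) data-aut-id attribute.
        System.out.println("Price: " + extract(SAMPLE_HTML, "[data-aut-id=itemPrice]"));
        System.out.println("Title: " + extract(SAMPLE_HTML, "[data-aut-id=itemTitle]"));
        // Class selectors: match on the (assumed) class names.
        System.out.println("By class _18gRm: " + extract(SAMPLE_HTML, "._18gRm"));
        System.out.println("By class _2FRXm: " + extract(SAMPLE_HTML, "._2FRXm"));
        // A class selector for a name that is only an attribute value finds nothing.
        System.out.println("By .itemPrice: [" + extract(SAMPLE_HTML, ".itemPrice") + "]");
    }
}
```

Jsoup does not execute JavaScript, so if the page fills in these elements client-side, the selectors would return nothing on the fetched HTML regardless of how they are written.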
【Comments】:
Tags: java web-scraping jsoup