Getting all of a website's data with Jsoup: answers

【Question title】: get all data web with Jsoup
【Posted】: 2019-07-22 12:28:22
【Question description】:

I am trying to get all the data from a website, but I get an error on the line Document doc1 = Jsoup.connect(url).get(); Please help me!

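import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Wrapper class added so the snippet compiles; the name is a placeholder.
public class Main {
  public static void main(String[] args) throws IOException {
    File file = new File("out22.txt");
    FileWriter fw = new FileWriter(file);
    PrintWriter pw = new PrintWriter(fw);

    // Fetch the front page and collect every anchor that has an href.
    Document doc = Jsoup.connect("https://vnexpress.net/").get();
    String title = doc.title();
    System.out.println("Title : " + title);
    Elements links = doc.select("a[href]");
    for (Element link : links) {
      String url = link.attr("href"); // raw attribute value, may be relative such as "/"
      //System.out.println("\nLink: "+url);

      // The error is reported on this line.
      Document doc1 = Jsoup.connect(url).get();
      Elements title1 = doc1.select("h1[class=title_news_detail mb10]");
      Elements description = doc1.select("p[class=description]");
      Elements content = doc1.select("p[class=Normal]");
      String tieude = title1.text();    // article title
      String noidung = content.text();  // article body
      String mota = description.text(); // article description
      System.out.println(noidung);

      pw.println(tieude);
      pw.println("\n" + mota);
      pw.println("\n" + noidung);
    }
    pw.close(); // close once, after the loop, so every iteration can still write
  }
}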

【Comments】:

Try adding www., for example to vnexpress.net.
You are getting "/" as the value of the href attribute. Add a try/catch block as well as a condition that handles "/".
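
To illustrate the second comment: a raw href such as "/" is not an absolute URL, so it has to be resolved against the page address (or skipped) before it is passed to Jsoup.connect. Below is a minimal sketch using only java.net.URI. The class name LinkResolver and the method resolve are hypothetical helpers, not part of Jsoup, and "/some/page.html" is a made-up path. Jsoup itself offers link.absUrl("href") for the same purpose when the document has a base URI.

```java
import java.net.URI;
import java.util.Optional;

// Hypothetical helper (not part of Jsoup): turns a raw href into an absolute
// http(s) URL, or returns empty when the href cannot be fetched.
final class LinkResolver {
    private LinkResolver() {}

    static Optional<String> resolve(String baseUrl, String href) {
        // Skip missing or empty hrefs.
        if (href == null || href.trim().isEmpty()) {
            return Optional.empty();
        }
        try {
            // "/" and "/some/page.html" are resolved against the base URL;
            // already-absolute URLs are returned unchanged.
            URI resolved = URI.create(baseUrl).resolve(href.trim());
            String scheme = resolved.getScheme();
            boolean isHttp = "http".equalsIgnoreCase(scheme)
                    || "https".equalsIgnoreCase(scheme);
            // Drop javascript:, mailto:, and anything without a host.
            if (isHttp && resolved.getHost() != null) {
                return Optional.of(resolved.toString());
            }
            return Optional.empty();
        } catch (IllegalArgumentException e) {
            // Malformed href (for example one containing a space): skip it.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String base = "https://vnexpress.net/";
        System.out.println(resolve(base, "/"));
        System.out.println(resolve(base, "/some/page.html"));
        System.out.println(resolve(base, "javascript:;"));
    }
}
```

Inside the loop from the question, the result could be used as LinkResolver.resolve("https://vnexpress.net/", link.attr("href")).ifPresent(...), with the Jsoup.connect(...).get() call wrapped in try/catch for IOException so that one bad page does not stop the whole crawl.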

Tags: java jsoup

【Solution 1】:

The site at your URL filters incoming connections based on their request headers. If you want to get data from it, you should use Selenium.
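
If the site really does filter on request headers, sending browser-like headers is a lighter thing to try before moving to Selenium. Whether that is enough for this particular site is not confirmed here. The sketch below uses the JDK's java.net.http client (Java 11+). The class name BrowserLikeFetch, its methods, and every header value are assumptions chosen for illustration. With Jsoup the same idea would be expressed through Jsoup.connect(url).userAgent(...).header(...).

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper: fetches a page while sending browser-like headers.
final class BrowserLikeFetch {
    // Example value only; any current browser's User-Agent string would do.
    static final String USER_AGENT =
            "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
            + "(KHTML, like Gecko) Chrome/120.0 Safari/537.36";

    private BrowserLikeFetch() {}

    // Builds the GET request without sending it.
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(10))
                .header("User-Agent", USER_AGENT)
                .header("Accept", "text/html,application/xhtml+xml")
                .header("Accept-Language", "vi,en;q=0.8") // assumed preference
                .GET()
                .build();
    }

    // Sends the request and returns the HTML body, failing on non-200 replies.
    static String fetchHtml(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpResponse<String> response =
                client.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("HTTP " + response.statusCode() + " for " + url);
        }
        return response.body();
    }
}
```

The returned HTML string could then be handed to Jsoup.parse(html, baseUri) so the selectors from the question keep working. If the content is produced by JavaScript in the browser, headers alone will not help, and a browser-driving tool such as Selenium, as suggested above, is the more likely route.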

【Discussion】: