【Question title】: How to download/read html file via ftp url?
【Posted】: 2026-01-14 00:20:09
【Question】:

I can't get the HTML text of this HTML file over FTP. I use Beautiful Soup to read HTML files over http/https, but for some reason I can't download or read the file over ftp. Please help!

Here is the URL: a link

Here is my code so far.

BufferedReader reader = null;
String total = "";
String line;
ur = "ftp://ftp.legis.state.tx.us/bills/832/billtext/html/house_resolutions/HR00001_HR00099/HR00014I.htm"
try {
    URL url = new URL(ur);
    URLConnection urlc = url.openConnection();
    InputStream is = urlc.getInputStream(); // To download
    reader = new BufferedReader(new InputStreamReader(is, "UTF-8"));
        while ((line = reader.readLine()) != null)
            total += reader.readLine();

} finally {
    if (reader != null) 
        try { reader.close(); 
        } catch (IOException logOrIgnore) {}
}

【Comments】:

  • Can you post the stack trace of the error?

Tags: java url ftp download


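【Solution 1】:

This code works for me on Java 1.7.0_25. Note that your version stored only every other line, because it calls reader.readLine() both in the while condition and in the loop body.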

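import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;

// The class name is arbitrary; it is only here so the method compiles on its own.
public class FtpDownload {
    public static void main(String[] args) throws MalformedURLException, IOException {
        BufferedReader reader = null;
        String total = "";
        String line;
        String ur = "ftp://ftp.legis.state.tx.us/bills/832/billtext/html/house_resolutions/HR00001_HR00099/HR00014I.htm";
        try {
            URL url = new URL(ur);
            URLConnection urlc = url.openConnection();
            InputStream is = urlc.getInputStream(); // To download
            reader = new BufferedReader(new InputStreamReader(is, "UTF-8"));
            // Call readLine() once per iteration and append that same line.
            while ((line = reader.readLine()) != null) {
                total += line;
            }
        } finally {
            if (reader != null) {
                try {
                    reader.close();
                } catch (IOException logOrIgnore) {
                }
            }
        }
    }
}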
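
To see the skipped-line effect in isolation, here is a small sketch that reads from an in-memory string instead of the FTP server. The class and method names (ReadLineDemo, readSkipping, readAll) are made up for this illustration and are not part of the original code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ReadLineDemo {

    // Pattern from the question: readLine() runs in the condition AND in the body,
    // so the line read in the condition is thrown away.
    static String readSkipping(String text) {
        StringBuilder total = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                total.append(reader.readLine()); // second read: skips 'line'
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total.toString();
    }

    // Corrected pattern: one readLine() per iteration, and that line is kept.
    static String readAll(String text) {
        StringBuilder total = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                total.append(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total.toString();
    }

    public static void main(String[] args) {
        String text = "one\ntwo\nthree\nfour\n";
        System.out.println(readSkipping(text)); // prints "twofour"
        System.out.println(readAll(text));      // prints "onetwothreefour"
    }
}
```

With an odd number of lines the skipping version also appends the literal text "null", because the second readLine() call returns null at the end of the input. Note that readLine() strips line terminators, so both versions join the lines without newlines.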

【Comments】:

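    【Solution 2】:

    At first I thought this was related to the incorrect path resolution discussed here, but that did not help.

    I don't know exactly what is going wrong, but I could reproduce the error only with this FTP server on MacOS with Java 1.6.0_33-b03-424. I could not reproduce it with Java 1.7.0_25, so it may be worth checking for a Java update.

    Alternatively, you can use the commons FTPClient to retrieve the file: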

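    // Requires Apache Commons Net (org.apache.commons.net.ftp.FTPClient).
    FTPClient client = new FTPClient();
    client.connect("ftp.legis.state.tx.us");
    client.enterLocalPassiveMode();   // passive mode: the client opens the data connection
    client.login("anonymous", "");    // anonymous login with an empty password
    client.changeWorkingDirectory("bills/832/billtext/html/house_resolutions/HR00001_HR00099");
    InputStream is = client.retrieveFileStream("HR00014I.htm");
    // After reading and closing the stream, call client.completePendingCommand() to finish the transfer.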
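
The FTPClient calls above take the host, the directory, and the file name as separate strings, while the question starts from a single ftp:// URL. As a rough sketch, the URL can be split with java.net.URI from the standard library. The FtpUrlParts class and its parse method are hypothetical helpers written for this example, not part of Commons Net.

```java
import java.net.URI;

class FtpUrlParts {
    final String host;       // value for client.connect(...)
    final String directory;  // value for client.changeWorkingDirectory(...)
    final String fileName;   // value for client.retrieveFileStream(...)

    FtpUrlParts(String host, String directory, String fileName) {
        this.host = host;
        this.directory = directory;
        this.fileName = fileName;
    }

    // Splits an ftp:// URL into host, directory (without leading slash), and file name.
    static FtpUrlParts parse(String ftpUrl) {
        URI uri = URI.create(ftpUrl);
        String path = uri.getPath() == null ? "" : uri.getPath();
        int lastSlash = path.lastIndexOf('/');
        String directory = lastSlash < 0 ? "" : path.substring(0, lastSlash);
        if (directory.startsWith("/")) {
            directory = directory.substring(1); // relative to the login directory
        }
        String fileName = path.substring(lastSlash + 1);
        return new FtpUrlParts(uri.getHost(), directory, fileName);
    }

    public static void main(String[] args) {
        FtpUrlParts parts = parse(
            "ftp://ftp.legis.state.tx.us/bills/832/billtext/html/house_resolutions/HR00001_HR00099/HR00014I.htm");
        System.out.println(parts.host);
        System.out.println(parts.directory);
        System.out.println(parts.fileName);
    }
}
```

The three fields line up with the string arguments used in the snippet above. This assumes the server's login directory is the root of the URL path, which may not hold for every FTP server.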
    

    【Comments】: