Android：如何将 HTML Table 解析为 ListView？答案

【问题标题】：Android: How to parse HTML Table into ListView?Android：如何将 HTML Table 解析为 ListView？
【发布时间】：2013-06-04 04:53:09
【问题描述】：

您好，我想将 HTML 表解析为 Android ListView，但我不知道从哪里开始。该表有很多信息。有人可以帮我从这个开始吗？

提前致谢！

HTML 表格：http://intranet.staring.nl/toepassingen/rooster/lochem/2W2/2012090320120909/2W01533.htm（点击查看源代码）。

【问题讨论】：

标签： java android html android-listview html-table

【解决方案1】：

您首先需要将 HTML 表解析为数据结构，然后使用 ListView 显示该信息。尝试使用 JSoup 库进行 HTML 解析：http://jsoup.org/cookbook/introduction/parsing-a-document

【讨论】：

@davidbuzatto 根据他们的下载页面，它适用于 Android：jsoup.org/download

【解决方案2】：

我不知道你是否已经在这里得到了答案，但我对你建议的链接做了同样的事情，我会在这里发布我的代码，但它仍然很混乱，不要申请最新的时间表（第 9 小时)

我正在使用 HTML Cleaner 库来解析 html：

try {   
        HtmlCleaner hc = new HtmlCleaner();
        CleanerProperties cp = hc.getProperties();
        cp.setAllowHtmlInsideAttributes(true);
        cp.setAllowMultiWordAttributes(true);
        cp.setRecognizeUnicodeChars(true);
        cp.setOmitComments(true);

        String loc = sp.getString( Constants.pref_locatie      , "" );
        String per = sp.getString( Constants.pref_persoon      , "" );
        String oob = sp.getString( Constants.pref_onderofboven , "" );

        int counteruurmax;
        int[] pauze;
        if (oob.contains("onder")){
            pauze = Constants.pauzeo;
        } else if (oob.contains("boven")) {
            pauze = Constants.pauzeb;
        } else {
            return false;
        }

        String url = "";
        if (loc.contains("lochem")) {
            url += Constants.RoosterLochem;
            url += t.getDatum();
            url += "/";
            url += per;
            counteruurmax = 11;
        } else if (loc.contains("herenlaan")) {
            url += Constants.RoosterHerenlaan;
            url += per;
            counteruurmax = 13;
        } else if (loc.contains("beukenlaan")) {
            url += Constants.RoosterBeukenlaan;
            url += per;
            counteruurmax = 11;
        } else {
            return false;
        }

        String htmlcode = t.getHtml(url);
        TagNode html = hc.clean(htmlcode);
        Document doc = new DomSerializer(cp, true).createDOM(html);
        XPath xp = XPathFactory.newInstance().newXPath();
        NodeList nl = (NodeList) xp.evaluate(Constants.XPathRooster, doc, XPathConstants.NODESET);

        int counteruur = 1;
        int counterdag = 1;
        int decreaser  = 0;
        Boolean isPauze = false;
        RoosterItems RItems = new RoosterItems();
        RoosterItem  RItem  = null;
        for (int i = 0; i < nl.getLength(); i++){

            if ((counteruur == pauze[0]) || (counteruur == pauze[1]) || (counteruur == pauze[2])) {
                isPauze = true;
                decreaser++;
            }

            if (!isPauze) {
                RItem = new RoosterItem();
                switch (counterdag){
                case 1:
                    RItem.setDag("ma");
                    break;
                case 2:
                    RItem.setDag("di");
                    break;
                case 3:
                    RItem.setDag("wo");
                    break;
                case 4:
                    RItem.setDag("do");
                    break;
                case 5:
                    RItem.setDag("vr");
                    break;
                }

                Node n = nl.item(i);
                String content = n.getTextContent();
                if (content.length() > 1) {
                    RItem.setUur(""+(counteruur-decreaser));
                    NodeList t1 = n.getChildNodes();
                    NodeList t2 = t1.item(0).getChildNodes();
                    NodeList t3 = t2.item(0).getChildNodes();
                    for (int j = 0; j < t3.getLength(); j++) {
                        Node temp = t3.item(j);
                        if (t3.getLength() == 3) {
                            switch (j) {
                            case 0:
                                RItem.setLes(""+temp.getTextContent());
                                break;
                            case 1:
                                RItem.setLokaal(""+temp.getTextContent());
                                break;
                            case 2:
                                RItem.setDocent(""+temp.getTextContent());
                                break;
                            default:
                                return false;
                            }
                        } else if (t3.getLength() == 4) {
                            switch (j) {
                            case 0:
                                break;
                            case 1:
                                RItem.setLes("tts. " + temp.getTextContent());
                                break;
                            case 2:
                                RItem.setLokaal(""+temp.getTextContent());
                                break;
                            case 3:
                                RItem.setDocent(""+temp.getTextContent());
                                break;
                            default:
                                return false;
                            }
                        } else if (t3.getLength() == 1) {
                            RItem.setLes(""+temp.getTextContent());
                        } else {
                            return false;
                        }
                    }
                } else {
                    RItem.setUur("" + (counteruur-decreaser));
                    RItem.setLokaal("Vrij");
                }
                RItems.add(RItem);
            }
            if (counteruur == counteruurmax) { counteruur = 0; counterdag++; decreaser = 0;}
            counteruur++;
            isPauze = false;
        }

        if (RItems.size() > 0) {
            mSQL = new RoosterSQLAdapter(mContext);
            mSQL.openToWrite();
            mSQL.deleteAll();
            for (int j = 0; j < RItems.size(); j++) {
                RoosterItem insert = RItems.get(j);
                mSQL.insert(insert.getDag(), insert.getUur(), insert.getLes(), insert.getLokaal(), insert.getDocent());
            }
            if (mSQL != null) mSQL.close();
        }
        return true;
    } catch (ParserConfigurationException e) {
        e.printStackTrace();
        return false;
    } catch (XPathExpressionException e) {
        e.printStackTrace();
        return false;
    }

有一些常数，但我认为您可以自己猜到它们；）否则您知道如何向我询问它们：）

RoosterItem 类将保存一小时的所有变量，而 RoosterItems 将保存多个 RoosterItem

祝你好运！

【讨论】：

抱歉没有添加 XPath，这里是："/html/body/table[1]/tbody/tr/td" 请注意，这仅适用于使用 XPath 的 API
谢谢，我已经找到答案了。但是由于上面的代码示例很好，我会将其标记为最佳答案。（是的，所以你得到了积分；））

【解决方案3】：

到目前为止，我认为 JSoup 是提取或操作 HTML 的最佳方法之一.....

查看此链接：

http://jsoup.org/

但不知何故....这在我的情况下不起作用，所以我将整个 HTML 代码转换为字符串，然后解析它.....

【讨论】：