【Question title】: Fetch a particular child node using SAX Parser
【Posted】: 2020-03-08 02:02:14
【Question】:

I have the following XML:

<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
    <title>Game Analysis</title>
    <item>
        <title>Game</title>
        <description>ABC</description>
        <releaseDate>Sat, 21 Feb 2012 05:18:23 GMT</releaseDate>       
    </item>
    <item>
        <title>CoD</title>
        <description>XYZ</description>
        <releaseDate>Sat, 21 Feb 2011 05:18:23 GMT</releaseDate>            
    </item>
</channel>
</rss>

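I need to parse this XML, get all child nodes under each "item", and then check whether the item contains a "releaseDate" node. If it does not, I must throw an exception.

I also tried using XPath, but it did not work.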

    XPathFactory xPathfactory = XPathFactory.newInstance();
    XPath xpath = xPathfactory.newXPath();
    XPathExpression expr = xpath.compile("//channel/item");

    Object result = expr.evaluate(document, XPathConstants.NODESET);
    NodeList nodes = (NodeList) result;
    for (int i = 0; i < nodes.getLength(); i++) {
        System.out.println(nodes.item(i).getChildNodes());
    }

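【Comments】:

  • There seems to be some confusion here. SAX parsers do not build a node tree; they deliver a sequence of events to the application. You cannot use XPath directly with SAX. You can use a SAX parser to feed a tree builder such as DOM, JDOM2, or XOM, and then use XPath on the resulting tree.
  • This sounds like a student exercise, and student exercises often require you not only to solve the problem but to solve it with a particular set of technologies. If that is the case, you need to tell us clearly what constraints apply to your solution.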
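
To make the first comment concrete, here is a minimal sketch of what "a sequence of events" means in practice. The class name SaxEventTrace and its trace method are made up for illustration; only the JDK's built-in SAX API is used. The handler records each element start and end in the order the parser reports them, and no tree is ever built.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventTrace {
    // Hypothetical helper: parses the XML string and returns the element
    // events in the order the SAX parser reports them.
    static List<String> trace(String xml) throws Exception {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                events.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        // prints [start:item, start:title, end:title, start:releaseDate, end:releaseDate, end:item]
        System.out.println(trace("<item><title>CoD</title><releaseDate>x</releaseDate></item>"));
    }
}
```

Because the application only ever sees this stream, any "does this item have a releaseDate child" logic has to be tracked by the handler itself, for example with a flag that is reset at each item start.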

Tags: java xml xml-parsing sax saxparser


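【Solution 1】:

Try this code. Don't forget to include a SAX parser library in your project (a standard JDK already ships one in javax.xml.parsers and org.xml.sax) and to remove the rss string from the XML document (hopefully that is acceptable).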

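import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxParserTest {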
    public static void main(String... argv) {
        SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
        try {
            SAXParser saxParser = saxParserFactory.newSAXParser();
            MyHandler handler = new MyHandler();
            saxParser.parse(new File("your path to XML-file here"), handler);
            List<Item> items = handler.getChannel().getItems();
            // your check of item release dates here
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

class MyHandler extends DefaultHandler {
    private StringBuilder data = new StringBuilder();

    private Channel channel;

    private String itemTitle;
    private String itemDescription;
    private String itemReleaseDate;

    private boolean isItem;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        if (!qName.equals("rss")) {
            if (qName.equalsIgnoreCase("channel")) {
                channel = new Channel();
            } else if (qName.equalsIgnoreCase("item")) {
                isItem = true;
            }
            data.setLength(0);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (qName.equalsIgnoreCase("title")) {
            if (!isItem) {
                channel.setTitle(data.toString());
            } else {
                itemTitle = data.toString();
            }
        } else if (qName.equalsIgnoreCase("item")) {
            channel.addItem(new Item(itemTitle, itemDescription, itemReleaseDate));
            itemTitle = null;
            itemDescription = null;
            itemReleaseDate = null;
            isItem = false;
        } else if (qName.equalsIgnoreCase("description")) {
            itemDescription = data.toString();
        } else if (qName.equalsIgnoreCase("releaseDate")) {
            itemReleaseDate = data.toString();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // May be called several times per text node, so append rather than assign.
        data.append(ch, start, length);
    }

    public Channel getChannel() {
        return channel;
    }
}

class Channel {
    private String title;
    private List<Item> items = new ArrayList<>(); // never null, so getItems() is safe for a feed without items

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public List<Item> getItems() {
        return items;
    }

    public void setItems(List<Item> items) {
        this.items = items;
    }

    public void addItem(Item item) {
        if (items == null) {
            items = new ArrayList<Item>();
        }
        items.add(item);
    }
}

class Item {
    private String title;
    private String description;
    private String releaseDate;

    public Item(String title, String description, String releaseDate) {
        this.title = title;
        this.description = description;
        this.releaseDate = releaseDate;
    }
    public String getReleaseDate() {
        return releaseDate;
    }
}
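
The release-date check itself is left as a comment in the code above. One way to do it, sketched here under assumptions, is to skip the Channel and Item objects and let the handler fail the parse as soon as an item closes without a releaseDate child. The names ReleaseDateCheck, ReleaseDateHandler and check are hypothetical, and the sketch assumes the default, non-namespace-aware parser so that qName carries the plain element name.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ReleaseDateCheck {

    // Hypothetical handler: aborts the parse when an <item> closes without a <releaseDate> child.
    static class ReleaseDateHandler extends DefaultHandler {
        private int depth = 0;          // current element nesting depth
        private int itemDepth = -1;     // depth of the open <item>, or -1 when outside one
        private boolean sawReleaseDate; // set when the open <item> has a <releaseDate> child
        private int itemCount = 0;      // number of <item> elements closed so far

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            depth++;
            if (itemDepth < 0 && qName.equals("item")) {
                itemDepth = depth;
                sawReleaseDate = false;
            } else if (itemDepth > 0 && depth == itemDepth + 1 && qName.equals("releaseDate")) {
                sawReleaseDate = true;  // only direct children of <item> count
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            if (depth == itemDepth) {   // the open <item> is closing
                itemCount++;
                if (!sawReleaseDate) {
                    throw new SAXException("<item> #" + itemCount + " has no <releaseDate>");
                }
                itemDepth = -1;
            }
            depth--;
        }
    }

    // Throws SAXException if any <item> lacks a <releaseDate> child.
    static void check(String xml) throws Exception {
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), new ReleaseDateHandler());
    }

    public static void main(String[] args) throws Exception {
        check("<rss><channel><item><title>Game</title><releaseDate>d</releaseDate></item></channel></rss>");
        System.out.println("all items have a releaseDate");
    }
}
```

Throwing from endElement stops the parser at the first offending item, which suits the "throw an exception" requirement in the question. If every offending item needs to be reported, collecting them in a list and checking it after the parse would be the alternative.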

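【Comments】:

    【Solution 2】:

    XPath should work fine and can even give a shorter solution. The expression //channel/item[not(releaseDate)] returns every item node that has no releaseDate child node. So this code should give you the answer: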

        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
    
        Document document = dbf
                .newDocumentBuilder()
                .parse(...);
    
        XPath xpath = XPathFactory
                .newInstance()
                .newXPath();
    
        NodeList list = (NodeList) xpath.evaluate("//channel/item[not(releaseDate)]", document, XPathConstants.NODESET);
        if (list.getLength() != 0) {
            throw new Exception("Found <item> without <releaseDate>");
        }
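
For readers who want to run the snippet above end to end, here is a self-contained sketch. The parse(...) argument is elided in the answer, so this version makes its own assumption and parses an in-memory string; the class name XPathReleaseDateCheck and the method countItemsWithoutReleaseDate are invented for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class XPathReleaseDateCheck {
    // Hypothetical helper: returns how many <item> elements have no <releaseDate> child.
    static int countItemsWithoutReleaseDate(String xml) throws Exception {
        // Assumption: the XML arrives as a string; the original answer leaves the source elided.
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList missing = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//channel/item[not(releaseDate)]", document, XPathConstants.NODESET);
        return missing.getLength();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<rss version=\"2.0\"><channel>"
                + "<item><title>Game</title><releaseDate>d</releaseDate></item>"
                + "<item><title>CoD</title></item>"
                + "</channel></rss>";
        System.out.println(countItemsWithoutReleaseDate(xml)); // prints 1
    }
}
```

Note that this builds a full DOM tree in memory, so it trades the streaming behaviour of SAX for a much shorter check.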
    

    【Comments】:

    • Yes, but this does not use SAX.