Only show values from specific elements for XML in Java: answers

【Question title】: Only show values from specific elements for XML in Java
【Posted】: 2021-10-13 07:56:40
【Question description】:

I am trying to parse the following XML with Java:

<catalog>
    <book id="bk101">
        <author>Gambardella, Matthew</author>
        <title>XML Developer's Guide</title>
        <genre>Computer</genre>
        <price>44.95</price>
        <publish_date>2000-10-01</publish_date>
    </book>
    <book id="bk109">
        <author>Kress, Peter</author>
        <title>Paradox Lost</title>
        <genre>Science Fiction</genre>
        <price>6.95</price>
        <publish_date>2006-11-02</publish_date>
    </book>
    <book id="bk110">
        <author>O'Brien, Tim</author>
        <title>Microsoft .NET: The Programming Bible</title>
        <genre>Computer</genre>
        <price>36.95</price>
        <publish_date>2006-12-09</publish_date>
    </book>
    <book id="bk112">
        <author>Galos, Mike</author>
        <title>Visual Studio 7: A Comprehensive Guide</title>
        <genre>Computer</genre>
        <price>49.95</price>
        <publish_date>2008-04-16</publish_date>
    </book>
</catalog>

However, I need to show only the books whose price is greater than 10 and that were published after 2005. I have something like this:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(new File("books.xml"));
document.getDocumentElement().normalize();
NodeList bookList = document.getElementsByTagName("book");
for(int i = 0; i <bookList.getLength(); i++) {
    Node book1 = bookList.item(i);
    if(book1.getNodeType() == Node.ELEMENT_NODE) {
        Element bookElement = (Element) book1;
        System.out.println("Book " +bookElement.getAttribute("id"));
        System.out.println("Author : " +bookElement.getElementsByTagName("author").item(0).getTextContent());
        //...
    }
}
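
One way to apply the two conditions inside a DOM loop like the one above is to parse `price` as a number and `publish_date` as a date before printing. The following is only a sketch: the class name `BookDomFilter`, the method `filterIds`, and the trimmed-down `SAMPLE` document are assumptions made for illustration, and "after 2005" is read as "publication year greater than 2005".

```java
import java.io.StringReader;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class BookDomFilter {

    // Reduced copy of the catalog: only the fields the filter needs (illustrative).
    static final String SAMPLE = "<catalog>"
            + "<book id=\"bk101\"><price>44.95</price><publish_date>2000-10-01</publish_date></book>"
            + "<book id=\"bk109\"><price>6.95</price><publish_date>2006-11-02</publish_date></book>"
            + "<book id=\"bk110\"><price>36.95</price><publish_date>2006-12-09</publish_date></book>"
            + "<book id=\"bk112\"><price>49.95</price><publish_date>2008-04-16</publish_date></book>"
            + "</catalog>";

    // Returns the ids of books with price > minPrice and publication year > afterYear.
    static List<String> filterIds(String xml, double minPrice, int afterYear) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xml)));
        NodeList books = document.getElementsByTagName("book");
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < books.getLength(); i++) {
            // getElementsByTagName only returns elements, so the cast is safe.
            Element book = (Element) books.item(i);
            double price = Double.parseDouble(text(book, "price"));
            int year = LocalDate.parse(text(book, "publish_date")).getYear();
            if (price > minPrice && year > afterYear) {
                ids.add(book.getAttribute("id"));
            }
        }
        return ids;
    }

    // Text of the first descendant element with the given tag name.
    private static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(filterIds(SAMPLE, 10.0, 2005)); // prints [bk110, bk112]
    }
}
```

Parsing the date with `LocalDate` also rejects malformed dates early, which a plain string comparison would not.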

【Comments】:

I suggest looking into XPath, XQuery, or XSLT; the path would be /catalog/book[price > 10 and number(substring(publish_date, 1, 4)) > 2005].

Tags: java xml parsing

【Solution 1】:

You can work through it step by step:

1 - Take the XML from the question as the example input:

String source =
        "<?xml version=\"1.0\"?>" +
        "<catalog>\n" +
        "    <book id=\"bk101\">\n" +
        "        <author>Gambardella, Matthew</author>\n" +
        "        <title>XML Developer's Guide</title>\n" +
        "        <genre>Computer</genre>\n" +
        "        <price>44.95</price>\n" +
        "        <publish_date>2000-10-01</publish_date>\n" +
        "    </book>\n" +
        "    <book id=\"bk109\">\n" +
        "        <author>Kress, Peter</author>\n" +
        "        <title>Paradox Lost</title>\n" +
        "        <genre>Science Fiction</genre>\n" +
        "        <price>6.95</price>\n" +
        "        <publish_date>2006-11-02</publish_date>\n" +
        "    </book>\n" +
        "    <book id=\"bk110\">\n" +
        "        <author>O'Brien, Tim</author>\n" +
        "        <title>Microsoft .NET: The Programming Bible</title>\n" +
        "        <genre>Computer</genre>\n" +
        "        <price>36.95</price>\n" +
        "        <publish_date>2006-12-09</publish_date>\n" +
        "    </book>\n" +
        "    <book id=\"bk112\">\n" +
        "        <author>Galos, Mike</author>\n" +
        "        <title>Visual Studio 7: A Comprehensive Guide</title>\n" +
        "        <genre>Computer</genre>\n" +
        "        <price>49.95</price>\n" +
        "        <publish_date>2008-04-16</publish_date>\n" +
        "    </book>\n" +
        "</catalog>";

2 - Convert the XML into a Document:

Note: if you are reading from a file instead, you can use documentBuilder.parse(new File("filename.xml")).

DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
InputSource inputSource = new InputSource(new StringReader(source));
Document document = documentBuilder.parse(inputSource);

3 - Add an XPath expression to search the XML document:

Note: the following uses the expression suggested by @Martin Honnen.

XPathFactory xpathFactory = XPathFactory.newInstance();
XPath xpath = xpathFactory.newXPath();
String xpathExpression = "//catalog//book[price > 10 and number(substring(publish_date, 1, 4)) > 2005]";
XPathExpression xPathExpression = xpath.compile(xpathExpression);
NodeList nodes = (NodeList) xPathExpression.evaluate(document, XPathConstants.NODESET);
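
If the thresholds should not be hard-coded in the expression string, the standard `javax.xml.xpath` API lets the expression refer to variables that are supplied through an `XPathVariableResolver`. The sketch below assumes the variable names `$minPrice` and `$afterYear` and the class name `ParameterizedBookQuery`; none of these come from the answer itself.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ParameterizedBookQuery {

    // Same predicate as above, with the two limits replaced by XPath variables.
    private static final String EXPRESSION =
            "/catalog/book[price > $minPrice"
            + " and number(substring(publish_date, 1, 4)) > $afterYear]";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    static NodeList select(Document document, double minPrice, int afterYear) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The resolver must be set before the expression is evaluated.
        xpath.setXPathVariableResolver(name -> {
            switch (name.getLocalPart()) {
                case "minPrice":  return Double.valueOf(minPrice);
                case "afterYear": return Double.valueOf(afterYear);
                default:          return null; // unknown variable
            }
        });
        return (NodeList) xpath.evaluate(EXPRESSION, document, XPathConstants.NODESET);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<catalog>"
                + "<book id=\"bk109\"><price>6.95</price><publish_date>2006-11-02</publish_date></book>"
                + "<book id=\"bk110\"><price>36.95</price><publish_date>2006-12-09</publish_date></book>"
                + "</catalog>";
        NodeList matches = select(parse(xml), 10.0, 2005);
        for (int i = 0; i < matches.getLength(); i++) {
            System.out.println(((Element) matches.item(i)).getAttribute("id"));
        }
    }
}
```

Passing the limits as variables also avoids building the expression by string concatenation when the values come from user input.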

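4 - Extract the information you want from the filtered books:

Note: the loop walks over every node in the result, and nodes.item(i).getNodeType() == Node.ELEMENT_NODE filters out anything that is not an element, such as text nodes. If the XML contains nothing else, what remains are the book elements.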

for (int i = 0; i < nodes.getLength(); i++) {
    if (nodes.item(i).getNodeType() == Node.ELEMENT_NODE) {
        Element element = (Element) nodes.item(i);
        String author = element.getElementsByTagName("author")
                .item(0).getTextContent();
        String title = element.getElementsByTagName("title")
                .item(0).getTextContent();
        String genre = element.getElementsByTagName("genre")
                .item(0).getTextContent();
        String price = element.getElementsByTagName("price")
                .item(0).getTextContent();
        String publish_date = element.getElementsByTagName("publish_date")
                .item(0).getTextContent();
        System.out.println(String.format(
                "[author=%s, title=%s, genre=%s, price=%s, publish_date=%s]",
                author, title, genre, price, publish_date));
    }
}
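
The `ELEMENT_NODE` check matters most when iterating over `getChildNodes()`, because the whitespace between child elements shows up as text nodes. As a rough illustration, the hypothetical helper `fieldsOf` below (not part of the original answer) collects every child element of a book into a map without naming each tag:

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class BookFields {

    // Maps each child element's tag name to its text, in document order.
    static Map<String, String> fieldsOf(Element book) {
        Map<String, String> fields = new LinkedHashMap<>();
        NodeList children = book.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip the whitespace-only text nodes between the child elements.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                fields.put(child.getNodeName(), child.getTextContent().trim());
            }
        }
        return fields;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<book id=\"bk110\">\n"
                + "    <author>O'Brien, Tim</author>\n"
                + "    <price>36.95</price>\n"
                + "</book>";
        Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        System.out.println(fieldsOf(document.getDocumentElement()));
        // prints {author=O'Brien, Tim, price=36.95}
    }
}
```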

5 - The output looks like this:

[author=O'Brien, Tim, title=Microsoft .NET: The Programming Bible, genre=Computer, price=36.95, publish_date=2006-12-09]
[author=Galos, Mike, title=Visual Studio 7: A Comprehensive Guide, genre=Computer, price=49.95, publish_date=2008-04-16]

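Working code is here: https://ideone.com/mLPwrf

【Discussion】:

【Solution 2】:

You can invoke an XSLT processor from Java, either the built-in XSLT 1.0 processor or Saxon, which provides XSLT 3.0. In XSLT 3.0, a stylesheet that does this is (untested):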

<xsl:transform 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  version="3.0" expand-text="yes">
  <xsl:output method="text"/>
  <xsl:mode on-no-match="shallow-skip"/>
  <xsl:variable name="NL" select="'&#xa;'"/>

  <xsl:template match="/">
    <xsl:apply-templates 
         select="//book[price > 10 and year-from-date(xs:date(publish_date)) > 2005]"/>
  </xsl:template>

  <xsl:template match="book">
    <xsl:text>Book: {@id}{$NL}</xsl:text>
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="title">  Title: {.}{$NL}</xsl:template>

  <xsl:template match="author">  Author: {.}{$NL}</xsl:template>

</xsl:transform>
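
To show what calling a processor from Java can look like, here is a minimal sketch that uses the JDK's built-in `javax.xml.transform` API. Because the built-in processor only supports XSLT 1.0, the embedded stylesheet is an XSLT 1.0 rewrite of the same idea, not the 3.0 stylesheet above (which would need Saxon on the classpath). The class name `XsltBookFilter`, the `SAMPLE` document, and the exact output layout are assumptions for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XsltBookFilter {

    // XSLT 1.0 version of the filter: year is taken from the first four characters.
    private static final String STYLESHEET =
            "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"text\"/>"
            + "<xsl:template match=\"/\">"
            + "<xsl:for-each select=\"/catalog/book[price > 10"
            + " and number(substring(publish_date, 1, 4)) > 2005]\">"
            + "<xsl:value-of select=\"concat('Book: ', @id, '&#10;')\"/>"
            + "<xsl:value-of select=\"concat('  Title: ', title, '&#10;')\"/>"
            + "<xsl:value-of select=\"concat('  Author: ', author, '&#10;')\"/>"
            + "</xsl:for-each>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    // Reduced copy of the catalog used only for the demo run.
    static final String SAMPLE = "<catalog>"
            + "<book id=\"bk101\"><author>Gambardella, Matthew</author>"
            + "<title>XML Developer's Guide</title>"
            + "<price>44.95</price><publish_date>2000-10-01</publish_date></book>"
            + "<book id=\"bk109\"><author>Kress, Peter</author>"
            + "<title>Paradox Lost</title>"
            + "<price>6.95</price><publish_date>2006-11-02</publish_date></book>"
            + "<book id=\"bk110\"><author>O'Brien, Tim</author>"
            + "<title>Microsoft .NET: The Programming Bible</title>"
            + "<price>36.95</price><publish_date>2006-12-09</publish_date></book>"
            + "<book id=\"bk112\"><author>Galos, Mike</author>"
            + "<title>Visual Studio 7: A Comprehensive Guide</title>"
            + "<price>49.95</price><publish_date>2008-04-16</publish_date></book>"
            + "</catalog>";

    // Applies the stylesheet to the given XML and returns the text output.
    static String transform(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(transform(SAMPLE));
    }
}
```

For a real file, `new StreamSource(new File("books.xml"))` can replace the string-based source.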

【Discussion】: