XML XPath Parsing in Java: Answers

【Question title】: XML XPath Parsing in Java
【Posted】: 2013-03-28 17:07:17
【Question】:

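Below is the standard code for parsing XML with XPath in Java. I can't figure out why I am getting null values. I have attached the Java file, the XML file, and the output. I would appreciate it if someone could explain where I am going wrong. Thanks in advance! :)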

XPathParser.java

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class XPathParser {
    public static void main(String args[]) throws Exception {
        //loading the XML document from a file
        DocumentBuilderFactory builderfactory = DocumentBuilderFactory.newInstance();
        builderfactory.setNamespaceAware(true);

        //XML read
        DocumentBuilder builder = builderfactory.newDocumentBuilder();
        Document xmlDocument = builder.parse("Stocks.xml");

        // Creates a XPath factory
        XPathFactory factory = javax.xml.xpath.XPathFactory.newInstance();

        //Creates a XPath Object
        XPath xPath = factory.newXPath();

        //Compiles the XPath expression
        //XPathExpression xPathExpression_count = xPath.compile("count(//stock)");
        XPathExpression xPathExpression = xPath.compile("//stock");

        //Run the query and get a nodeset
        Object result = xPathExpression.evaluate(xmlDocument,XPathConstants.NODESET);

        //Cast the result into a DOM nodelist
        NodeList nodes = (NodeList) result;
        System.out.println(nodes.getLength());
        System.out.println(nodes.item(0));
        for (int i=0; i<nodes.getLength();i++){
          System.out.println(nodes.item(i).getNodeValue());
        }
    }
}

Stocks.xml

<?xml version="1.0" encoding="UTF-8"?>
<stocks>
       <stock>
              <symbol>ABC</symbol>
              <price>10</price>
              <quantity>50</quantity>
       </stock>
       <stock>
              <symbol>XYZ</symbol>
              <price>20</price>
              <quantity>1000</quantity>
       </stock>
</stocks>

Output:

2
[stock: null]
null
null

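【Comments】:

What are you expecting? Your XPath selects all stock nodes, and then you call getNodeValue on them, which returns null. What are you trying to get?
Try printing nodes.item(i).getTextContent() instead of nodes.item(i).getNodeValue(). Or change your XPath expression to //stock/text()

Tags: java xpath xml-parsing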

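【Answer 1】:

You are calling getNodeValue on the stock nodes. That does not make sense: they are element (parent) nodes, and getNodeValue returns null for elements because an element has no value of its own.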
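To see where the null comes from, here is a minimal sketch that compares getNodeValue() and getTextContent() on an element and on its text child. The class name NodeValueDemo, the helper parseRoot, and the one-element XML string are made up for illustration and are not part of the question's code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical demo class; the XML is a one-element stand-in for Stocks.xml.
public class NodeValueDemo {
    // Parses a small XML string and returns its root element.
    static Node parseRoot(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Node symbol = parseRoot("<symbol>ABC</symbol>");
        // Element nodes have no node value, so this prints "null".
        System.out.println(symbol.getNodeValue());
        // getTextContent() concatenates the text of all descendants: "ABC".
        System.out.println(symbol.getTextContent());
        // The value lives on the text node child: "ABC".
        System.out.println(symbol.getFirstChild().getNodeValue());
    }
}
```

The same applies to the stock elements in the question: the values sit on text nodes further down the tree, not on the elements themselves.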

You can iterate over the child nodes of each stock and read the information from them:

// xml is assumed to be a String holding the contents of Stocks.xml
final Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(new ByteArrayInputStream(xml.getBytes()));
final XPathExpression expression = XPathFactory.newInstance().newXPath().compile("//stock");
final NodeList nodeList = (NodeList) expression.evaluate(document, XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); ++i) {
    final NodeList childList = ((Element) nodeList.item(i)).getChildNodes();
    for (int j = 0; j < childList.getLength(); ++j) {
        final Node node = childList.item(j);
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            System.out.println(node.getNodeName() + "=" + node.getTextContent());
        }
    }
}

Output:

symbol=ABC
price=10
quantity=50
symbol=XYZ
price=20
quantity=1000

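Note that you must filter the child nodes by type. Otherwise you will loop over a mix of child elements and the whitespace between them, which shows up in the DOM as text nodes. This is a common pitfall when traversing XML this way.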
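The effect is easy to check. The following sketch counts the child nodes of a single indented stock element by type; the class name ChildNodeTypesDemo and the helper countChildTypes are hypothetical, and the XML string simply mirrors one entry from Stocks.xml.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical demo class; the XML mirrors one indented <stock> from Stocks.xml.
public class ChildNodeTypesDemo {
    static final String XML =
            "<stock>\n"
          + "  <symbol>ABC</symbol>\n"
          + "  <price>10</price>\n"
          + "  <quantity>50</quantity>\n"
          + "</stock>";

    // Returns {number of element children, number of text children} of the root.
    static int[] countChildTypes(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList children = doc.getDocumentElement().getChildNodes();
        int elements = 0;
        int texts = 0;
        for (int i = 0; i < children.getLength(); i++) {
            short type = children.item(i).getNodeType();
            if (type == Node.ELEMENT_NODE) {
                elements++;
            } else if (type == Node.TEXT_NODE) {
                texts++; // here: the line breaks and indentation between elements
            }
        }
        return new int[] {elements, texts};
    }

    public static void main(String[] args) throws Exception {
        int[] counts = countChildTypes(XML);
        // Prints "elements=3, text=4" for the XML above.
        System.out.println("elements=" + counts[0] + ", text=" + counts[1]);
    }
}
```

Without the ELEMENT_NODE check, the loop in the answer would also print a line for each of those four whitespace-only text nodes.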

You can also select the text nodes of all of stock's child elements directly:

final Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(new ByteArrayInputStream(xml.getBytes()));
final XPathExpression expression = XPathFactory.newInstance().newXPath().compile("//stock/*/text()");
final NodeList nodeList = (NodeList) expression.evaluate(document, XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); ++i) {
    final Node node = nodeList.item(i);
    System.out.println(node.getNodeValue());
}

Output:

ABC
10
50
XYZ
20
1000

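In this case you loop over all the text nodes of stock's child elements, which means you lose the element names. You can reproduce the output of the first approach by selecting all child elements of stock with //stock/* instead: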

final Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(new ByteArrayInputStream(xml.getBytes()));
final XPathExpression expression = XPathFactory.newInstance().newXPath().compile("//stock/*");
final NodeList nodeList = (NodeList) expression.evaluate(document, XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); ++i) {
    final Node node = nodeList.item(i);
    System.out.println(node.getNodeName() + "=" + node.getTextContent());
}

Output:

symbol=ABC
price=10
quantity=50
symbol=XYZ
price=20
quantity=1000

If you want something more specific, you can also select a particular child element in the XPath:

final Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(new ByteArrayInputStream(xml.getBytes()));
final XPathExpression expression = XPathFactory.newInstance().newXPath().compile("//stock/symbol/text()");
final NodeList nodeList = (NodeList) expression.evaluate(document, XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); ++i) {
    final Node node = nodeList.item(i);
    System.out.println(node.getNodeValue());
}

Output:

ABC
XYZ
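
As a further variation that is not part of the original answer, the XPath object can also evaluate an expression relative to a node and return a string directly, which avoids handling text nodes at all. A minimal sketch, assuming the same stock structure; the class name StockFieldsDemo and the helper describeStocks are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical demo class; the XML is a compact copy of Stocks.xml.
public class StockFieldsDemo {
    static final String XML =
            "<stocks>"
          + "<stock><symbol>ABC</symbol><price>10</price><quantity>50</quantity></stock>"
          + "<stock><symbol>XYZ</symbol><price>20</price><quantity>1000</quantity></stock>"
          + "</stocks>";

    // Returns one "symbol @ price" line per <stock> element.
    static List<String> describeStocks(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        NodeList stocks = (NodeList) xPath.evaluate("//stock", doc, XPathConstants.NODESET);

        List<String> lines = new ArrayList<>();
        for (int i = 0; i < stocks.getLength(); i++) {
            Node stock = stocks.item(i);
            // Relative paths are resolved against the current <stock> node;
            // the two-argument evaluate returns the result as a String.
            String symbol = xPath.evaluate("symbol", stock);
            String price = xPath.evaluate("price", stock);
            lines.add(symbol + " @ " + price);
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        // Prints "ABC @ 10" and then "XYZ @ 20".
        for (String line : describeStocks(XML)) {
            System.out.println(line);
        }
    }
}
```

This keeps each stock's fields together, at the cost of one XPath evaluation per field.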

【Comments】: