XPath Parsing of Namespace Declarations: Answers

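【Question title】: XPath Parsing of Namespace Declarations
【Posted】: 2012-09-08 17:15:40
【Question description】:

I'm using XPath to extract only the values of the URL128 XML elements. There may be many of them, even though the example below has just one. When I include xmlns='http://c1.net.corbis.com/' on the SearchResponse element, I get an empty NodeList, but when I remove that namespace declaration it works fine. Is there some configuration I'm missing?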

String xmlData = "<SearchResponse xmlns='http://c1.net.corbis.com/'><searchResultDataXML><SearchResultData><SearchRequestUID Scope='Public' Type='Guid' Value='{cded773c-c4b7-4dd8-aaee-8e5b8b7a2475}'/><StartPosition Scope='Public' Type='Long' Value='1'/><EndPosition Scope='Public' Type='Long' Value='50'/><TotalHits Scope='Public' Type='Long' Value='323636'/></SearchResultData></searchResultDataXML><imagesXML><Images><Image><ImageUID Scope='Public' Type='Guid' Value='{a6f6d3e2-2c3f-4502-9741-eae2e1bb573a}'/><CorbisID Scope='Public' Type='String' Value='42-25763849'/><Title Scope='Public' Type='String' Value='Animals figurines'/><CreditLine Scope='Public' Type='String' Value='© Ocean/Corbis'/><IsRoyaltyFree Scope='Public' Type='Boolean' Value='True'/><AspectRatio Scope='Public' Type='String' Value='0.666667'/><URL128 Scope='Public' Type='String' Value='http://cachens.corbis.com/CorbisImage/thumb/25/76/38/25763849/42-25763849.jpg'/></Image></Images></imagesXML></SearchResponse>";
InputSource source = new InputSource(new StringReader(xmlData));

XPath xPath = XPathFactory.newInstance().newXPath();
NodeList list = null;
try {
    list = (NodeList) xPath.evaluate("//URL128/@Value", source, XPathConstants.NODESET);
} catch (Exception ex) {
    System.out.println(ex.getMessage());
}
for (int i = 0; i < list.getLength(); i++) {
    System.out.println(list.item(i).getTextContent());
}

【Question discussion】:

Tags: java xpath

【Solution 1】:

Long story short, you need to supply a NamespaceContext to your XPath:

final XPath xPath = XPathFactory.newInstance().newXPath();
xPath.setNamespaceContext(new NamespaceContext() {
    @Override
    public Iterator<String> getPrefixes(final String namespaceURI) {
        return null;
    }
    @Override
    public String getPrefix(final String namespaceURI) {
        return null;
    }
    @Override
    public String getNamespaceURI(final String prefix) {
        return "http://c1.net.corbis.com/";
    }
});
final NodeList list = (NodeList) xPath.evaluate("//c:URL128/@Value", source, XPathConstants.NODESET);
for (int i = 0; i < list.getLength(); i++) {
    System.out.println(list.item(i).getTextContent());
}

In this case, the only method XPath appears to require us to implement is getNamespaceURI(String prefix).

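Note that the actual prefix in "c:URL128" doesn't really matter in this case, because the NamespaceContext above returns the same URI for every prefix it is asked about; you could just as easily use ":URL128". Once your XML really does contain multiple namespaces, it becomes important to discriminate between them (using either a Map or, if there are relatively few elements, a series of if-then-else checks).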
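
The Map-based variant mentioned above could look roughly like the following. This is only a minimal sketch: the class name MapNamespaceContext, its bind and values helpers, and the urn:example:a / urn:example:b namespaces are hypothetical and are not part of the original answer. Unlike a context that returns one URI for everything, it resolves unbound prefixes to "no namespace".

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: a NamespaceContext backed by a prefix -> URI map.
final class MapNamespaceContext implements NamespaceContext {
    private final Map<String, String> uriByPrefix = new HashMap<>();

    // Registers a prefix for use in XPath expressions; returns this for chaining.
    MapNamespaceContext bind(String prefix, String namespaceUri) {
        uriByPrefix.put(prefix, namespaceUri);
        return this;
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("prefix must not be null");
        }
        // Prefixes that were never bound resolve to "no namespace".
        String uri = uriByPrefix.get(prefix);
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String namespaceUri) {
        Iterator<String> it = getPrefixes(namespaceUri);
        return it.hasNext() ? it.next() : null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceUri) {
        List<String> prefixes = new ArrayList<>();
        for (Map.Entry<String, String> entry : uriByPrefix.entrySet()) {
            if (entry.getValue().equals(namespaceUri)) {
                prefixes.add(entry.getKey());
            }
        }
        return prefixes.iterator();
    }

    // Evaluates an expression against XML text and returns the matched node values.
    static List<String> values(String xml, String expression, NamespaceContext ctx) {
        XPath xPath = XPathFactory.newInstance().newXPath();
        xPath.setNamespaceContext(ctx);
        try {
            NodeList nodes = (NodeList) xPath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getNodeValue());
            }
            return result;
        } catch (XPathExpressionException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // Made-up document with two namespaces, purely for illustration.
        String xml = "<a:root xmlns:a='urn:example:a' xmlns:b='urn:example:b'>"
                + "<a:item Value='1'/><b:item Value='2'/></a:root>";
        NamespaceContext ctx = new MapNamespaceContext()
                .bind("x", "urn:example:a")
                .bind("y", "urn:example:b");
        System.out.println(values(xml, "//y:item/@Value", ctx));
    }
}
```

The prefixes used in the expression ("x", "y") only need to match the bindings in the context; they do not have to match the prefixes written in the XML itself.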

If you can't or don't want to hard-code the prefixes, you can extract them from the XML document yourself, but that requires more code...
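
One possible way to do that is sketched below, under some assumptions: the document is parsed into a namespace-aware DOM first, the root element's namespace is bound to a prefix of your choosing, and any other prefix is looked up among the declarations in scope on the root element. The class name DocumentNamespaceContext, its helpers, the prefix "d", and the shortened sample XML are all hypothetical. Declarations that appear only on nested elements would not be found this way.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: resolves prefixes from the parsed document instead of hard-coded URIs.
final class DocumentNamespaceContext implements NamespaceContext {
    private final Element root;
    private final String rootPrefix;

    // rootPrefix is the prefix the XPath expression uses for the root element's namespace.
    DocumentNamespaceContext(Document doc, String rootPrefix) {
        this.root = doc.getDocumentElement();
        this.rootPrefix = rootPrefix;
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("prefix must not be null");
        }
        String uri = prefix.equals(rootPrefix)
                ? root.getNamespaceURI()            // namespace of the root element itself
                : root.lookupNamespaceURI(prefix);  // declarations in scope on the root
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String namespaceUri) {
        if (namespaceUri != null && namespaceUri.equals(root.getNamespaceURI())) {
            return rootPrefix;
        }
        return root.lookupPrefix(namespaceUri);
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceUri) {
        String prefix = getPrefix(namespaceUri);
        return prefix == null
                ? Collections.<String>emptyIterator()
                : Collections.singletonList(prefix).iterator();
    }

    // Parses XML text into a namespace-aware DOM document.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    // Evaluates an expression against the document and returns the matched node values.
    static List<String> values(Document doc, String expression, String rootPrefix) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(new DocumentNamespaceContext(doc, rootPrefix));
            NodeList nodes = (NodeList) xPath.evaluate(expression, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getNodeValue());
            }
            return result;
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // Shortened stand-in for the question's XML; the URL value is a placeholder.
        String xml = "<SearchResponse xmlns='http://c1.net.corbis.com/'><Image>"
                + "<URL128 Value='http://example.invalid/a.jpg'/></Image></SearchResponse>";
        System.out.println(values(parse(xml), "//d:URL128/@Value", "d"));
    }
}
```

With this approach the namespace URI never appears in the Java source; only the arbitrary prefix "d" does.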

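For more details, see also this blog post.

【Discussion】:

【Solution 2】:

There is a slightly simpler solution that doesn't involve putting hard-coded URI references in your code: parse the document yourself with the factory's namespace-aware property set to false, then evaluate the XPath against the resulting Document.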

String xmlData = "<SearchResponse xmlns='http://c1.net.corbis.com/'><searchResultDataXML><SearchResultData><SearchRequestUID Scope='Public' Type='Guid' Value='{cded773c-c4b7-4dd8-aaee-8e5b8b7a2475}'/><StartPosition Scope='Public' Type='Long' Value='1'/><EndPosition Scope='Public' Type='Long' Value='50'/><TotalHits Scope='Public' Type='Long' Value='323636'/></SearchResultData></searchResultDataXML><imagesXML><Images><Image><ImageUID Scope='Public' Type='Guid' Value='{a6f6d3e2-2c3f-4502-9741-eae2e1bb573a}'/><CorbisID Scope='Public' Type='String' Value='42-25763849'/><Title Scope='Public' Type='String' Value='Animals figurines'/><CreditLine Scope='Public' Type='String' Value='© Ocean/Corbis'/><IsRoyaltyFree Scope='Public' Type='Boolean' Value='True'/><AspectRatio Scope='Public' Type='String' Value='0.666667'/><URL128 Scope='Public' Type='String' Value='http://cachens.corbis.com/CorbisImage/thumb/25/76/38/25763849/42-25763849.jpg'/></Image></Images></imagesXML></SearchResponse>";
InputSource source = new InputSource(new StringReader(xmlData));

// create doc instance instead of passing source straight to XPath...
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(false); // must be false
DocumentBuilder builder = factory.newDocumentBuilder();
final Document doc = builder.parse(source);

XPath xPath = XPathFactory.newInstance().newXPath();

// use doc instead
NodeList list = (NodeList) xPath.evaluate("//URL128/@Value", doc, 
        XPathConstants.NODESET);

for (int i = 0; i < list.getLength(); i++) {
    System.out.println(list.item(i).getTextContent());
}

【Discussion】:

【Solution 3】:

Here is how to implement the two approaches that AlistairIsreal outlined:

If you use Spring, you can rely on the org.springframework.util.xml.SimpleNamespaceContext class.

// unescaped: the XML string to query (not defined in this snippet)
InputSource source = new InputSource(new StringReader(unescaped));

XPath xPath = XPathFactory.newInstance().newXPath();
try {
    // Bind an arbitrary prefix ("ns") to the document's default namespace URI.
    SimpleNamespaceContext nsCtx = new SimpleNamespaceContext();
    nsCtx.bindNamespaceUri("ns", "http://c1.net.corbis.com/");
    xPath.setNamespaceContext(nsCtx);
    NodeList list = (NodeList) xPath.evaluate("//ns:URL128/@Value", source, XPathConstants.NODESET);
    // Iterate inside the try block so a failed evaluation cannot leave list null.
    for (int i = 0; i < list.getLength(); i++) {
        System.out.println(list.item(i).getTextContent());
    }
} catch (Exception ex) {
    System.out.println(ex.getMessage());
}

【Discussion】: