【Posted】: 2016-03-14 08:39:57
【Question】:
My XML file looks like this:
<?xml version="1.0"?>
<all>
<test1>
hajarrrr rrr rr
</test1>
<catalog>
<book id="bk101">
<author>
<infos>Empire Burlesque</infos>
</author>
<title>XML Developer's Guide</title>
<genre>Computer</genre>
<price>44.95</price>
<publish_date>2000-10-01</publish_date>
<description>An in-depth look at creating applications
with XML.</description>
</book>
<book id="bk102">
<author>
<infos>Emhhsjshhh</infos>
</author>
<title>Midnight Rain</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2000-12-16</publish_date>
<description>A former architect battles corporate zombies,
an evil sorceress, and her own childhood to become queen
of the world.</description>
</book>
</catalog>
</all>
I only want to extract the block between <catalog> and </catalog>, so I wrote this Java code:
public static void main(String[] args) throws XPathExpressionException, SAXException, IOException, ParserConfigurationException, TransformerException {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document doc = builder.parse("C:\\Users\\HC\\Desktop\\dataset\\book.xml");
    XPathFactory xPathfactory = XPathFactory.newInstance();
    XPath xpath = xPathfactory.newXPath();
    //TODO: Is this correct query?
    XPathExpression xpathExp = xpath.compile("//text()[normalize-space(.) = '']");
    NodeList emptyTextNodes = (NodeList) xpathExp.evaluate(doc, XPathConstants.NODESET);
    for (int i = 0; i < emptyTextNodes.getLength(); i++) {
        Node emptyTextNode = emptyTextNodes.item(i);
        emptyTextNode.getParentNode().removeChild(emptyTextNode);
    }
    XPathExpression expr = xpath.compile("/all/catalog/descendant::node()");
    NodeList nl = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
    for (int i = 0; i < nl.getLength(); i++) {
        Node n = nl.item(i);
        System.out.println(n.getNodeName() + " " + n.getNodeValue());
    }
}
I want the result as XML, but instead I get this:
book null
author null
infos null
#text Empire Burlesque
title null
#text XML Developer's Guide
genre null
#text Computer
price null
#text 44.95
publish_date null
#text 2000-10-01
description null
#text An in-depth look at creating applications
with XML.
book null
author null
infos null
#text Emhhsjshhh
title null
#text Midnight Rain
genre null
#text Fantasy
price null
#text 5.95
publish_date null
#text 2000-12-16
description null
#text A former architect battles corporate zombies,
an evil sorceress, and her own childhood to become queen
of the world.
What I actually want is this:
<book id="bk101">
<author>
<infos>Empire Burlesque</infos>
</author>
<title>XML Developer's Guide</title>
<genre>Computer</genre>
<price>44.95</price>
<publish_date>2000-10-01</publish_date>
<description>An in-depth look at creating applications
with XML.</description>
</book>
<book id="bk102">
<author>
<infos>Emhhsjshhh</infos>
</author>
<title>Midnight Rain</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2000-12-16</publish_date>
<description>A former architect battles corporate zombies,
an evil sorceress, and her own childhood to become queen
of the world.</description>
</book>
Can anyone help, please? :) Thanks in advance.
【Comments】:
-
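You tagged your question with JDOM, but this is not JDOM, it is plain DOM. JDOM would make the whole thing look quite different, and probably more natural. Do you have to use DOM, or could you use JDOM? See the DOM tutorial and the JDOM tutorial.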
-
It is DOM :D Sorry!
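
Since the thread settles on plain DOM, one possible approach is sketched here: select only the element children of <catalog> (rather than every descendant node) and serialize each one with an identity Transformer from javax.xml.transform, instead of printing getNodeName() and getNodeValue(). This is a minimal sketch, not a definitive answer. The class name CatalogExtractor and the helper nodesToXml are hypothetical names introduced for illustration, and the XML is a shortened inline copy of the file from the question so that the example runs without the file on disk.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CatalogExtractor {

    // Hypothetical helper (not from the question): serializes every node
    // matched by xpathExpr back to XML text, one node per line.
    static String nodesToXml(Document doc, String xpathExpr) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(xpathExpr, doc, XPathConstants.NODESET);

        // A Transformer created without a stylesheet copies its input unchanged.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Suppress the <?xml ...?> declaration in front of each fragment.
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        for (int i = 0; i < nodes.getLength(); i++) {
            // Each matched element is written together with its whole subtree.
            transformer.transform(new DOMSource(nodes.item(i)), new StreamResult(out));
            out.write(System.lineSeparator());
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Shortened stand-in for book.xml from the question (an assumption).
        String xml = "<all><test1>hajarrrr rrr rr</test1><catalog>"
                + "<book id=\"bk101\"><title>XML Developer's Guide</title></book>"
                + "<book id=\"bk102\"><title>Midnight Rain</title></book>"
                + "</catalog></all>";

        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // "/all/catalog/*" matches only the <book> elements directly under <catalog>.
        System.out.print(nodesToXml(doc, "/all/catalog/*"));
    }
}
```

With the real file, the document could be loaded with builder.parse(...) on the path from the question, and the whitespace-stripping loop could be kept before calling the helper. The key difference from the original code is the XPath: descendant::node() returns every element and text node separately, while /all/catalog/* returns only the two <book> elements, each of which the Transformer writes out with its children.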