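[Posted]: 2011-01-01 21:43:37
[Question]:
When I try to parse an XML file, it sometimes gives me an empty element next to the title.
I think it has something to do with the HTML tag '
How can I solve this?
I have the following XML file: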
<item>
<title>' Nieuwe DVD '</title>
<description>tekst, tekst tekst</description>
<link>dvd.html</link>
<category>nieuws</category>
<pubDate>Sat, 1 Jan 2011 9:24:00 +0000</pubDate>
</item>
And the following code to parse the XML file:
// DocumentBuilderFactory and DocumentBuilder are used for XML parsing
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
// parse the XML data with db (DocumentBuilder) and get the root Element
Document document = db.parse(is);
Element element = document.getDocumentElement();
// collect the <item> nodes of the RSS feed into a NodeList
element.normalize();
NodeList nodeList = element.getElementsByTagName("item");
if (nodeList.getLength() > 0)
{
    for (int i = 0; i < nodeList.getLength(); i++)
    {
        // take each entry (corresponds to an <item></item> element in the XML data)
        Element entry = (Element) nodeList.item(i);
        entry.normalize();
        Element _titleE = (Element) entry.getElementsByTagName("title").item(0);
        Element _categoryE = (Element) entry.getElementsByTagName("category").item(0);
        Element _pubDateE = (Element) entry.getElementsByTagName("pubDate").item(0);
        Element _linkE = (Element) entry.getElementsByTagName("link").item(0);
        // read only the first child node of each element
        String _title = _titleE.getFirstChild().getNodeValue();
        String _category = _categoryE.getFirstChild().getNodeValue();
        Date _pubDate = new Date(_pubDateE.getFirstChild().getNodeValue());
        String _link = _linkE.getFirstChild().getNodeValue();
        // create an RssItem object and add it to the ArrayList
        RssItem rssItem = new RssItem(_title, _category, _pubDate, _link);
        rssItems.add(rssItem);
        conn.disconnect();
    }
}
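One possible explanation, offered as an assumption rather than a confirmed diagnosis: if the parser delivers the title text as several child nodes instead of one, getFirstChild().getNodeValue() returns only the first piece. The sketch below builds such a split title by hand (the class TitleTextSketch and its method names are hypothetical, and the three-way split is an assumption about what the parser in the question might produce) and compares the first-child approach with getTextContent(), which concatenates the text of all child nodes.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class, not part of the original code.
class TitleTextSketch {

    // Approach from the question: reads only the first child node.
    static String firstChildValue(Element element) {
        return element.getFirstChild().getNodeValue();
    }

    // Alternative: concatenates the text of every child node.
    static String fullText(Element element) {
        return element.getTextContent();
    }

    // Builds a <title> element whose text is spread over three text nodes.
    // ASSUMPTION: this split imitates what some parsers may do around the
    // quote characters; it is constructed by hand, not produced by parsing.
    static Element buildSplitTitle() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element title = doc.createElement("title");
            title.appendChild(doc.createTextNode("'"));
            title.appendChild(doc.createTextNode(" Nieuwe DVD "));
            title.appendChild(doc.createTextNode("'"));
            return title;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element title = buildSplitTitle();
        System.out.println("[" + firstChildValue(title) + "]"); // prints [']
        System.out.println("[" + fullText(title) + "]");        // prints [' Nieuwe DVD ']
    }
}
```

If the title in the real feed does arrive in pieces, replacing _titleE.getFirstChild().getNodeValue() with _titleE.getTextContent() might be enough, provided the DOM implementation in use supports DOM Level 3.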
[Comments]:
-
Using your code and data works fine for me.
-
Hmm.. strange... do you get ' Nieuwe DVD ' in the String _title?
-
Which JAXP implementation are you using?
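To answer that, one way to find out which JAXP implementation is actually loaded is to print the concrete class names behind the factory and the builder. This is only a sketch; the class JaxpImplementationSketch and its method names are hypothetical, and the output depends on the runtime, so no expected output is shown.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

// Hypothetical helper class, not part of the original code.
class JaxpImplementationSketch {

    // Name of the concrete DocumentBuilderFactory class chosen at runtime.
    static String factoryClassName() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    // Name of the concrete DocumentBuilder class created by that factory.
    static String builderClassName() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().getClass().getName();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Factory: " + factoryClassName());
        System.out.println("Builder: " + builderClassName());
    }
}
```

Running this in the same environment as the failing code should show whether the parser differs from the one used by people who cannot reproduce the problem.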
Tags: java xml xml-parsing