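【Posted】: 2015-05-25 02:53:42
【Question】:
I am trying to parse a relatively complex (for me, anyway!) XML file. I have posted in a similar thread before and learned something from it. This file, however, is giving me problems. An excerpt of my XML file: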
<?xml version="1.0" ?>
<record number="1" type="custID" first-time="Wed Feb 4 19:22:57 2014" last-time="Fri Feb 7 10:11:02 2015">
    <Customer name="Bob Janotior" custID="4466851">
        <type>Monthly</type>
        <max-books>5</max-books>
        <rental status="false">overdue</rental>
    </Customer>
    <book title="All The Things" type="fiction" author="Jill Taylor" pubID="7744jh566lp">
        <cover>softback</cover>
        <pub>Penguin</pub>
    </book>
    <book title="Mellow Tides of War" type="non-fiction" author="Prof. Lambert et al" pubID="7744gd556se">
        <cover>hardback</cover>
        <pub>Penguin</pub>
    </book>
</record>
<record number="2" type="custID" first-time="Wed Apr 8 15:23:54 2012" last-time="Fri Feb 7 10:11:02 2015">
    <Customer name="Jayne Wrikcek" custID="4466787">
        <type>Monthly</type>
        <max-books>5</max-books>
        <rental status="false">overdue</rental>
    </Customer>
    <book title="Kiss Me Hardy" type="fiction" author="AR Jones" pubID="766485gf66ki">
        <cover>softback</cover>
        <pub>/Kingsoft</pub>
    </book>
    <book title="Oskar Came Again" type="fiction" author="Johnathan Huphries" pubID="a5555qwd2">
        <cover>hardback</cover>
        <pub>Lofthouse</pub>
    </book>
</record>
Previously I was using this script, which I wrote in Python 2.7:
from xml.dom.minidom import parse
import xml.dom.minidom
import csv

def writeToCSV(myLibrary):
    with open('output.csv', 'wb') as csvfile:
        writer = csv.writer(csvfile, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
        writer.writerow(['title', 'author', 'author'])
        books = myLibrary.getElementsByTagName("book")
        for book in books:
            titleValue = book.getElementsByTagName("title")[0].childNodes[0].data
            authors = []  # get all the authors in a vector
            for author in book.getElementsByTagName("author"):
                authors.append(author.childNodes[0].data)
            writer.writerow([titleValue] + authors)  # write to csv

doc = parse('library.xml')
myLibrary = doc.getElementsByTagName("library")[0]
# Write each book's title and authors to the CSV file
writeToCSV(myLibrary)
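This script was actually written for a much simpler XML file. I am having a hard time adapting it to this file, whose structure is (to me) far more complex. I am slowly getting the hang of minidom and CSV writing, but it is all still new to me. This is the kind of output I want in the CSV file: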
record number,type,Customer name,CustID,type,max-books,rental status,book,title,type,author,
1,custID,Bob Janotior,4466851,Monthly,5,false,overdue,All The Things,fiction,Jill Taylor,
2,custID,Jayne Wrikcek,4466787,Monthly,5,false,overdue,Kiss Me Hardy,fiction,AR Jones,
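A rough sketch of one way to get from the XML to rows like these, offered as an illustration and not a definitive answer. In this file most values (`name`, `custID`, `title`, `author`) are attributes, so they can be read with `getAttribute`, while `<type>`, `<max-books>` and `<rental>` carry text content. The sketch makes several assumptions that are not in the original post: Python 3, a hypothetical `<records>` root element wrapping the records (the excerpt has no single root, which minidom requires), a matching `</rental>` closing tag, one output row per book, and invented column and helper names (`HEADER`, `child_text`, `record_rows`, `to_csv`).

```python
import csv
import io
from xml.dom.minidom import parseString

# Hypothetical sample: the <records> wrapper is an assumption, added because
# an XML document needs exactly one root element.
SAMPLE = """<?xml version="1.0" ?>
<records>
<record number="1" type="custID">
<Customer name="Bob Janotior" custID="4466851">
<type>Monthly</type>
<max-books>5</max-books>
<rental status="false">overdue</rental>
</Customer>
<book title="All The Things" type="fiction" author="Jill Taylor" pubID="7744jh566lp">
<cover>softback</cover>
<pub>Penguin</pub>
</book>
<book title="Mellow Tides of War" type="non-fiction" author="Prof. Lambert et al" pubID="7744gd556se">
<cover>hardback</cover>
<pub>Penguin</pub>
</book>
</record>
</records>"""

# Column names are illustrative, not taken from the original post.
HEADER = ["record number", "record type", "customer name", "custID",
          "customer type", "max-books", "rental status", "rental",
          "book title", "book type", "author"]


def child_text(parent, tag):
    """Return the text of the first <tag> descendant of parent, or ''."""
    nodes = parent.getElementsByTagName(tag)
    if not nodes or nodes[0].firstChild is None:
        return ""
    return nodes[0].firstChild.data


def record_rows(doc):
    """Yield one flat row per <book>, repeating the record and customer fields."""
    for record in doc.getElementsByTagName("record"):
        customer = record.getElementsByTagName("Customer")[0]
        rental = customer.getElementsByTagName("rental")[0]
        prefix = [
            record.getAttribute("number"),
            record.getAttribute("type"),
            customer.getAttribute("name"),
            customer.getAttribute("custID"),
            child_text(customer, "type"),
            child_text(customer, "max-books"),
            rental.getAttribute("status"),
            child_text(customer, "rental"),
        ]
        for book in record.getElementsByTagName("book"):
            yield prefix + [
                book.getAttribute("title"),
                book.getAttribute("type"),
                book.getAttribute("author"),
            ]


def to_csv(xml_text):
    """Parse xml_text and return the flattened records as CSV text."""
    doc = parseString(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(HEADER)
    writer.writerows(record_rows(doc))
    return out.getvalue()


if __name__ == "__main__":
    print(to_csv(SAMPLE))
```

Writing one row per book is a design choice made here so that no book is dropped; if only the first book of each record is wanted, the inner loop could be replaced by a lookup of the first `<book>` element. To write to a real file under Python 3, the `io.StringIO` buffer could be swapped for `open('output.csv', 'w', newline='')`.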
【Discussion】:
Tags: python xml python-2.7 csv minidom