【Question Title】: Parse XML file in Python 2.x
【Posted】: 2018-12-22 14:00:03
【Question】:

I have an XML file, song.xml, part of which looks like this:

<?xml version="1.0" encoding="utf-8"?>
<Event status="happened">
<Song title="Erase and rewind">
<Artist name="The Cardigans" ID="340900">
</Artist>
<Info StartTime="22:22:13" JazlerID="8310" 
 PlayListerID="" />
</Song>
</Event>

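I need to parse files like this and extract all the fields, for example: song title, artist, start time, ID.

I tried something like the following, but I only get the title of each song: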

#!/usr/bin/python
import xml.dom.minidom

# Open the XML document using the minidom parser
DOMTree = xml.dom.minidom.parse("songs.xml")
Event = DOMTree.documentElement
# The root attribute is named "status"; "happened" is its value
if Event.hasAttribute("status"):
   print("Root element : %s" % Event.getAttribute("status"))

# Get all the songs in the event
songs = Event.getElementsByTagName("Song")
x = Event.getElementsByTagName("*").length
print(x)

# Print the details of each song.
for song in songs:
   print("*****Song*****")
   if song.hasAttribute("title"):
      print("Title: %s" % song.getAttribute("title"))

I need this so that I can save the values in a database later. Thanks.

【Comments】:

    Tags: python xml parsing


    【Solution 1】:

    You can use xml.etree.ElementTree to parse the XML file:

    import xml.etree.ElementTree as ET
    
    tree = ET.parse('songs.xml')
    root = tree.getroot()
    
    for child in root:
        print(child.tag, child.attrib)
    
        for x in child:
            print(x.tag, x.attrib)
    

    This prints:

    Song {'title': 'Erase and rewind'}
    Artist {'name': 'The Cardigans', 'ID': '340900'}
    Info {'StartTime': '22:22:13', 'JazlerID': '8310', 'PlayListerID': ''}
    

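    Each line shows the XML tag on the left and the element's attributes, stored in a dictionary, on the right. You can read the individual values from these dictionaries.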
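As a minimal sketch of reading individual values from those dictionaries, the example below looks up each child element with `find()` and reads its attributes with `get()`. The helper name `song_fields`, the `SAMPLE_XML` string, and the output keys are illustrative assumptions, not part of the original answer; since the question does not say which ID is wanted, both the artist `ID` and `JazlerID` are returned.

```python
import xml.etree.ElementTree as ET

# Sample document copied from the question; a real script would use
# ET.parse('songs.xml').getroot() instead of ET.fromstring().
SAMPLE_XML = """<Event status="happened">
<Song title="Erase and rewind">
<Artist name="The Cardigans" ID="340900">
</Artist>
<Info StartTime="22:22:13" JazlerID="8310"
 PlayListerID="" />
</Song>
</Event>"""


def song_fields(song):
    """Return the requested fields of one <Song> element as a dict (hypothetical helper)."""
    artist = song.find("Artist")  # first <Artist> child, or None if absent
    info = song.find("Info")      # first <Info> child, or None if absent
    artist_attrs = artist.attrib if artist is not None else {}
    info_attrs = info.attrib if info is not None else {}
    return {
        "title": song.attrib.get("title"),
        "artist": artist_attrs.get("name"),
        "artist_id": artist_attrs.get("ID"),
        "start_time": info_attrs.get("StartTime"),
        "jazler_id": info_attrs.get("JazlerID"),
    }


root = ET.fromstring(SAMPLE_XML)
records = [song_fields(song) for song in root.findall("Song")]
for record in records:
    print(record)
```

Using `get()` rather than indexing means a missing attribute yields `None` instead of raising `KeyError`, which may be convenient if some songs lack an `Artist` or `Info` element.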

    If you only want to print the values that are not None or an empty string, you can try this:

    import xml.etree.ElementTree as ET 
    
    tree = ET.parse('songs.xml') 
    root = tree.getroot() 
    
    for child in root:
        title = child.attrib.get("title")
        if title:
            print('title = %s' % title)
    
        for x in child:
            for key in x.attrib:
                value = x.attrib.get(key)
                if value:
                    print(key, "=", value)
    

    This gives:

    title = Erase and rewind
    name = The Cardigans
    ID = 340900
    StartTime = 22:22:13
    JazlerID = 8310
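For comparison, the `xml.dom.minidom` approach from the question can also reach the nested elements, by calling `getElementsByTagName` on each `Song` node rather than only on the root. This is a sketch under the assumption that the file has the structure shown in the question; `SAMPLE_XML` and the keys of `row` are illustrative names.

```python
from xml.dom.minidom import parseString

# Sample document copied from the question; a real script would call
# xml.dom.minidom.parse("songs.xml") instead of parseString().
SAMPLE_XML = """<Event status="happened">
<Song title="Erase and rewind">
<Artist name="The Cardigans" ID="340900">
</Artist>
<Info StartTime="22:22:13" JazlerID="8310"
 PlayListerID="" />
</Song>
</Event>"""

dom = parseString(SAMPLE_XML)
event = dom.documentElement

rows = []
for song in event.getElementsByTagName("Song"):
    row = {"title": song.getAttribute("title")}
    # Search below this particular <Song>, not below the whole document
    for artist in song.getElementsByTagName("Artist"):
        row["artist"] = artist.getAttribute("name")
        row["artist_id"] = artist.getAttribute("ID")
    for info in song.getElementsByTagName("Info"):
        row["start_time"] = info.getAttribute("StartTime")
        row["jazler_id"] = info.getAttribute("JazlerID")
    rows.append(row)
    print(row)
```

Note that `getAttribute` returns an empty string when the attribute is missing, so missing and empty values cannot be told apart without `hasAttribute`.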
    

    【Comments】:

      【Solution 2】:

      Following RoadRunner's suggestion, the final answer that met my needs is:

      import xml.etree.ElementTree as ET

      tree = ET.parse('songs1.xml')
      root = tree.getroot()

      for child in root:
          # Each child of the root is a <Song>; the title is one of its attributes
          print(child.attrib.get("title"))

          for x in child:
              if x.tag == "Artist":
                  print(x.tag)
                  dic_artist = x.attrib
                  print(dic_artist.get("name"))
                  print(dic_artist.get("ID"))
              if x.tag == "Info":
                  print(x.tag)
                  dic_info = x.attrib
                  print(dic_info.get("StartTime"))
                  print(dic_info.get("JazlerID"))
                  #print(dic_info.get("PlayListerID"))
          print("-------------------------------")
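Since the question mentions saving the values in a database later, here is one possible sketch using the standard-library `sqlite3` module. The table name `songs`, its columns, the in-memory database, and the helper `extract_rows` are all assumptions made for illustration; the thread does not say which database or schema is actually used.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Sample document copied from the question; a real script would use
# ET.parse('songs1.xml').getroot() instead of ET.fromstring().
SAMPLE_XML = """<Event status="happened">
<Song title="Erase and rewind">
<Artist name="The Cardigans" ID="340900">
</Artist>
<Info StartTime="22:22:13" JazlerID="8310"
 PlayListerID="" />
</Song>
</Event>"""


def extract_rows(root):
    """Build one tuple per <Song>: title, artist, artist ID, start time, Jazler ID."""
    rows = []
    for song in root.findall("Song"):
        artist = song.find("Artist")
        info = song.find("Info")
        artist_attrs = artist.attrib if artist is not None else {}
        info_attrs = info.attrib if info is not None else {}
        rows.append((
            song.get("title"),
            artist_attrs.get("name"),
            artist_attrs.get("ID"),
            info_attrs.get("StartTime"),
            info_attrs.get("JazlerID"),
        ))
    return rows


# Hypothetical schema; ":memory:" keeps the example free of side effects
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE songs (title TEXT, artist TEXT, artist_id TEXT, "
    "start_time TEXT, jazler_id TEXT)"
)
# Parameter placeholders (?) let sqlite3 handle quoting of the values
conn.executemany(
    "INSERT INTO songs VALUES (?, ?, ?, ?, ?)",
    extract_rows(ET.fromstring(SAMPLE_XML)),
)
conn.commit()

stored = conn.execute("SELECT title, artist, start_time FROM songs").fetchall()
print(stored)
conn.close()
```

Collecting the rows first and inserting them with `executemany` keeps the parsing and the storage steps separate, so the same `extract_rows` output could be passed to a different database driver.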
      

      【Comments】:
