【Question title】: Python - convert XML to CSV
【Posted】: 2020-01-04 16:22:50
【Question】:

I am new to Python and am currently using Python 3.7 to convert XML to CSV.

The input file is an XML file:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <devDetails>
     <hdrTitle>00001</hdrTitle>
     <Type>IN</Type>
  </devDetails>
  <Type>IN</Type>
  <TimeZone>123</TimeZone>
  <nextPage>True</nextPage>
  <Data>
     <item>
         <app>http</app>
         <prot>TCP</prot>
         <dscp>Default</dscp>
         <dst>0.0.0.0</dst>
         <src>1.1.1.1</src>
         <port>80</port>
         <dstport>80</dstport>
         <dscpCode>0</dscpCode>
     </item>
     <item>
         <app>https</app>
         <prot>TCP</prot>
         <dscp>Default</dscp>
         <dst>0.0.0.0</dst>
         <src>1.1.1.1</src>
         <port>443</port>
         <dstport>443</dstport>
         <dscpCode>0</dscpCode>
     </item>
     <item>
         <app>https</app>
         <prot>TCP</prot>
         <dscp>Default</dscp>
         <dst>0.0.0.0</dst>
         <src>1.1.1.1</src>
         <port>443</port>
         <dstport>443</dstport>
         <dscpCode>0</dscpCode>
     </item>
  </Data>
     <startTime>0000-01-01 00:00</startTime>
     <endTime>0000-01-01 00:00</endTime>
     <fromRaw>False</fromRaw>
</root>

Python code:

import pandas as pd
from xml.etree import ElementTree
import os, csv

os.chdir("Change the working directory")
tree = ElementTree.parse('a.xml')

sitescope_data = open('b.csv','w',newline='',encoding='utf-8')
csvwriter = csv.writer(sitescope_data)

col_names=['app','prot','dscp','dst','src','port','dstport','dscpCode']
csvwriter.writerow(col_names)
root = tree.getroot()

for Data in root.findall('Data'):
    event_data= []
    event = Data.find('item')

    app = event.find('app')
    if app != None :
        app = app.text
    event_data.append(app)

    prot = event.find('prot')
    if prot != None :
        prot = prot.text
    event_data.append(prot)

    dscp = event.find('dscp')
    if dscp != None :
        dscp = dscp.text
    event_data.append(dscp)

    dst = event.find('dst')
    if dst != None :
        dst = dst.text
    event_data.append(dst)

    src = event.find('src')
    if src != None :
        src = src.text
    event_data.append(src)

    port = event.find('port')
    if port != None :
        port = port.text
    event_data.append(port)

    dstport = event.find('dstport')
    if dstport != None :
        dstport = dstport.text
    event_data.append(dstport)

    dscpCode = event.find('dscpCode')
    if dscpCode != None :
        dscpCode = dscpCode.text
    event_data.append(dscpCode)

    csvwriter.writerow(event_data)

sitescope_data.close()
dataframe = pd.read_csv('b.csv')
print(dataframe.shape)

The problem is that only part of the XML ends up in the CSV: a single item is written instead of all of them. Please tell me how to fix this.

【Comments】:

    Tags: python-3.x xml


    【Solution 1】:

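    The code iterates over Data rather than Data/item. There is only one Data element, so the loop runs once, and find('item') returns only the first item.

    Change the for loop to: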

    for event in root.findall('Data/item'):
        event_data= []
        # event = Data.find('item')   # don't need this line
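
The difference between the two loops can be checked in isolation. The following is a minimal sketch, assuming a trimmed-down stand-in for a.xml (the SAMPLE string and the variable names are illustrative, not from the original post):

```python
from xml.etree import ElementTree

# Trimmed-down stand-in for a.xml (two columns only, for illustration).
SAMPLE = """<root>
  <Data>
    <item><app>http</app><port>80</port></item>
    <item><app>https</app><port>443</port></item>
    <item><app>https</app><port>443</port></item>
  </Data>
</root>"""

root = ElementTree.fromstring(SAMPLE)

# Original loop: there is one <Data> element, so the body runs once,
# and find() returns only the first matching child.
data_elements = root.findall('Data')
first_item = data_elements[0].find('item')

# Corrected loop: the path 'Data/item' matches every <item> under <Data>.
all_items = root.findall('Data/item')

print(len(data_elements))                             # number of <Data> elements
print(first_item.find('app').text)                    # app of the first item only
print([item.find('app').text for item in all_items])  # app of every item
```

With this sample, the first approach sees a single item and the second sees all three, which matches the behavior described in the question.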
    

    A more compact version (Python 3.8 or later, because it uses the := assignment expression):

    import pandas as pd
    from xml.etree import ElementTree as et
    import csv
    
    tree = et.parse('a.xml')
    with open('b.csv','w',newline='',encoding='utf8') as sitescope_data:
        csvwriter = csv.writer(sitescope_data)
        col_names = 'app prot dscp dst src port dstport dscpCode'.split()
        csvwriter.writerow(col_names)
        for event in tree.findall('Data/item'):
            event_data = ['' if (e:=event.find(col)) is None else e.text for col in col_names]
            csvwriter.writerow(event_data)
    
    dataframe = pd.read_csv('b.csv',encoding='utf8')
    print(dataframe.shape)
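
Because the question mentions Python 3.7, where the := operator is not available, a variant without it may be useful. The following is a sketch under stated assumptions: it uses Element.findtext with a default value in place of the conditional expression, and it reads from an inline SAMPLE string and writes to an in-memory buffer so that it runs without a.xml or b.csv. The helper name items_to_csv and the reduced column list are illustrative only.

```python
import csv
import io
from xml.etree import ElementTree

# Illustrative input; the second item deliberately lacks <prot>.
SAMPLE = """<root>
  <Data>
    <item><app>http</app><prot>TCP</prot><port>80</port></item>
    <item><app>https</app><port>443</port></item>
  </Data>
</root>"""

COL_NAMES = ['app', 'prot', 'port']


def items_to_csv(xml_text, col_names):
    """Return CSV text with one row per Data/item element (hypothetical helper)."""
    root = ElementTree.fromstring(xml_text)
    buffer = io.StringIO(newline='')
    writer = csv.writer(buffer)
    writer.writerow(col_names)
    for event in root.findall('Data/item'):
        # findtext() returns the default when the child element is missing.
        writer.writerow([event.findtext(col, default='') for col in col_names])
    return buffer.getvalue()


csv_text = items_to_csv(SAMPLE, COL_NAMES)
print(csv_text)
```

To write a real file instead, the same loop can be pointed at a file opened with open('b.csv', 'w', newline='', encoding='utf8'), as in the answer above.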
    

    Output:

    (3, 8)
    

    b.csv:

    app,prot,dscp,dst,src,port,dstport,dscpCode
    http,TCP,Default,0.0.0.0,1.1.1.1,80,80,0
    https,TCP,Default,0.0.0.0,1.1.1.1,443,443,0
    https,TCP,Default,0.0.0.0,1.1.1.1,443,443,0
    

    【Discussion】:

    • Hi Mark Tolonen, thank you very much.