【Question title】: Parse XML into Pandas Dataframe, Python 3.8, ElementTree
【Posted】: 2021-01-16 01:12:25
【Question】:

Using ElementTree in Python 3.8, how can I convert this data into a Pandas DataFrame?

Sample XML:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<SystemSourceSet id="-1" name="UC33brAvg_FM31" weight="0.5">
  <!-- This model is an example and for review purposes only -->
  <!-- Reference: UC33brAvg_FM31 -->
  <!-- Description: UCERF 3.3 Branch Averaged Solution (FM31)-->
  <Settings>
    <DefaultMfds>
      <IncrementalMfd floats="false" m="6.5" rate="0.0" type="SINGLE" weight="1.0"/>
    </DefaultMfds>
  </Settings>
  <Source>
    <IncrementalMfd m="6.449" rate="3.3631184e-05" type="SINGLE"/>
    <Geometry depth="1.3" dip="50.0" indices="0:1" rake="-90.0" width="15.273"/>
  </Source>
  <Source>
    <IncrementalMfd m="6.638" rate="1.5340160e-05" type="SINGLE"/>
    <Geometry depth="1.3" dip="50.0" indices="0:2" rake="-90.0" width="15.273"/>
  </Source>
  <Source>
    <IncrementalMfd m="6.78" rate="1.0903030e-05" type="SINGLE"/>
    <Geometry depth="1.3" dip="50.0" indices="0:3" rake="-90.0" width="15.273"/>
  </Source>
  <Source>
    <IncrementalMfd m="6.893" rate="7.3397665e-06" type="SINGLE"/>
    <Geometry depth="1.3" dip="50.0" indices="0:4" rake="-90.0" width="15.273"/>
  </Source>
</SystemSourceSet>

Expected DataFrame:

【Comments】:

    Tags: python-3.x xml pandas xml-parsing elementtree


    【Solution 1】:

    Walk the tree manually and collect the data points you want to keep:

    from xml.etree import ElementTree

    import pandas as pd

    # Parse the file and take the <SystemSourceSet> root element.
    root = ElementTree.parse('data.xml').getroot()
    data = []
    for node in root:
        # Skip <Settings> and any other child that is not a <Source>.
        if node.tag != 'Source':
            continue

        mfd = node.find('IncrementalMfd')
        geometry = node.find('Geometry')
        # Attribute values are returned as strings.
        data.append({
            'indices': geometry.get('indices'),
            'IncrementalMfd m': mfd.get('m'),
            'rate': mfd.get('rate'),
            'type': mfd.get('type'),
            'Geometry depth': geometry.get('depth'),
            'dip': geometry.get('dip'),
            'rake': geometry.get('rake'),
            'width': geometry.get('width')
        })

    # One row per <Source> element.
    df = pd.DataFrame(data)
    
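Because ElementTree returns every attribute value as a string, the resulting columns are all of dtype object. Below is a minimal, self-contained sketch of the same approach that also converts the numeric columns. It is only an illustration: the helper name `sources_to_frame` and the inline `XML` string (a trimmed, closed copy of the sample) are assumptions rather than part of the original answer, and the columns take the raw attribute names (`m`, `depth`, ...) instead of the labels used above.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Trimmed copy of the sample from the question, closed so that it parses.
XML = """<SystemSourceSet id="-1" name="UC33brAvg_FM31" weight="0.5">
  <Source>
    <IncrementalMfd m="6.449" rate="3.3631184e-05" type="SINGLE"/>
    <Geometry depth="1.3" dip="50.0" indices="0:1" rake="-90.0" width="15.273"/>
  </Source>
  <Source>
    <IncrementalMfd m="6.638" rate="1.5340160e-05" type="SINGLE"/>
    <Geometry depth="1.3" dip="50.0" indices="0:2" rake="-90.0" width="15.273"/>
  </Source>
</SystemSourceSet>"""


def sources_to_frame(xml_text):
    """Hypothetical helper: build one DataFrame row per <Source> element."""
    root = ET.fromstring(xml_text)
    rows = []
    for source in root.findall('Source'):
        mfd = source.find('IncrementalMfd')
        geometry = source.find('Geometry')
        if mfd is None or geometry is None:
            continue  # skip incomplete <Source> entries
        # The two elements share no attribute names in the sample,
        # so merging their attribute dicts loses nothing here.
        rows.append({**mfd.attrib, **geometry.attrib})

    df = pd.DataFrame(rows)
    # Convert the string attributes that hold numbers into floats.
    numeric = ['m', 'rate', 'depth', 'dip', 'rake', 'width']
    df[numeric] = df[numeric].apply(pd.to_numeric)
    return df


df = sources_to_frame(XML)
print(df.dtypes)
```

The `indices` and `type` columns stay as strings, since values such as `0:1` and `SINGLE` are not numeric.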

    【Comments】:

    • A very neat solution.