【Question title】: XML to CSV using Python
【Posted】: 2017-09-16 07:48:54
【Question】:

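I have XML files that I want to convert to CSV using Python. I need the content of the TestitemName tags as the CSV header and the content of the Testvalue tags as the CSV values. Can someone help me with this?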

Sample XML file (input)

<sample:batch xmlns:sample="http://sample.com/schema/sampleimport">
    <sample:TestData>
        <sample:Testitem>
            <sample:TestitemName>Field1</sample:TestitemName>
            <sample:Testvalue>1</sample:Testvalue>
        </sample:Testitem>
        <sample:Testitem>
            <sample:TestitemName>Field2</sample:TestitemName>
            <sample:Testvalue>Hi</sample:Testvalue>
        </sample:Testitem>
        <sample:Testitem>
            <sample:TestitemName>Field3</sample:TestitemName>
            <sample:Testvalue>1234</sample:Testvalue>
        </sample:Testitem>
    </sample:TestData>
    <sample:TestData>
        <sample:Testitem>
            <sample:TestitemName>Field1</sample:TestitemName>
            <sample:Testvalue>3</sample:Testvalue>
        </sample:Testitem>
        <sample:Testitem>
            <sample:TestitemName>Field2</sample:TestitemName>
            <sample:Testvalue>Hello</sample:Testvalue>
        </sample:Testitem>
        <sample:Testitem>
            <sample:TestitemName>Field3</sample:TestitemName>
            <sample:Testvalue>999</sample:Testvalue>
        </sample:Testitem>
    </sample:TestData>
</sample:batch>

Desired CSV file (output)

Field1,Field2,Field3 (Header field names)
1,Hi,1234 (1st record)
3,Hello,999 (2nd record)

【Comments】:

    Tags: python xml python-3.x csv beautifulsoup


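    【Solution 1】:

    BeautifulSoup can be used to parse the XML data. Because the data is well organized, you only need to walk the nested tag types and collect the values as you go.

    Code: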

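    from bs4 import BeautifulSoup

    def parse_xml(file_like):
        """Return a header row followed by one row per TestData element."""
        data = []
        names = []
        # html.parser lowercases tag names, so the lookups below use lowercase.
        soup = BeautifulSoup(file_like, 'html.parser')
        for batch in soup.find_all('sample:batch'):
            for test_data in batch.find_all('sample:testdata'):
                item = {}
                for test_item in test_data.find_all('sample:testitem'):
                    name = test_item.find('sample:testitemname').text
                    value = test_item.find('sample:testvalue').text
                    item[name] = value
                    # Remember each field name once, in first-seen order.
                    if name not in names:
                        names.append(name)
                data.append(item)

        # Missing fields become empty strings so ','.join() never sees None.
        return [names] + [[datum.get(name, '') for name in names] for datum in data]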
    

    Test code:

    data = parse_xml(xml_data)
    for datum in data:
        print(','.join(datum))
    

    Test data:

    from io import StringIO
    xml_data = StringIO(u"""
        <sample:batch xmlns:sample="http://sample.com/schema/sampleimport">
            <sample:TestData>
                <sample:Testitem>
                    <sample:TestitemName>Field1</sample:TestitemName>
                    <sample:Testvalue>1</sample:Testvalue>
                </sample:Testitem>
                <sample:Testitem>
                    <sample:TestitemName>Field2</sample:TestitemName>
                    <sample:Testvalue>Hi</sample:Testvalue>
                </sample:Testitem>
                <sample:Testitem>
                    <sample:TestitemName>Field3</sample:TestitemName>
                    <sample:Testvalue>1234</sample:Testvalue>
                </sample:Testitem>
            </sample:TestData>
            <sample:TestData>
                <sample:Testitem>
                    <sample:TestitemName>Field1</sample:TestitemName>
                    <sample:Testvalue>3</sample:Testvalue>
                </sample:Testitem>
                <sample:Testitem>
                    <sample:TestitemName>Field2</sample:TestitemName>
                    <sample:Testvalue>Hello</sample:Testvalue>
                </sample:Testitem>
                <sample:Testitem>
                    <sample:TestitemName>Field3</sample:TestitemName>
                    <sample:Testvalue>999</sample:Testvalue>
                </sample:Testitem>
            </sample:TestData>
        </sample:batch>
    """)
    

    Result:

    Field1,Field2,Field3
    1,Hi,1234
    3,Hello,999
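
For input that is known to be well-formed XML, the same traversal can also be done with the standard library instead of BeautifulSoup. The following is only a minimal sketch of that alternative, not part of the answer above: the function name `parse_xml_strict` is hypothetical, and it assumes the namespace URI from the sample document and the exact tag casing shown there. Unlike a lenient HTML parser, `xml.etree.ElementTree` raises an error on unclosed tags.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the xmlns:sample attribute of the sample document.
NS = {"sample": "http://sample.com/schema/sampleimport"}


def parse_xml_strict(xml_text):
    """Return [header_row, record_row, ...] from a well-formed XML string.

    Hypothetical helper: assumes the root is <sample:batch> and that tag
    names are spelled exactly as in the sample (XML is case-sensitive).
    """
    root = ET.fromstring(xml_text)
    names = []
    records = []
    for test_data in root.findall("sample:TestData", NS):
        record = {}
        for item in test_data.findall("sample:Testitem", NS):
            name = item.findtext("sample:TestitemName", default="", namespaces=NS)
            value = item.findtext("sample:Testvalue", default="", namespaces=NS)
            record[name] = value
            # Keep header names in first-seen order, without duplicates.
            if name not in names:
                names.append(name)
        records.append(record)
    # Fields missing from a record are written as empty strings.
    return [names] + [[record.get(name, "") for name in names] for record in records]
```

The return value has the same shape as that of `parse_xml`, so the test code above can print it unchanged.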
    

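    【Comments】:

    • Thanks Stephen, it worked! I would like to write the output to a CSV file. Could you help me once more?
    • The output I showed is CSV... just write it to a file instead of printing it to the screen.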
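
Following up on the last comment, writing the rows to a file might look like the sketch below. It uses the standard `csv` module rather than `','.join`, so that values containing commas or quotes are quoted correctly. The helper name `write_rows` and the file name used afterwards are assumptions made for illustration.

```python
import csv


def write_rows(rows, path):
    """Write a list of rows (header row first) to a CSV file at `path`.

    Hypothetical helper; `rows` is expected to have the shape returned by
    parse_xml, i.e. a list of lists of strings.
    """
    # newline="" lets the csv module control the line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)
```

With the test data from the answer, a call such as `write_rows(parse_xml(xml_data), "output.csv")` would produce a file with the header row followed by one line per record.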
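    【Solution 2】:

    Use pyxmlparser

    It is a command-line utility that does the same thing.

    https://pypi.org/project/pyxmlparser/

    Disclaimer: I am the author of the library. Since it is new, I would be glad to hear whether it works for you.

    【Comments】: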
