Python xml parser with namespaces: answers

【Question title】: Python xml parser with namespaces
【Posted】: 2021-10-24 13:56:19
【Question description】:

Hi all, I have an XML file that looks like this:

<?xml version="1.0" encoding="utf-8"?>
<Data xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http:/xxxx//bb/v1 /xyz/it/Data/v1/Data-1_2.xsd" version="1.2" xmlns="http://xx/it//Data/v1">
  <Header>
    <Location>abc</Location>
    <Date start="date-time"/>

I am trying to parse the various tags and attributes. However, the xmlns declaration seems to break the parsing. I am using code like this:

tree = ET.parse(input_filename)
root = tree.getroot()
location = tree.find("./Header/Location").text
time = tree.find("./Header/Date").attrib['start']

The code works when I manually remove the namespace attributes from the root element of the input file:

<?xml version="1.0" encoding="utf-8"?>
<Data >
  <Header>
    <Location>abc</Location>
    <Date start="date-time"/>

But if I leave them in, I get an error:

location = tree.find("./Header/Location").text
AttributeError: 'NoneType' object has no attribute 'text'

I have tried nearly 90% of the earlier suggestions, but still have no good result. Any help is highly appreciated.

【Comments】:

Tags: python xml namespaces

【Solution 1】:

Modern Python versions (3.8 and later) support the {*} wildcard for namespaces in ElementTree path expressions. Consider this:

import xml.etree.ElementTree as ET

xml = '''<?xml version="1.0" encoding="utf-8"?>
<Data xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http:/xxxx//bb/v1 /xyz/it/Data/v1/Data-1_2.xsd" version="1.2" xmlns="http://xx/it//Data/v1">
  <Header>
    <Location>abc</Location>
    <Date start="date-time"/>
  </Header>
</Data>'''

tree = ET.fromstring(xml)

location = tree.find('.//{*}Header/{*}Location').text
_time = tree.find('.//{*}Header/{*}Date').attrib['start']

print(f'Location={location}, time={_time}')
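For comparison, here is a minimal sketch of the other common approach: passing an explicit prefix-to-URI mapping to find(). It should also work on Python versions older than 3.8. The prefix name d and the shortened root element are choices made for illustration and are not in the original file; only the URI has to match the xmlns value on the root element. The first print shows why the unqualified path in the question returns None: ElementTree stores each tag with its namespace URI in braces.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the question's file (xsd/xsi attributes omitted for brevity).
xml = '''<?xml version="1.0" encoding="utf-8"?>
<Data version="1.2" xmlns="http://xx/it//Data/v1">
  <Header>
    <Location>abc</Location>
    <Date start="date-time"/>
  </Header>
</Data>'''

root = ET.fromstring(xml)

# Tags carry the default namespace, so "./Header/Location" matches nothing.
print(root.tag)                           # {http://xx/it//Data/v1}Data
print(root.find('./Header/Location'))     # None

# Map an arbitrary prefix (here "d") to the default namespace URI.
ns = {'d': 'http://xx/it//Data/v1'}

location = root.find('d:Header/d:Location', ns).text
start = root.find('d:Header/d:Date', ns).attrib['start']

print(f'Location={location}, time={start}')
```

The mapping is stricter than the wildcard: it only matches elements in that one namespace, which can be useful when a document mixes several.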

【Comments】:

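Thanks, this should work for that code, but it caused problems in other parts where I use root.iter. This solution, stackoverflow.com/a/18160164/16726552, works well without changing anything else in the code, whether or not xmlns is present.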
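The linked answer is not reproduced in this thread, so the following is only a sketch of one approach that fits the comment's description: strip the namespace part from every tag right after parsing, so that existing find() and root.iter() calls keep working unchanged. The helper name strip_namespaces is hypothetical, and this is an assumption about what the linked answer does, not a copy of it.

```python
import xml.etree.ElementTree as ET

def strip_namespaces(root):
    """Hypothetical helper: drop the '{uri}' part of every element tag, in place."""
    for el in root.iter():
        # Namespaced tags look like '{http://...}Name'; keep only 'Name'.
        if isinstance(el.tag, str) and el.tag.startswith('{'):
            el.tag = el.tag.split('}', 1)[1]
    return root

xml = '''<?xml version="1.0" encoding="utf-8"?>
<Data version="1.2" xmlns="http://xx/it//Data/v1">
  <Header>
    <Location>abc</Location>
    <Date start="date-time"/>
  </Header>
</Data>'''

root = strip_namespaces(ET.fromstring(xml))

# The original, namespace-free paths from the question now match.
location = root.find('./Header/Location').text
dates = [el.attrib['start'] for el in root.iter('Date')]

print(f'Location={location}, dates={dates}')
```

This sketch only rewrites element tags. Namespaced attributes such as xsi:schemaLocation would still be stored under their '{uri}name' keys, and elements from different namespaces that share a local name become indistinguishable.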