【Posted】: 2020-11-13 02:31:17
【Question】:
The XML file I am trying to parse is located here. This XML has a defined namespace. Below is a sample from the XML file containing the pertinent elements:
<series>
  <header>
    <type>instantaneous</type>
    <locationId>Fredericton</locationId>
    <parameterId>HG</parameterId>
    <timeStep unit="second" multiplier="3600"/>
    <startDate date="2020-05-11" time="07:00:00"/>
    <endDate date="2020-05-15" time="07:00:00"/>
    <missVal>-999</missVal>
    <stationName>SAINT JOHN RIVER AT FREDERICTON</stationName>
    <units>M</units>
  </header>
  <event date="2020-05-11" time="07:00:00" value="4.69" flag="0"/>
  <event date="2020-05-11" time="08:00:00" value="4.66" flag="0"/>
  <!-- many records deleted to save space -->
  <event date="2020-05-15" time="06:00:00" value="4.27" flag="0"/>
  <event date="2020-05-15" time="07:00:00" value="-999" flag="8"/>
</series>
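I need to search the XML file by the text stored in the <locationId> element, for example "Fredericton". Once "Fredericton" is found, I need to extract the <parameterId> text, and I also need the attributes of the first and last <event> elements in the same <series>. Here is my code so far. How can I use XPath to get the elements I need? I have commented out my attempt, which did not work.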
import os
from xml.etree import ElementTree as ET

file_name = 'StJohn_FEWSNB_export.xml'
full_file = os.path.abspath(os.path.join('data', file_name))
print(full_file)

tree = ET.parse(full_file)
root = tree.getroot()

location_lst = [
    'Nashwaak','Kennebecasis','Fredericton','Maugerville','Jemseg','Grand_Lake',
    'Lakeville_Corner','Gagetown','Oak_Point','Hampton','Saint_John','Connors',
    'St_Francois','Ft_Kent','Baker_Brook','St_Hilaire','Edmundston','Iroquois',
    'St_Basile','St_Anne','St_Leonard','Perth','Simonds','Hartland','Woodstock'
]

for loc in location_lst:
    for location in root.iter('{http://www.wldelft.nl/fews/PI}locationId'):
        if location.text == loc:
            ## type = element.findall('.//{http://www.wldelft.nl/fews/PI}parameterId')
            print(loc, location.text)
Thanks, Bernie.
【Comments】:
Tags: python-3.x xml xpath elementtree
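One possible approach, offered as a minimal sketch rather than a definitive answer: instead of iterating over the <locationId> elements directly (which gives no easy way back to the sibling <parameterId> or the <event> elements), iterate over each <series> and look downward from there. The namespace URI is the one used in the question's code. The `pi` prefix, the `summarize_location` helper name, and the `TimeSeries` root element in the inline sample are assumptions made for illustration; the real file may be structured differently around <series>.

```python
from xml.etree import ElementTree as ET

# Prefix map for ElementTree path lookups. The URI comes from the question;
# the "pi" prefix itself is an arbitrary label chosen for this sketch.
NS = {"pi": "http://www.wldelft.nl/fews/PI"}

# Small inline stand-in for the real file. The <TimeSeries> root element
# name is an assumption; only <series> and its children come from the question.
SAMPLE = """<TimeSeries xmlns="http://www.wldelft.nl/fews/PI">
  <series>
    <header>
      <locationId>Fredericton</locationId>
      <parameterId>HG</parameterId>
    </header>
    <event date="2020-05-11" time="07:00:00" value="4.69" flag="0"/>
    <event date="2020-05-11" time="08:00:00" value="4.66" flag="0"/>
    <event date="2020-05-15" time="07:00:00" value="-999" flag="8"/>
  </series>
</TimeSeries>"""


def summarize_location(root, location_id):
    """Hypothetical helper: for every <series> whose locationId matches,
    return (parameterId text, first event attributes, last event attributes)."""
    results = []
    for series in root.findall(".//pi:series", NS):
        # Skip series that belong to a different location.
        if series.findtext("pi:header/pi:locationId", namespaces=NS) != location_id:
            continue
        parameter = series.findtext("pi:header/pi:parameterId", namespaces=NS)
        # Direct <event> children, in document order.
        events = series.findall("pi:event", NS)
        first = dict(events[0].attrib) if events else None
        last = dict(events[-1].attrib) if events else None
        results.append((parameter, first, last))
    return results


root = ET.fromstring(SAMPLE)
results = summarize_location(root, "Fredericton")
for parameter, first, last in results:
    print(parameter, first, last)
```

With the real file, `root` would come from `ET.parse(full_file).getroot()` as in the question, and the helper could be called once per entry in `location_lst`. A location may appear in more than one <series> (for example, one per parameter), which is why the sketch returns a list instead of a single tuple.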