【Question title】: How to get XML elements which have children with a certain tag and attribute
【Posted】: 2021-05-24 16:37:13
【Question】:

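I want to find XML elements that have certain child elements. The child elements need to have a given tag and an attribute set to a specific value.

To give a concrete example (based on the official documentation): I want to find all country elements that have a neighbor child element with the attribute name="Austria":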

import xml.etree.ElementTree as ET

data = """<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <neighbor name="Malaysia" direction="N"/>
        <partner name="Austria"/>
    </country>
    <country name="Panama">
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>
"""

root = ET.fromstring(data)

What I tried without success:

countries1 = root.findall('.//country[neighbor@name="Austria"]')
countries2 = root.findall('.//country[neighbor][@name="Austria"]')
countries3 = root.findall('.//country[neighbor[@name="Austria"]]')

All of them give:

SyntaxError: invalid predicate


The following attempts are obviously wrong, because they find too many elements:

countries4 = root.findall('.//country/*[@name="Austria"]')
countries5 = root.findall('.//country/[neighbor]')

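Here countries4 contains all child elements that have the attribute name="Austria", which includes the partner element. countries5 contains all elements that have any neighbor element as a child.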
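To make the over-matching concrete, here is a minimal sketch that runs both kinds of query against the sample data and collects what comes back. The variable names `tags4` and `names5` are only illustrative, and the second query is written as `.//country[neighbor]`, which is assumed to express the same intent as the `countries5` attempt.

```python
import xml.etree.ElementTree as ET

data = """<data>
    <country name="Liechtenstein">
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <neighbor name="Malaysia" direction="N"/>
        <partner name="Austria"/>
    </country>
    <country name="Panama">
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>"""

root = ET.fromstring(data)

# Children of any country whose name attribute is "Austria", whatever their tag.
tags4 = [m.tag for m in root.findall('.//country/*[@name="Austria"]')]

# Countries with at least one neighbor child, whatever its attributes.
names5 = [c.get('name') for c in root.findall('.//country[neighbor]')]

print(tags4)   # ['neighbor', 'partner']
print(names5)  # ['Liechtenstein', 'Singapore', 'Panama']
```

The first query returns the child elements rather than the countries, and the second ignores the attribute condition entirely, so neither answers the question on its own.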

【Question comments】:

    Tags: python python-3.x elementtree


    【Answer 1】:
    import xml.etree.ElementTree as ET
    
    data = """<?xml version="1.0"?>
    <data>
        <country name="Liechtenstein">
            <neighbor name="Austria" direction="E"/>
            <neighbor name="Switzerland" direction="W"/>
        </country>
        <country name="Singapore">
            <neighbor name="Malaysia" direction="N"/>
            <partner name="Austria"/>
        </country>
        <country name="Panama">
            <neighbor name="Costa Rica" direction="W"/>
            <neighbor name="Colombia" direction="E"/>
        </country>
        <country name="Liechtenstein">
            <neighbor name="Austria" direction="dummy"/>
            <neighbor name="Switzerland" direction="W"/>
        </country>
        
    </data>
    """
    
    root = ET.fromstring(data)
    for x in root.findall(".//country/neighbor[@name='Austria']"):
        print(x.attrib)
    

    Output:

    {'name': 'Austria', 'direction': 'E'}
    {'name': 'Austria', 'direction': 'dummy'}
    

    // : Selects all subelements, on all levels beneath the current element. For example, .//egg selects all egg elements in the entire tree.

    [@attrib='value'] : Selects all elements for which the given attribute has the given value. The value cannot contain quotes.
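As a small illustration of the two pieces of syntax quoted above, the following sketch applies them to the question's sample data. The names `all_neighbors` and `eastern` are made up for this example.

```python
import xml.etree.ElementTree as ET

data = """<data>
    <country name="Liechtenstein">
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <neighbor name="Malaysia" direction="N"/>
        <partner name="Austria"/>
    </country>
    <country name="Panama">
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>"""

root = ET.fromstring(data)

# './/neighbor' searches all levels below root for elements tagged neighbor.
all_neighbors = root.findall('.//neighbor')

# [@direction='E'] keeps only elements whose direction attribute equals 'E'.
eastern = [n.get('name') for n in root.findall(".//neighbor[@direction='E']")]

print(len(all_neighbors))  # 5
print(eastern)             # ['Austria', 'Colombia']
```

Note that both queries return neighbor elements, not their parent country elements.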

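    for country in root.findall('country'):
        # Check every neighbor child, not only the first child of the country.
        if any(n.get('name') == 'Austria' for n in country.findall('neighbor')):
            print(country.attrib['name'])
    

    Output (the data above contains two Liechtenstein entries):

    Liechtenstein
    Liechtenstein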

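    【Comments】:

    • Thanks for your answer. However, your solution returns all neighbor elements that have the attribute name="Austria". I am interested in the country elements that have such elements as children.
    • ".//country/neighbor[@name='Austria']" should work.
    • This also finds the neighbor elements, but not the country elements.
    • I added an extra country tag.
    • We should use 2 for loops, and I have updated the answer. This will give you all the tags with [@name='Austria'].
    【Answer 2】: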

    I want to find all country elements that have a neighbor child element with the attribute name="Austria"

    See below.

    import xml.etree.ElementTree as ET
    
    data = """<?xml version="1.0"?>
    <data>
        <country name="Liechtenstein">
            <neighbor name="Austria" direction="E"/>
            <neighbor name="Switzerland" direction="W"/>
        </country>
        <country name="Singapore">
            <neighbor name="Malaysia" direction="N"/>
            <partner name="Austria"/>
        </country>
        <country name="Panama">
            <neighbor name="Costa Rica" direction="W"/>
            <neighbor name="Colombia" direction="E"/>
        </country>
    </data>
    """
    
    root = ET.fromstring(data)
    countries_with_austria_as_neighbor = [c.attrib['name'] for c in root.findall('.//country') if
                                          'Austria' in [n.attrib['name'] for n in c.findall('neighbor')]]
    print(countries_with_austria_as_neighbor)
    

    Output:

    ['Liechtenstein']
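If a single path expression is still preferred, one possible alternative is to select the matching neighbor elements and then step back up with `..`, which ElementTree's limited XPath subset lists as selecting the parent element. This is only a sketch under that assumption; the helper name `countries_with_neighbor` is hypothetical and not part of either answer.

```python
import xml.etree.ElementTree as ET

data = """<data>
    <country name="Liechtenstein">
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <neighbor name="Malaysia" direction="N"/>
        <partner name="Austria"/>
    </country>
    <country name="Panama">
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>"""


def countries_with_neighbor(root, neighbor_name):
    """Return country elements that have a neighbor child with the given name.

    Hypothetical helper: neighbor_name must not contain quote characters,
    because ElementTree predicate values cannot contain quotes.
    """
    # Match the neighbor children first, then '..' selects their parents.
    path = f".//country/neighbor[@name='{neighbor_name}']/.."
    return root.findall(path)


root = ET.fromstring(data)
names = [c.get('name') for c in countries_with_neighbor(root, 'Austria')]
print(names)  # ['Liechtenstein']
```

Singapore is not returned because its Austria child is a partner element, not a neighbor. The list comprehension above remains the more explicit option when the condition grows beyond a single attribute test.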
    

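    【Comments】:

    • I was so focused on solving it with a single XPath expression that I forgot about all the other possible approaches. A list comprehension definitely makes sense here.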