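【Question title】: Python - Parsing XML data with ElementTree
【Posted】: 2020-05-07 21:15:18
【Question】:

I'm developing a module for my Discord bot that fetches data from a URL and sorts it into an embed. I spent a few hours trying different approaches and managed to get it to display what I need. Now something breaks when I switch from the XML URL to the XML2 URL (I need the extra data it provides): the code simply stops working.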

import xml.etree.ElementTree as ET
import requests


tree = ET.fromstring(requests.get('http://vatbook.euroutepro.com/xml.php?fir=LJLA').text)
#Testing what is displayed
for atcs in tree:
    callsign = atcs.find('callsign')
    name = atcs.find('name')
    time_start = atcs.find('time_start')
    time_end = atcs.find('time_end')
    if callsign is not None:
        print(f"{name.text} booked {callsign.text} from {time_start.text} to {time_end.text}")

Output:

Mirza Ibrahimovic booked LJLJ_TWR from 2020-05-19 1800 to 2020-05-19 2100
Mirza Ibrahimovic booked LJLJ_APP from 2020-05-19 1800 to 2020-05-19 2100

My problem is that when I replace the first URL with the second URL, my code no longer displays anything. Any ideas?

【Comments】:

    Tags: python xml parsing python-requests


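    【Solution 1】:

    Because the two URLs return differently structured XML, consider checking whether an atc node exists and then passing the matching search path to iterfind. The code below uses the built-in urllib module to parse the XML from a URL: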

    from urllib.request import urlopen
    import xml.etree.ElementTree as ET
    
    def vatbook_parse(url):
        with urlopen(url) as f:
            tree = ET.parse(f)
            root = tree.getroot()
    
            # CONDITIONALLY SET SEARCH PATH
            path = './/atcs/booking' if tree.find('atc') is None else './/atc'
    
            for atcs in root.iterfind(path):
                callsign = atcs.find('callsign')
                name = atcs.find('name')
                time_start = atcs.find('time_start')
                time_end = atcs.find('time_end')
    
                if callsign is not None:
                    print(f"{name.text} booked {callsign.text} from {time_start.text} to {time_end.text}")
    

    First URL

    vatbook_parse('http://vatbook.euroutepro.com/xml.php?fir=LJLA')
    
    # Mirza Ibrahimovic booked LJLJ_APP from 2020-05-19 18:00:00 to 2020-05-19 21:00:00
    # Mirza Ibrahimovic booked LJLJ_TWR from 2020-05-19 18:00:00 to 2020-05-19 21:00:00
    

    Second URL

    vatbook_parse('http://vatbook.euroutepro.com/xml2.php?fir=LJLA')
    
    # Mirza Ibrahimovic booked LJLJ_APP from 2020-05-19 18:00:00 to 2020-05-19 21:00:00
    # Mirza Ibrahimovic booked LJLJ_TWR from 2020-05-19 18:00:00 to 2020-05-19 21:00:00
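
The same conditional path can be exercised offline, without hitting the server. The sketch below is a minimal variant that takes an XML string and returns the formatted lines instead of printing them. The sample documents, the root element name `bookings`, and the helper name `format_bookings` are assumptions made for illustration; only the atc versus atcs/booking nesting comes from this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical samples: the real feeds may carry more fields and a different root name.
XML_V1 = """<bookings>
  <atc>
    <callsign>LJLJ_TWR</callsign>
    <name>Mirza Ibrahimovic</name>
    <time_start>2020-05-19 18:00:00</time_start>
    <time_end>2020-05-19 21:00:00</time_end>
  </atc>
</bookings>"""

XML_V2 = """<bookings>
  <atcs>
    <booking>
      <callsign>LJLJ_APP</callsign>
      <name>Mirza Ibrahimovic</name>
      <time_start>2020-05-19 18:00:00</time_start>
      <time_end>2020-05-19 21:00:00</time_end>
    </booking>
  </atcs>
</bookings>"""


def format_bookings(xml_text):
    """Return one formatted line per booking, for either layout."""
    root = ET.fromstring(xml_text)
    # Same idea as vatbook_parse: use atc if present, otherwise atcs/booking.
    path = './/atcs/booking' if root.find('atc') is None else './/atc'
    lines = []
    for node in root.iterfind(path):
        callsign = node.findtext('callsign')
        if callsign is None:
            continue  # skip entries without a callsign
        # findtext returns the default when the child element is missing.
        name = node.findtext('name', default='?')
        start = node.findtext('time_start', default='?')
        end = node.findtext('time_end', default='?')
        lines.append(f"{name} booked {callsign} from {start} to {end}")
    return lines


for line in format_bookings(XML_V1) + format_bookings(XML_V2):
    print(line)
```

Returning a list rather than printing makes it easier to feed the lines into something else later, such as a bot message.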
    

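    【Discussion】:

    • It also works the way I mentioned in my comment. I forgot to specify what to look for, which is why I got nothing back. Once I added for atcs in root.find('atcs'), it worked as expected. I've modified the code to display the result as an embed in my Discord server, but thanks for your help.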
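
One caveat with `root.find('atcs')`: `find` returns `None` when no such child exists, and iterating over `None` raises a `TypeError`. A minimal sketch of a guard follows; the two sample documents, their root element name, and the helper name `iter_bookings` are made up for illustration.

```python
import xml.etree.ElementTree as ET


def iter_bookings(root):
    """Return the children of <atcs>, or an empty list when <atcs> is absent."""
    atcs = root.find('atcs')
    # Check against None explicitly instead of relying on truthiness.
    return [] if atcs is None else list(atcs)


# Hypothetical documents: one with an <atcs> wrapper, one without.
with_atcs = ET.fromstring(
    '<root><atcs><booking><callsign>LJLJ_TWR</callsign></booking></atcs></root>'
)
without_atcs = ET.fromstring(
    '<root><atc><callsign>LJLJ_TWR</callsign></atc></root>'
)

print([b.findtext('callsign') for b in iter_bookings(with_atcs)])
print(iter_bookings(without_atcs))
```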
    【Solution 2】:

    I found that I had forgotten to add a small piece of code needed to make it run.

    So here is my solution:

    import xml.etree.ElementTree as ET
    
    import requests
    
    atc = ["ADR_CTR", "ADR_W_CTR", "ADR_U_CTR", "ADR_E_CTR", "LDZO_CTR", "LJLA_CTR", "LYBA_CTR", "LWSS_CTR", "LAAA_CTR", "LQSB_CTR", "LJLJ_TWR", "LJLJ_APP", "LJLJ_GND", "LJMB_TWR", "LJMB_APP", "LJPZ_TWR", "LJPZ_APP", "LDZA_APP", "LDZA_TWR", "LDZA_GND", "LDDU_TWR", "LDDU_APP", "LDSP_TWR", "LDSP_APP", "LDPL_TWR", "LDPL_APP", "LDRI_TWR", "LDZD_TWR", "LDZD_APP", "LDOS_TWR", "LDOS_APP", "LYBE_APP",
           "LYBE_TWR", "LYBE_GND", "LYTV_TWR", "LYPG_TWR", "LYPG_APP", "LYNI_TWR", "LYNI_APP", "LATI_APP", "LATI_TWR", "LATI_GND", "LWSK_TWR", "LWSK_APP", "LWSK_GND", "LWOH_TWR", "BKPR_TWR", "BKPR_APP", "LQSA_TWR", "LQSA_GND", "LQSA_APP", "LQMO_TWR", "LQMO_APP", "LQBK_TWR", "LQBK_APP", "LQTZ_TWR", "LQTZ_APP", "LYUZ_TWR", "LYUZ_APP", "LYKV_APP", "LYKV_TWR", "LDZO_T_CTR", "LJLA_T_CTR", "LYBA_T_CTR", 
           "LWSS_T_CTR", "LAAA_T_CTR", "LQSB_T_CTR", "LJLJ_T_TWR", "LJLJ_T_APP", "LJLJ_T_GND", "LJMB_T_TWR", "LJMB_T_APP", "LJPZ_T_TWR", "LJPZ_T_APP", "LDZA_T_APP", "LDZA_T_TWR", "LDZA_T_GND", "LDDU_T_TWR", "LDDU_T_APP", "LDSP_T_TWR", "LDSP_T_APP", "LDPL_T_TWR", "LDPL_T_APP", "LDRI_T_TWR", "LDZD_T_TWR", "LDZD_T_APP", "LDOS_T_TWR", "LDOS_T_APP", "LYBE_T_APP",
           "LYBE_T_TWR", "LYBE_T_GND", "LYTV_T_TWR", "LYPG_T_TWR", "LYPG_T_APP", "LYNI_T_TWR", "LYNI_T_APP", "LATI_T_APP", "LATI_T_TWR", "LATI_T_GND", "LWSK_T_TWR", "LWSK_T_APP", "LWSK_T_GND", "LWOH_T_TWR", "BKPR_T_TWR", "BKPR_T_APP", "LQSA_T_TWR", "LQSA_T_GND", "LQSA_T_APP", "LQMO_T_TWR", "LQMO_T_APP", "LQBK_T_TWR", "LQBK_T_APP", "LQTZ_T_TWR", "LQTZ_T_APP", "LYUZ_T_TWR", "LYUZ_T_APP", "LYKV_T_APP", "LYKV_T_TWR"]
    
    tree = ET.fromstring(requests.get('http://vatbook.euroutepro.com/xml2.php?fir=').text)
    #Testing what is displayed
    for atcs in tree.find('atcs'):
        callsign = atcs.find('callsign')
        name = atcs.find('name')
        time_start = atcs.find('time_start')
        time_end = atcs.find('time_end')
        if callsign is not None:
            print(f"{name.text} booked {callsign.text} from {time_start.text} to {time_end.text}")
    

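    【Discussion】:

    • Hmm... this solution doesn't work for the first URL. The first XML has no atcs node, only atc. And what is the long atc list for?
    • Yes, the solution doesn't work for the first URL, but it does for the second, which is the one I need: it provides more data and doesn't require a FIR, so I can collect data for multiple FIRs. I don't need all the data about ATCs or pilots, only some of it. The list drives some software that displays bookings in a calendar; I want to take that data and send it to my Discord server.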
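
The posted code defines the atc list but never uses it. A minimal sketch of how it might be used to keep only the bookings of interest is shown below. The sample XML, the shortened `WANTED` set, the placeholder names, and the helper name `wanted_bookings` are assumptions for illustration, not part of the original solution.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the long atc list; a set gives fast membership tests.
WANTED = {"LJLJ_TWR", "LJLJ_APP", "LDZA_TWR"}

# Hypothetical sample in the atcs/booking layout described in this thread.
SAMPLE = """<bookings>
  <atcs>
    <booking><callsign>LJLJ_TWR</callsign><name>Controller A</name></booking>
    <booking><callsign>EGLL_TWR</callsign><name>Controller B</name></booking>
    <booking><callsign>LDZA_TWR</callsign><name>Controller C</name></booking>
  </atcs>
</bookings>"""


def wanted_bookings(xml_text, wanted):
    """Return (name, callsign) pairs whose callsign is in `wanted`."""
    root = ET.fromstring(xml_text)
    result = []
    for booking in root.iterfind('./atcs/booking'):
        callsign = booking.findtext('callsign')
        if callsign in wanted:
            result.append((booking.findtext('name', default='?'), callsign))
    return result


for name, callsign in wanted_bookings(SAMPLE, WANTED):
    print(f"{name} booked {callsign}")
```

With an empty `fir=` parameter the feed presumably covers many regions, so filtering on the client side like this is one way to narrow it down to the listed positions.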