【Question Title】: How to parse information between {} on a web page using BeautifulSoup
【Posted】: 2021-03-02 00:00:34
【Question】:

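I am scraping sports odds from a web page, as shown below, and I get the result from a find call. Calling .get_text() on it returns -110, which is fine.

What if I want any of the numbers inside the {}? How would I get those values?

I intentionally removed the opening angle bracket from the div tag below so that it would display.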

results = soup.find('div', attrs={'class':'op-item spread-price'})

print(results)

div class="op-item spread-price" data-op-info='{"fullgame":"-110","firsthalf":"-121","secondhalf":"-115","firstquarter":"-109","secondquarter":"","thirdquarter":"","fourthquarter":""}' data-op-overprice='{"fullgame":"-110","firsthalf":"-109","secondhalf":"-122","firstquarter":"-103","secondquarter":"","thirdquarter":"","fourthquarter":""}'>-110</div


【Question Comments】:

    Tags: python web-scraping beautifulsoup jupyter-notebook


    【Solution 1】:

    Long story short, this should illustrate what to do:

    from bs4 import BeautifulSoup

    html_doc = """<div class="op-item spread-price" data-op-info='{"fullgame":"-110","firsthalf":"-121","secondhalf":"-115","firstquarter":"-109","secondquarter":"","thirdquarter":"","fourthquarter":""}' data-op-overprice='{"fullgame":"-110","firsthalf":"-109","secondhalf":"-122","firstquarter":"-103","secondquarter":"","thirdquarter":"","fourthquarter":""}'>-110</div>"""

    # Name the parser explicitly to avoid bs4's "no parser specified" warning
    bs = BeautifulSoup(html_doc, 'html.parser')
    data = bs.find('div')

    # Tag attributes are read like dictionary keys; each value is a JSON string
    print(data['data-op-info'])
    print(data['data-op-overprice'])

    Based on the information you provided, with your own variable that would be:

    print(results['data-op-info'])
    print(results['data-op-overprice'])
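
If the div or one of its attributes might be missing on some pages, a more defensive lookup may be useful. This is a minimal sketch, assuming markup like that in the question (trimmed here to two keys); the helper name `get_attr` is hypothetical and not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the div from the question, used only for illustration
html_doc = """<div class="op-item spread-price" data-op-info='{"fullgame":"-110","firsthalf":"-121"}'>-110</div>"""

soup = BeautifulSoup(html_doc, "html.parser")


def get_attr(soup, attr_name):
    """Hypothetical helper: return the attribute of the first matching div, or None."""
    tag = soup.find("div", attrs={"class": "op-item spread-price"})
    if tag is None:
        # No matching div on the page
        return None
    # Tag.get returns None instead of raising KeyError when the attribute is absent
    return tag.get(attr_name)


info = get_attr(soup, "data-op-info")
missing = get_attr(soup, "data-op-missing")  # attribute that does not exist

print(info)
print(missing)
```

Indexing with `tag['name']` raises `KeyError` when the attribute is absent, so `tag.get('name')` is the gentler choice when the page layout is not guaranteed.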
    

    Based on your comment, you can replace the prints with:

    import json
    # The attribute value is a JSON string; json.loads turns it into a dict
    for k, v in json.loads(results['data-op-info']).items():
        print(k, v)
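
Going one step further than printing, the parsed values can be kept in a dictionary for later use. The sketch below assumes every non-empty value is an integer string, which holds for the sample in the question but may not hold for every odds format; `parse_odds` is a hypothetical helper name.

```python
import json
from bs4 import BeautifulSoup

html_doc = """<div class="op-item spread-price" data-op-info='{"fullgame":"-110","firsthalf":"-121","secondhalf":"-115","firstquarter":"-109","secondquarter":"","thirdquarter":"","fourthquarter":""}'>-110</div>"""

soup = BeautifulSoup(html_doc, "html.parser")
results = soup.find("div", attrs={"class": "op-item spread-price"})


def parse_odds(raw_json):
    """Hypothetical helper: decode the JSON attribute and drop empty entries."""
    odds = json.loads(raw_json)
    # Assumption: non-empty values are integer strings such as "-110"
    return {period: int(price) for period, price in odds.items() if price}


op_info = parse_odds(results["data-op-info"])

for period, price in op_info.items():
    print(period, price)
```

Empty strings such as the `secondquarter` entry are skipped here; keeping them as `None` instead would be an equally reasonable choice if the missing periods matter downstream.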
    

    Hope that helps, let us know.

    【Comments】:

    • print(results['data-op-info']) {"fullgame":"-110","firsthalf":"-121","secondhalf":"-115","firstquarter":"-109","secondquarter":"","thirdquarter":"","fourthquarter":""} print(results['data-op-overprice']) {"fullgame":"-110","firsthalf":"-109","secondhalf":"-122","firstquarter":"-103","secondquarter":"","thirdquarter":"","fourthquarter":""} This is what I get after using the print statements provided. How do I get from there to something like this: fullgame -110 firsthalf -121 secondhalf -115 firstquarter -109 secondquarter "" thirdquarter "" fourthquarter ""
    • @Rick Sweeney - take a look at the last part of my answer, which I appended.
    • @RickSweeney There are many ways, but I think converting it to a dictionary and looping over it is a common approach.
    • for k,v in json.loads(data['data-op-info']).items(): print(k,v) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) in ----> 1 for k,v in json.loads(data['data-op-info']).items(): 2 print(k,v) TypeError: list indices must be integers or slices, not str
    • @RickSweeney Did you import json? from bs4 import BeautifulSoup html_doc = """<div class="op-item spread-price" data-op-info='{"fullgame":"-110","firsthalf":"-121","secondhalf":"-115","firstquarter":"-109","secondquarter":"","thirdquarter":"","fourthquarter":""}' data-op-overprice='{"fullgame":"-110","firsthalf":"-109","secondhalf":"-122","firstquarter":"-103","secondquarter":"","thirdquarter":"","fourthquarter":""}'>-110</div>""" bs = BeautifulSoup(html_doc) data = bs.find('div') import json for k,v in json.loads(data['data-op-info']).items(): print(k,v)
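
The `TypeError: list indices must be integers or slices, not str` in the comment above suggests that `data` was a list rather than a single tag at that point. The comment does not show how `data` was assigned, so the cause is an assumption, but one common way to end up there is `find_all`, which returns a list-like collection of tags. In that case each tag has to be indexed individually. A minimal sketch; the second div and its values are made up for illustration:

```python
import json
from bs4 import BeautifulSoup

# Two divs: the first is trimmed from the question, the second is invented sample data
html_doc = """
<div class="op-item spread-price" data-op-info='{"fullgame":"-110","firsthalf":"-121"}'>-110</div>
<div class="op-item spread-price" data-op-info='{"fullgame":"-105","firsthalf":""}'>-105</div>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# find_all returns a list-like ResultSet, so rows['data-op-info'] would raise TypeError
rows = soup.find_all("div", attrs={"class": "op-item spread-price"})

all_odds = []
for row in rows:
    # Index each individual tag, not the collection
    all_odds.append(json.loads(row["data-op-info"]))

for odds in all_odds:
    for k, v in odds.items():
        print(k, v)
```

If only the first match is needed, `soup.find(...)` returns a single tag (or `None`), which can be indexed by attribute name directly.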