[Posted]: 2020-11-29 13:54:47
[Question]:
I'm still learning how to use BeautifulSoup, and I'm trying to scrape some information from an NFL website with Python 3 and BeautifulSoup. I parse the page with lxml:
soup = BeautifulSoup(source, 'lxml')
Then I find all of the matchups:
matchups = soup.findAll("div", {"class": "cmg_game_data cmg_matchup_game_box"})
At this point, each matchup in the matchups list contains a lot of data, like this:
<div class="cmg_game_data cmg_matchup_game_box" data-away-conference="American Football Conference" data-away-team-city-search="Houston" data-away-team-fullname-search="Houston" data-away-team-nickname-search="Texans" data-away-team-shortname-search="HOU" data-competition-type="Week 1" data-conference="American Football Conference" data-event-id="80767" data-following="false" data-game-date="2020-09-10 20:20:00" data-game-odd="-10" data-game-total="54.5" data-handicap-difference="0" data-home-conference="American Football Conference" data-home-team-city-search="Kansas City" data-home-team-fullname-search="Kansas City" data-home-team-nickname-search="Chiefs" data-home-team-shortname-search="KC" data-index="0" data-last-update="2020-05-07T22:50:26.5700000" data-link="/sport/football/nfl/matchup/201993" data-sdi-event-id="/sport/football/competition:80767" data-top-twenty-five="false">
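I want to extract specifically these inner classes (tags? attributes?), such as data-away-conference and data-game-odd. How do I parse the next level down to pull these items out? I tried: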
for matchup in matchups:
    awayconference = matchup.find("data-away-conference")
But this returns None.
[Discussion]:
Tags: python beautifulsoup
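For context, a likely explanation: `find()` looks for a child tag with the given name, and the div has no child tag called `data-away-conference`, so it returns None. Names like `data-away-conference` and `data-game-odd` are attributes of the div itself, and BeautifulSoup exposes a tag's attributes through dictionary-style access. The sketch below is a minimal illustration, assuming a trimmed copy of the div from the question as input. It uses the built-in `html.parser` so that it runs without lxml installed, and the attribute name `data-not-present` is hypothetical, included only to show what a missing attribute does.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the div from the question, keeping only a few attributes.
source = """
<div class="cmg_game_data cmg_matchup_game_box"
     data-away-conference="American Football Conference"
     data-away-team-shortname-search="HOU"
     data-game-odd="-10"
     data-game-total="54.5"></div>
"""

# "html.parser" ships with Python; the question's 'lxml' parser
# should give the same result for this markup.
soup = BeautifulSoup(source, "html.parser")
matchups = soup.find_all("div", {"class": "cmg_game_data cmg_matchup_game_box"})

results = []
for matchup in matchups:
    # Square brackets read an attribute and raise KeyError if it is absent.
    away_conference = matchup["data-away-conference"]
    # .get() reads an attribute and returns None if it is absent.
    game_odd = matchup.get("data-game-odd")
    # Hypothetical attribute name, used only to show the missing case.
    missing = matchup.get("data-not-present")
    results.append((away_conference, game_odd, missing))

print(results)
```

Attribute values come back as strings, so a value such as `data-game-odd` would need `int()` or `float()` before doing arithmetic on it. The full set of attributes for a tag is also available as a dictionary through `matchup.attrs`.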