【Question title】: Unable to access span class while web-scraping using Beautifulsoup
【Posted】: 2021-12-02 17:13:31
【Question】:

I am trying to extract the player count data from this site: https://boardgamegeek.com/boardgame/174430/gloomhaven/stats.

from bs4 import BeautifulSoup as bs
import requests
url2 = "https://boardgamegeek.com/boardgame/174430/gloomhaven"
page3 = requests.get(url2)
s2 = bs(page3.content,"html.parser")
var2 = s2.find_all('span',{'class':'ng-scope ng-isolate-scope'})

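When I run this code, var2 is always an empty list. I even tried selecting the 'div' class that the 'span' belongs to, but I still get an empty list. Why is that?

Thanks in advance.

【Question comments】:

Tags: python web-scraping beautifulsoup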

【Solution 1】:

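The page content is loaded dynamically by JavaScript. If you disable JavaScript in your browser, you will see that the content on the page disappears. That is why var2 is an empty list: requests only downloads the initial HTML, and BeautifulSoup only parses what it is given, so the data is never there to find. You need a browser automation tool such as selenium to render the page first. Here I use selenium together with BeautifulSoup.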
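
To make the point concrete, here is a minimal sketch that needs no network access. Both HTML snippets are invented for illustration (they are not copied from boardgamegeek.com), and count_spans is a hypothetical helper name. It only shows that the same find_all call returns nothing on a template that has not been rendered yet and one match once the span exists.

```python
from bs4 import BeautifulSoup

# Both snippets are invented for illustration; they are not the site's real markup.
# What the server might send: an empty template that JavaScript fills in later.
SERVER_HTML = '<div id="app"><player-count></player-count></div>'
# What the browser's DOM might look like after the scripts have run.
RENDERED_HTML = (
    '<div id="app"><player-count>'
    '<span class="ng-scope ng-isolate-scope">1–4</span>'
    '</player-count></div>'
)

def count_spans(html, class_string="ng-scope ng-isolate-scope"):
    """Return how many <span> tags carry exactly this class attribute string."""
    soup = BeautifulSoup(html, "html.parser")
    return len(soup.find_all("span", {"class": class_string}))

before = count_spans(SERVER_HTML)    # stands in for what requests.get returns
after = count_spans(RENDERED_HTML)   # stands in for driver.page_source
print(before, after)  # prints: 0 1
```

A quick way to check a real page is to search page3.text for the class name: if it is absent from the raw response, no BeautifulSoup selector will find it.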

Because 'class':'ng-scope ng-isolate-scope' is used here to pick out a single element, call the find method instead of find_all.
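
The difference between find and find_all, and how a two-class string is matched, can be sketched as follows. The markup is made up for illustration and the variable names are assumptions. Note that find returns None when nothing matches, so reading .text without a check raises an AttributeError.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration; the real page's HTML may differ.
html = """
<div>
  <span class="ng-scope ng-isolate-scope">1–4</span>
  <span class="ng-isolate-scope ng-scope">2</span>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# find_all returns a list (possibly empty); find returns the first match or None.
# A string containing two classes is compared against the exact attribute value,
# so only the span whose classes appear in this order matches.
exact = soup.find_all("span", {"class": "ng-scope ng-isolate-scope"})
first = soup.find("span", {"class": "ng-scope ng-isolate-scope"})

# A CSS selector matches both classes regardless of their order in the attribute.
any_order = soup.select("span.ng-scope.ng-isolate-scope")

# Guard against None before reading .text.
missing = soup.find("span", {"class": "no-such-class"})
missing_text = missing.text if missing is not None else None

print(len(exact), first.text, len(any_order), missing_text)
```

If the order of the classes on the live page is not guaranteed, the select form is the more forgiving choice.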

Script

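from bs4 import BeautifulSoup
import time
from selenium import webdriver
from selenium.webdriver.chrome.service import Service

# Selenium 4 takes the chromedriver path through a Service object
driver = webdriver.Chrome(service=Service('chromedriver.exe'))
driver.maximize_window()

url = 'https://boardgamegeek.com/boardgame/174430/gloomhaven/stats'
driver.get(url)
time.sleep(5)  # crude fixed wait so the JavaScript can finish rendering

# Parse the rendered DOM rather than the raw server response (needs lxml installed)
soup = BeautifulSoup(driver.page_source, 'lxml')
driver.quit()  # close the browser once the page source has been captured
var2 = soup.find('span',{'class':'ng-scope ng-isolate-scope'}).text
print(var2)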

Output

1–4

【Comments】: