BeautifulSoup not getting the right class

【Question title】: BeautifulSoup not getting the right class
【Posted】: 2021-04-20 11:32:40
【Question】:

https://etherscan.io/address/0xCcE984c41630878b91E20c416dA3F308855E87E2

I want to scrape the href from the listbox next to the Token label.

I need to grab the href from the element with this class attribute:

class="link-hover d-flex justify-content-between align-items-center"

So here is my code:

import requests
from bs4 import BeautifulSoup

page = requests.get('https://etherscan.io/address/0xCcE984c41630878b91E20c416dA3F308855E87E2').text
html = BeautifulSoup(page, 'html.parser')

href = html.find(class_ = 'link-hover d-flex justify-content-between align-items-center')['href']

But the result is empty. Can anyone help me? I really need some help.

【Comments】:

Tags: python beautifulsoup python-requests

【Solution 1】:

I don't think you can do this with the requests library, because Cloudflare detects the automation.

>>> page = requests.get('https://etherscan.io/address/0xCcE984c41630878b91E20c416dA3F308855E87E2')
>>> page.status_code
403

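The HTTP 403 Forbidden client error status code means the server understood the request but refuses to authorize it. So the HTML that bs4 receives is not the real page. Instead of requests with bs4, try the selenium library.
Page title: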

>>> soup = BeautifulSoup(page.content, 'html.parser')
>>> soup.title
<title>Attention Required! | Cloudflare</title>
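
The two checks above can be folded into a small guard that runs before looking for the link. The following is only a minimal sketch: the function name `find_token_link` and the constant `LINK_SELECTOR` are hypothetical, and the sketch assumes that a title containing "Cloudflare" is enough to recognise the block page. It uses a CSS selector rather than the full class string, because `class_='a b c'` only matches when the attribute is written with exactly those classes in exactly that order.

```python
from bs4 import BeautifulSoup

# CSS selector built from the class list in the question; unlike the
# full class string, it matches regardless of class order.
LINK_SELECTOR = (
    ".link-hover.d-flex.justify-content-between.align-items-center[href]"
)


def find_token_link(html_text):
    """Return the href of the first matching element, or None if absent.

    Raises RuntimeError if the HTML looks like a Cloudflare block page
    (assumption: the block page has "Cloudflare" in its <title>).
    """
    soup = BeautifulSoup(html_text, "html.parser")

    # Detect the block page first, so a missing link is not mistaken
    # for a wrong selector.
    title = soup.title.get_text() if soup.title else ""
    if "Cloudflare" in title:
        raise RuntimeError(f"Blocked by Cloudflare: {title!r}")

    # select_one returns None when nothing matches, so guard before
    # reading the attribute instead of indexing with ['href'].
    link = soup.select_one(LINK_SELECTOR)
    return link.get("href") if link else None
```

Calling `find_token_link(page.text)` on the 403 response shown above would raise instead of silently returning nothing, which makes the real cause visible.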

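【Discussion】:

Is there a way to hide the selenium window so it does not look like anything is running? The reason I tried this with requests is that I don't want the Chrome interface to show up while it works.
You can use selenium's headless mode.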