Web Scraping Div Class Not Found: Answers

【Question Title】: Web Scraping Div Class Not Found
【Posted】: 2026-02-19 19:35:01
【Question】:

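I am trying to scrape information from Chambers.com, specifically, for this example, https://chambers.com/law-firm/allen-overy-llp-global-2:7. The information I want is the list of departments and their bands under the "UK" section of the "Ranked Departments" tab.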

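The problem I am having is with Beautiful Soup's find_all, and I assume the parser is the cause. I want to find all <div class="mb-3"> elements. My current code is: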

import requests
from bs4 import BeautifulSoup
url_to_scrape = 'https://chambers.com/law-firm/allen-overy-llp-global-2:7'

plain_html_text = requests.get(url_to_scrape)

soup = BeautifulSoup(plain_html_text.content, "lxml")

search = soup.find_all("div", {"class": "mb-3"})

print(search)

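and the list comes back empty. I took the class name from the HTML shown in the browser's inspector.

I tried pasting the HTML directly into the Python file, and I also tried html.parser, but still nothing is returned.

Any help would be much appreciated, even if it is just advice on where to look.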

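【Comments】:

One of the biggest problems in web scraping is client-side rendering. Are you sure there isn't some JavaScript that loads this information after the document has loaded in the web browser? You may need to use a library like Selenium. See for example the article here.
Thank you for your comment, Caleb. I don't know whether JavaScript is loading this information; is there a way to check that? I'll have a look at the article you attached. Thanks, Forrest.
I would look at plain_html_text.content and build the search query from what is actually there.
OK, they are using Angular, which is usually rendered client-side. I also requested the page with curl, and the data you are looking for was not returned, so you will need a tool that can scrape client-rendered sites. Hope the link helps, and good luck.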
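
One way to check the client-side rendering theory raised in these comments is to count the target elements in the HTML the server actually sends, before any JavaScript runs. The sketch below uses only the standard library. `ClassCounter`, `count_in_static_html`, and both sample HTML strings are hypothetical stand-ins for illustration, not the real Chambers markup.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags of a given name that carry a given CSS class."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag != self.tag:
            return
        # The class attribute may hold several space-separated tokens.
        classes = (dict(attrs).get("class") or "").split()
        if self.css_class in classes:
            self.count += 1


def count_in_static_html(html, tag, css_class):
    """Return how many <tag> elements in `html` have `css_class`."""
    parser = ClassCounter(tag, css_class)
    parser.feed(html)
    return parser.count


# Assumed shape of what a client-rendered page sends before JavaScript runs.
static_html = (
    "<html><body><app-root></app-root>"
    "<script src='main.js'></script></body></html>"
)
# Assumed shape of the DOM the browser inspector shows after rendering.
rendered_html = (
    "<html><body><app-root>"
    "<div class='mb-3 row'>Band 1</div>"
    "</app-root></body></html>"
)

print(count_in_static_html(static_html, "div", "mb-3"))
print(count_in_static_html(rendered_html, "div", "mb-3"))
```

Passing `requests.get(url).text` to the same function would be expected to report zero matches for a page like this one, which is consistent with the curl test described above: the inspector shows the rendered DOM, while requests only sees the initial response.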

Tags: python web-scraping beautifulsoup

【Solution 1】:

Look at the page source and you will see that no such element exists in it. Scrape the API instead:

import requests

url = 'https://api.chambers.com/api/organisations/7/ranked-departments?publicationTypeGroupId=2'
response = requests.get(url).json()
for location in response['locations']:
    if location['description'] == 'UK':
        for info in location['rankedEntities']:
            print(info["displayName"], info['rankings'][0]['rankingDescription'], sep="\n", end="\n\n")

Prints:

Banking & Finance: Borrowers
Band 1

Banking & Finance: Lenders
Band 1

Banking & Finance: Sponsors
Band 2

Capital Markets: Debt
Band 1

Capital Markets: Derivatives
Band 1

Capital Markets: Equity
Band 1

Capital Markets: Securitisation
Band 1

Capital Markets: Structured Finance
Band 1

Competition Law
Band 2

Corporate M&A (International & Cross-Border)
Band 1

Dispute Resolution: International Arbitration
Band 2

Dispute Resolution: Litigation
Band 1

Disputes (International & Cross-Border)
Band 1

Employment
Band 2

Energy & Natural Resources: Oil & Gas
Band 1

Energy & Natural Resources: Power
Band 1

Energy & Natural Resources: Renewables & Alternative Energy
Band 1

Energy Sector (International & Cross-Border)
Band 1

Finance & Capital Markets (International & Cross-Border)
Band 1

Insurance: Mainly Policyholders
Band 1

Intellectual Property
Band 2

Intellectual Property: Patent Litigation
Band 1

Investigations & Enforcement (International & Cross-Border)
Band 2

Investment Funds & Asset Management (International & Cross-Border)
Band 2

Life Sciences & Pharmaceutical Sector (International & Cross-Border)
Band 2

Projects
Band 1

Restructuring/Insolvency
Band 1
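
The loop in this answer assumes that every key is present and that each entity has at least one ranking. A slightly more defensive sketch is shown below. The key names are taken from the answer's code; the function name `ranked_departments` and the `sample` payload are made up for illustration and are not real API output.

```python
def ranked_departments(payload, region="UK"):
    """Return (department, band) pairs for one region of the payload."""
    rows = []
    for location in payload.get("locations", []):
        if location.get("description") != region:
            continue
        for entity in location.get("rankedEntities", []):
            rankings = entity.get("rankings") or []
            # Use None when an entity carries no ranking at all.
            band = rankings[0].get("rankingDescription") if rankings else None
            rows.append((entity.get("displayName"), band))
    return rows


# Hypothetical payload that mirrors only the keys used in the answer.
sample = {
    "locations": [
        {
            "description": "UK",
            "rankedEntities": [
                {
                    "displayName": "Employment",
                    "rankings": [{"rankingDescription": "Band 2"}],
                },
                {"displayName": "Projects", "rankings": []},
            ],
        },
        {
            "description": "USA",
            "rankedEntities": [
                {
                    "displayName": "Antitrust",
                    "rankings": [{"rankingDescription": "Band 3"}],
                },
            ],
        },
    ]
}

for name, band in ranked_departments(sample):
    print(name, band, sep="\n", end="\n\n")
```

Returning a list of tuples rather than printing inside the loop makes it easier to write the result to a CSV file or a DataFrame later.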

【Discussion】:

【Solution 2】:

Instead of writing soup.find_all("div", {"class": "mb-3"}), use

soup.find_all("div", class_="mb-3")

【Discussion】:

Actually, this makes no difference. You can check for yourself: search is still an empty list.
OK, I'll try it on my system.
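
The point made in this discussion can be checked offline. The sketch below, which assumes Beautiful Soup is installed, runs both spellings of the query against two hypothetical HTML snippets (neither is the real Chambers markup) and compares the results.

```python
from bs4 import BeautifulSoup

# Hypothetical markup containing the target class.
with_divs = (
    '<div class="mb-3">A</div>'
    '<div class="mb-3 row">B</div>'
    '<div class="mb-30">C</div>'
)
# Hypothetical markup without it, like a page that is rendered client-side.
without_divs = "<app-root></app-root>"

for html in (with_divs, without_divs):
    soup = BeautifulSoup(html, "html.parser")
    by_dict = soup.find_all("div", {"class": "mb-3"})
    by_keyword = soup.find_all("div", class_="mb-3")
    # Both spellings select the same elements.
    print(len(by_dict), len(by_keyword))
```

Both forms match on individual class tokens, so they agree whether the elements are present or not. An empty result therefore points to the elements being absent from the downloaded HTML, not to the way the query is written.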