Unable to get the anchor tag using BeautifulSoup

【Question title】: Unable to get the anchor tag using beautifulsoup
【Posted】: 2020-09-04 23:18:52
【Question】:

I want to get the names and links from the list of anchor tags inside a section, but I am unable to retrieve them.

URL: https://www.snopes.com/collections/new-coronavirus-collection/

category=[]
url=[]
for ul in soup.findAll('a',{"class":"collected-list"}):
    if ul is not None:
        category.append(ul.get_text())
    else:
        category.append("")
    links = ul.findAll('a')
    if links is not None:
        for a in links:
            url.append(a['href'])

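Earlier I was able to get the list and the URLs, but the site structure has since changed and my code no longer works. The expected output looks like this: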

【Comments】:

Tags: html python-3.x beautifulsoup

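【Solution 1】:

It looks like the a tags of interest now have the class collected-item rather than collected-list (which is now the class of the enclosing section). You can search for all a tags with the class collected-item, then find the h5 tag with the class title under the same anchor to get the title text. After stripping the common prefix, that text appears to contain the category you describe in your expected output.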

from bs4 import BeautifulSoup
import requests

source = requests.get('https://www.snopes.com/collections/new-coronavirus-collection/').text
soup = BeautifulSoup(source, 'lxml')

category = []
url = []

# Each collection is an <a class="collected-item"> wrapping an <h5 class="title">.
for a in soup.find_all('a', {"class": "collected-item"}):
    h5 = a.find('h5', {"class": "title"})
    if h5 is not None:  # skip anchors that have no title heading
        title = h5.get_text()
        # Drop the prefix shared by every title to leave only the category name.
        title_short = title.replace("The Coronavirus Collection: ", "")
        category.append(title_short)
        url.append(a['href'])

for c, u in zip(category, url):
    print(c, u)

Origins and Spread https://www.snopes.com/collections/coronavirus-origins-treatments/?collection-id=238235
Prevention and Treatments https://www.snopes.com/collections/coronavirus-collection-prevention-treatments/?collection-id=238235
Prevention and Treatments II https://www.snopes.com/collections/coronavirus-collection-prevention-treatments-2/?collection-id=238235
International Response https://www.snopes.com/collections/coronavirus-international-rumors/?collection-id=238235
US Government Response https://www.snopes.com/collections/coronavirus-government-role/?collection-id=238235
Trump and the Pandemic https://www.snopes.com/collections/coronavirus-collection-trump/?collection-id=238235
Trump and the Pandemic II https://www.snopes.com/collections/coronavirus-collection-trump-2/?collection-id=238235
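Because the live page can change again, it may help to check the selector logic offline against a small HTML fragment. The sketch below is only an illustration: SAMPLE_HTML is a made-up fragment that imitates the structure described above (a section with class collected-list containing a.collected-item anchors, each wrapping an h5.title), the example.com URLs are placeholders, and extract_collections is a hypothetical helper name. It uses CSS selectors and the built-in html.parser so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment imitating the structure described in the answer;
# the real page markup may differ.
SAMPLE_HTML = """
<section class="collected-list">
  <a class="collected-item" href="https://example.com/origins/">
    <h5 class="title">The Coronavirus Collection: Origins and Spread</h5>
  </a>
  <a class="collected-item" href="https://example.com/prevention/">
    <h5 class="title">The Coronavirus Collection: Prevention and Treatments</h5>
  </a>
  <a class="collected-item" href="https://example.com/untitled/"></a>
</section>
"""

PREFIX = "The Coronavirus Collection: "


def extract_collections(html):
    """Return (category, url) pairs for each a.collected-item that has an h5.title."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for anchor in soup.select("a.collected-item[href]"):
        heading = anchor.select_one("h5.title")
        if heading is None:
            continue  # skip anchors without a title instead of raising
        title = heading.get_text(strip=True)
        if title.startswith(PREFIX):
            title = title[len(PREFIX):]  # keep only the category name
        pairs.append((title, anchor["href"]))
    return pairs


for category, url in extract_collections(SAMPLE_HTML):
    print(category, url)
```

Returning pairs rather than filling two parallel lists keeps each category tied to its URL, and the None check means an anchor without a title is skipped rather than raising AttributeError.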

【Comments】: