Problems with BeautifulSoup find_all: question and answers

【Question title】: Problems with BeautifulSoup find_all
【Posted】: 2020-12-16 02:31:58
【Question description】:

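I need to retrieve some ids from a site's HTML. Storing them in individual variables would not be hard, but I would like to use a list so they are easier to find and use.

With the following line, the terminal returns "TypeError: list indices must be integers or slices, not str":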

ids = site.find_all('p', class_="frase fr")['id']

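To be clear, soup.find_all on its own works fine for me, but it stops working once I add the square brackets at the end to specify which attribute to collect. That is the problem; how can I solve it?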

【Question comments】:

Please share the complete code you have tried so far.

```
from bs4 import BeautifulSoup
import requests
import wget
import webbrowser

site = requests.get('pensador.com/').content
site = BeautifulSoup(site, 'html.parser')
ids = site.find_all('p', class_="frase fr")['id']
print(ids)
```
That is all I have so far; I only started this project today.

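【Solution 1】:

The find_all method returns a list of elements, not a single element, so it cannot be indexed with an attribute name. If you want the id of each element, you have to iterate over the list and extract the attribute from each element.

Use this instead: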

ids = [p.get('id') for p in site.find_all('p', class_="frase fr")]

This gives you a list with the id of every tag found, including None for tags that have no id attribute.
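To see both the original error and the result of the list comprehension without hitting the network, here is a minimal sketch on a made-up HTML snippet. The markup and the id values are assumptions for illustration only, not the actual content of the site in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
html = """
<p class="frase fr" id="a1">first</p>
<p class="frase fr">no id here</p>
<p class="frase fr" id="b2">third</p>
"""
soup = BeautifulSoup(html, "html.parser")

tags = soup.find_all("p", class_="frase fr")

# find_all returns a list-like ResultSet, so indexing it with a string fails.
error = None
try:
    tags["id"]
except TypeError as exc:
    error = str(exc)
print(error)

# Iterating over the result and reading the attribute per tag works.
ids = [p.get("id") for p in tags]
print(ids)  # ['a1', None, 'b2']
```

Using p.get("id") rather than p["id"] matters here: the second form would raise KeyError on the tag that has no id.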

You can also filter out the None values like this:

ids = [p.get('id') for p in site.find_all('p', class_="frase fr") if p.get('id')]
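As a possible alternative, the filtering can be pushed into the search itself: passing True as an attribute value to find_all matches only tags that have that attribute. The sketch below again uses made-up markup and ids as stand-ins for the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
html = """
<p class="frase fr" id="a1">first</p>
<p class="frase fr">no id here</p>
<p class="frase fr" id="b2">third</p>
"""
soup = BeautifulSoup(html, "html.parser")

# id=True keeps only <p> tags that carry an id attribute,
# so p["id"] is safe and no None check is needed afterwards.
ids_filtered = [p["id"] for p in soup.find_all("p", class_="frase fr", id=True)]
print(ids_filtered)  # ['a1', 'b2']
```

One difference to keep in mind: the `if p.get('id')` version also drops tags whose id is an empty string, whereas id=True only requires the attribute to be present.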

【Comments】: