How to read the next element with find_all

[Question title]: How to read the next element with find_all
[Posted]: 2016-11-15 14:18:52
[Question description]:

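First of all, thank you for looking at my post. I have found many posts about how to read the next element with BS4, but they all deal with keyword-related problems.

Here is my problem: I am trying to scrape data from .txt files, and the HTML is built so that different variables sit in the same kind of markup.

For example, here is one of the variables I want to extract:

(ignore the encode/decode part)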

    number= bs.find_all('span', class_='grid_1 prefix_1 suffix_1 data')[0].get_text().encode('ascii', 'ignore').decode(
    'ascii')

It works fine, but the next variable I want to extract appears right after number and has exactly the same HTML structure. So when I run

    Local= bs.find_all('span', class_='grid_1 prefix_1 suffix_1 data')[0].get_text().encode('ascii', 'ignore').decode(
    'ascii')
    number= bs.find_all('span', class_='grid_1 prefix_1 suffix_1 data')[0].get_text().encode('ascii', 'ignore').decode(
    'ascii')

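it gives me the same information for both variables. As far as I can tell, BS4 stops at the first element it encounters that matches what was passed to find_all.

After reading the Beautiful Soup documentation, I tried using the find_next command to get the data for the second element. When I run: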

    Local= bs.find_all('span', class_='grid_1 prefix_1 suffix_1 data')[0].find_all_next().encode('ascii', 'ignore').decode(
    'ascii')

I get the following Python error: AttributeError: 'ResultSet' object has no attribute

When I try to run the find_next command on its own:

    Local= bs.find_next('span', class_='grid_1 prefix_1 suffix_1 data')[0].encode('ascii', 'ignore').decode(
    'ascii')

I get the following Python error: TypeError: 'NoneType' object has no attribute '__getitem__'

My question is: "How do I correctly apply the find_next command to find_all?"

[Question discussion]:

Tags: python-2.7 beautifulsoup

[Solution 1]:

find_all() returns every span tag that matches the given class argument: class_='grid_1 prefix_1 suffix_1 data'

So there is no "next" element to find. You already have it in the returned list.
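
To make that concrete, here is a minimal sketch. The sample HTML and the values in it are made up for illustration, since the asker's actual files are not shown. It assumes the two values live in consecutive spans with an identical class attribute. The second value can be taken either by indexing the list that find_all() returns, or by calling find_next() on the first Tag rather than on the soup or on the result list.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: two values share exactly the same class attribute.
html = """
<div>
  <span class="grid_1 prefix_1 suffix_1 data">12345</span>
  <span class="grid_1 prefix_1 suffix_1 data">Paris</span>
</div>
"""

bs = BeautifulSoup(html, "html.parser")
target_class = "grid_1 prefix_1 suffix_1 data"

# find_all() returns a list-like ResultSet holding every matching span.
spans = bs.find_all("span", class_=target_class)

# Index 0 is the first match, index 1 the second, and so on.
number = spans[0].get_text()
local = spans[1].get_text()

# Alternative: call find_next() on a single Tag to get the next matching
# element that appears after it in the document.
local_via_next = spans[0].find_next("span", class_=target_class).get_text()

print(number)
print(local)
print(local_via_next)
```

Using [0] for both variables is why the question's code returned the same text twice. find_next() returns a single Tag or None, so it should not be indexed with [0].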

Try looping over the results from find_all():

    import re

    spans = bs.find_all('span', class_='grid_1 prefix_1 suffix_1 data')
    for span in spans:
        # Remove runs of two or more spaces/newlines from each span's text
        sub_text = re.sub(r'[\ \n\r]{2,}', '', span.get_text())
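
A self-contained version of the loop above might look like the following sketch. The sample HTML is hypothetical, and the extra .strip() call is an addition not present in the answer: the regular expression only removes runs of two or more whitespace characters, so a single trailing newline would otherwise survive.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup with padding whitespace inside each span.
html = """
<span class="grid_1 prefix_1 suffix_1 data">
    12345
</span>
<span class="grid_1 prefix_1 suffix_1 data">
    Paris
</span>
"""

bs = BeautifulSoup(html, "html.parser")

values = []
for span in bs.find_all("span", class_="grid_1 prefix_1 suffix_1 data"):
    # Remove runs of two or more spaces/newlines, as in the answer's regex.
    text = re.sub(r"[ \n\r]{2,}", "", span.get_text())
    # Assumption: also strip any single leftover whitespace character.
    values.append(text.strip())

# Unpack the collected texts into the variables named in the question.
number, local = values
print(number)
print(local)
```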

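[Discussion]:

Thank you for your answer. I now understand why those errors occurred, but when I run the code you suggested I get the following error: IndentationError: unexpected indent
Check your whitespace formatting: if you cut and pasted the code, there may be a mix of spaces and tabs.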
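
One rough way to check pasted code for stray tabs is sketched below. The helper name lines_with_tab_indent is made up for this example and is not part of any library. It only reports lines whose leading whitespace contains a tab character. It does not catch a top-level line that was pasted with extra leading spaces, which can also trigger "unexpected indent".

```python
def lines_with_tab_indent(source):
    """Return 1-based numbers of lines whose indentation contains a tab."""
    flagged = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        # Leading whitespace is whatever lstrip() would remove from the left.
        indent = line[:len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            flagged.append(lineno)
    return flagged


# Hypothetical pasted snippet: line 2 is indented with a tab, line 3 with spaces.
sample = "for span in spans:\n\tprint(span)\n    print('ok')\n"
print(lines_with_tab_indent(sample))
```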