NoneType error / no elements printed using BeautifulSoup for Python

【Question title】: NoneType error / no elements printed using BeautifulSoup for Python
【Posted】: 2021-03-10 21:59:33
【Question】:

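I am trying to use Python to compare two lists. One contains 1000 links that I scraped from a website. The other contains some words that may occur inside the links of the first list. Whenever that is the case, I want to get some output. I printed the first list, and it is populated correctly. For example, if a link is "https://steamcdn-a.swap.gg/apps/730/icons/econ/stickers/eslkatowice2015/counterlogic.f49adabd6052a558bff3fe09f5a09e0675737936.png" and my list contains the word "eslkatowice2015", I want to get output via the print() function. My code looks like this: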

page_source = driver.page_source

soup = BeautifulSoup(page_source, 'lxml')
Bot_Stickers = soup.find_all('img', class_='csi')

for sticker in Bot_Stickers:

    for i in StickerIDs:

        if i in sticker:

            print("found")
driver.close()

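The problem is that I get no output, which should not be possible: when I compare the lists by hand, the words in my word list clearly occur in the links of the other list. Whenever I try to fix it, I run into a NoneType error. The driver.page_source above comes from some Selenium code that I use to open the site and click a few JavaScript elements so that everything can be found. I hope it is more or less clear what I am trying to achieve.

Edit: the StickerIDs variable is the second list, the one containing the words I want to check for.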

【Comments】:

Tags: python selenium selenium-webdriver beautifulsoup

【Solution 1】:

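A NoneType error means you are probably getting a None somewhere, so it is safer to check the values you are working with before using them. Note that find_all itself returns an empty list rather than None when nothing matches, so an empty result is worth checking for as well.

It has been a while since I used BeautifulSoup, but if I remember correctly, find_all returns a list of Beautiful Soup tags that match the search criteria, not URLs. Before checking whether a tag contains a keyword, you need to get the href attribute from the tag.

Something like this: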

from bs4 import BeautifulSoup

# driver (Selenium WebDriver) and StickerIDs (list of keywords) are defined earlier
page_source = driver.page_source

soup = BeautifulSoup(page_source, 'lxml')
Bot_Stickers = soup.find_all('img', class_='csi')

if Bot_Stickers and StickerIDs:
    for sticker in Bot_Stickers:
        for i in StickerIDs:
            if i in sticker.get("href"):  # get href attribute of the img tag
                print("found")
else:
    # one of the two lists is empty or None; print both to see which
    print("Bot_Stickers:", Bot_Stickers)
    print("StickerIDs:", StickerIDs)

driver.close()
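
To see why the original `if i in sticker` check never prints anything, it can help to inspect what find_all actually hands back. The following is a minimal sketch, assuming a hypothetical `<img>` element that stores the link from the question in its `src` attribute (the real page's markup may differ). It uses Python's built-in "html.parser" instead of 'lxml' only so that it runs without an extra dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for driver.page_source (an assumption,
# not the real page); the URL is the example link from the question.
html = (
    '<img class="csi" src="https://steamcdn-a.swap.gg/apps/730/icons/econ/'
    'stickers/eslkatowice2015/counterlogic.'
    'f49adabd6052a558bff3fe09f5a09e0675737936.png">'
)

# "html.parser" ships with Python; the code above uses 'lxml'.
soup = BeautifulSoup(html, "html.parser")
tag = soup.find_all("img", class_="csi")[0]

tag_type = type(tag).__name__       # find_all yields Tag objects, not strings
in_tag = "eslkatowice2015" in tag   # membership on a Tag looks at its child nodes
href = tag.get("href")              # None, because this <img> has no href
src = tag.get("src")                # the attribute that holds the link here

print(tag_type, in_tag, href)
print("eslkatowice2015" in src)
```

Testing a keyword against the tag itself compares it with the tag's child nodes, and an `<img>` has none, so the check is quietly False. Testing it against an attribute that does not exist means testing it against None, which raises a TypeError.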

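【Comments】:

I have now tried it with the other script, as I had tried before, and I also get the following error: Traceback (most recent call last): File "C:\Users\timjo\Desktop\BOTSCRIPT\ Bot.py", line 27, in if i in sticker.get("href"): TypeError: argument of type 'NoneType' is not iterable. Not sure how to solve that.
@Timeler The error means something is trying to iterate over None, so check Bot_Stickers and StickerIDs before the for loop. Checking for values that could break your code is good, safe practice. Once you know which one is None, you should start debugging why you are getting a None instead of a list of tags. I have updated the answer.
If I try your edited solution, I get the same error as before. If I replace sticker.get("href") with sticker, I get no error but also no output, so either way it does not do what it should.
OK, that makes sense. It means the <img> tag has no attribute named href. You should inspect the HTML source and see which attribute holds the image link; it is probably something like src. Then change href to the name of that attribute and it should work.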
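
Putting that advice together, a version that tolerates tags without the attribute might look like the sketch below. The helper name `find_matching_links`, the default attribute `src`, and the sample markup are all assumptions made for illustration; check the real page to confirm which attribute carries the link. The built-in "html.parser" is used here instead of 'lxml' so the sketch runs without an extra dependency.

```python
from bs4 import BeautifulSoup


def find_matching_links(page_source, keywords, attr="src"):
    """Return (keyword, link) pairs for <img class="csi"> tags whose link contains a keyword.

    Hypothetical helper; attr is the attribute assumed to hold the image link.
    """
    soup = BeautifulSoup(page_source, "html.parser")
    matches = []
    for tag in soup.find_all("img", class_="csi"):
        link = tag.get(attr)
        if link is None:  # attribute missing on this tag, so skip it
            continue
        for keyword in keywords:
            if keyword in link:
                matches.append((keyword, link))
    return matches


# Made-up sample markup; with Selenium you would pass driver.page_source instead.
sample = (
    '<img class="csi" src="https://example.com/stickers/eslkatowice2015/a.png">'
    '<img class="csi">'
    '<img class="csi" src="https://example.com/stickers/other/b.png">'
)
result = find_matching_links(sample, ["eslkatowice2015", "cologne2014"])
print(result)
```

Returning the matches rather than printing "found" inside the loop makes it easier to see which keyword hit which link, and the explicit None check avoids the TypeError when a tag lacks the attribute.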