BeautifulSoup returning a TypeError: object of 'NoneType' has no len()

【Question title】: BeautifulSoup returning a TypeError: object of 'NoneType' has no len()
【Posted】: 2020-11-06 10:25:49
【Question】:

I am using BeautifulSoup to scrape a page and collect all of its divs into a list, but it gives me this error:

Traceback (most recent call last):
  File "C:\Users\intel\Desktop\One page\test.py", line 16, in <module>
    soup = BeautifulSoup(div.html,'html5lib')
  File "C:\Users\intel\AppData\Local\Programs\Python\Python38\lib\site-packages\bs4\__init__.py", line 287, in __init__
    elif len(markup) <= 256 and (
TypeError: object of type 'NoneType' has no len()

Here is my code:

from bs4 import BeautifulSoup
import requests as req

resp = req.get('https://medium.com/@daranept27')

x = resp.text

soup = BeautifulSoup(x, "lxml")
 
divs = soup.find_all("div")
#print(divs)

lst = []

for div in divs:
    soup = BeautifulSoup(div.html,'html5lib')
    div_tag = soup.find()
    try:
        title = div_tag.section.div.h1.a['href']
        if title not in lst: lst.append(title)
    except:
        pass

print("\n".join(lst))

Tags: python python-3.x web-scraping beautifulsoup python-requests

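【Solution 1】:

div.html does not return the markup of the div. Attribute access on a Tag looks up a child tag with that name, so div.html searches for a nested <html> element, finds none, and evaluates to None. BeautifulSoup then fails when it calls len() on that None. Convert the div to a string with str(div) instead. The full code: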

from bs4 import BeautifulSoup
import requests as req

resp = req.get('https://medium.com/@daranept27')

x = resp.text

soup = BeautifulSoup(x, "lxml")

divs = soup.find_all("div")
# print(divs)

lst = []

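for div in divs:
    # str(div) serializes the tag to markup; div.html would be None here
    soup = BeautifulSoup(str(div), 'html5lib')
    div_tag = soup.find()
    try:
        title = div_tag.section.div.h1.a['href']
        if title not in lst:
            lst.append(title)
    except (AttributeError, KeyError, TypeError):
        # This div has no section > div > h1 > a[href] chain; skip it
        pass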

print("\n".join(lst))

Output:

/read-rosy/if-the-whole-world-is-compelled-to-forget-everything-cde200c0ad98
/wordsmith-library/seven-days-between-life-and-death-dffb639fb245
/an-idea/have-you-ever-encountered-a-fake-friend-if-so-try-these-simple-tips-to-overcome-it-d8473d755ab8
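The cause of the error can be reproduced offline. The sketch below is only an illustration: SAMPLE_HTML is made-up markup standing in for the downloaded page, the built-in "html.parser" is used so no extra parser is needed, and collect_links is a hypothetical helper. It also shows one possible alternative to re-parsing every div, using a CSS selector through select(). The selector is an approximation of the section.div.h1.a chain and may match more links than the loop above, which takes only the first match per div.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
SAMPLE_HTML = """
<div><section><div><h1><a href="/first-post">First</a></h1></div></section></div>
<div><p>No link here</p></div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
first_div = soup.find("div")

# Attribute access on a Tag is shorthand for find(): first_div.html searches
# for a nested <html> tag, finds none, and returns None.
print(first_div.html)

# str(tag) serializes the tag back to markup, which BeautifulSoup accepts.
markup = str(first_div)
print(type(markup).__name__)


def collect_links(page_soup):
    """Collect unique hrefs from section ... div ... h1 ... a chains.

    Works on the already parsed page, so no div is parsed a second time.
    """
    links = []
    for a in page_soup.select("section div h1 a[href]"):
        href = a["href"]
        if href not in links:
            links.append(href)
    return links


links = collect_links(soup)
print("\n".join(links))
```

On the real page, collect_links(soup) would be called on the soup built from resp.text. Whether it returns the same three links as the output above depends on the page structure and is not verified here.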