Loop through html tags: answers

【Question title】: Loop through html tags
【Posted】: 2020-05-01 13:42:12
【Question】:

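I am trying to scrape a web page using an HTML parser and BeautifulSoup. I want to get the text from certain <p> tags, but some of them have no text at all, and for those empty ones I get an AttributeError. I am trying the following code: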

content = elements.find("p",{"class":"Text"}).text #Where elements is a bs4 tag inside a for loop

After a few iterations, I get the following error:

AttributeError: 'NoneType' object has no attribute 'text'

Maybe I will have to try something like the following:

while True:
    content = elements.find("p",{"class":"Text"}).text
    if type(content)==None:
        content = 'None'

But the code above does not work.

【Question comments】:

Tags: python web-scraping beautifulsoup html-parsing

【Solution 1】:

find() returns None when no matching tag exists, so you must check that the element is not None before accessing its text attribute.

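# Place this inside the existing for loop, where elements is the current bs4 tag.
# (A bare `while True:` with no break would never terminate.)
elem = elements.find("p", {"class": "Text"})
if elem is not None:
    content = elem.text
else:
    content = 'None'  # Any static value you want to give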
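
To see the check in context, here is a minimal self-contained sketch. The sample markup, the use of <div> as the container tag, and the `contents` list are assumptions made up for illustration; the question does not show the real page or the surrounding loop.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: the second <div> has no <p class="Text">.
html = """
<div><p class="Text">first</p></div>
<div><span>no paragraph here</span></div>
<div><p class="Text">third</p></div>
"""

soup = BeautifulSoup(html, "html.parser")

contents = []
for elements in soup.find_all("div"):
    elem = elements.find("p", {"class": "Text"})
    # find() returns None when nothing matches, so check before reading .text
    contents.append(elem.text if elem is not None else "None")

print(contents)  # ['first', 'None', 'third']
```

The conditional expression is just a compact form of the if/else above. Any placeholder value can stand in for the string "None".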

【Comments】: