【Posted】: 2020-05-01 17:59:04
【Question】:
I'm trying to run this code to scrape the specific websites/RSS feeds listed below, but I keep getting:
Traceback (most recent call last):
  File "C:\Users\Jeanne\Desktop\PYPDIT\pyscape.py", line 28, in <module>
    transcripts = [url_to_transcript(u) for u in urls]
  File "C:\Users\Jeanne\Desktop\PYPDIT\pyscape.py", line 28, in <listcomp>
    transcripts = [url_to_transcript(u) for u in urls]
  File "C:\Users\Jeanne\Desktop\PYPDIT\pyscape.py", line 17, in url_to_transcript
    text = [p.text for p in soup.find(class_="itemcontent").find_all('p')]
AttributeError: 'NoneType' object has no attribute 'find_all'
Please advise.
import requests
from bs4 import BeautifulSoup
import pickle

def url_to_transcript(url):
    page = requests.get(url).text
    soup = BeautifulSoup(page, "lxml")
    text = [p.text for p in soup.find(class_="itemcontent").find_all('p')]
    print(url)
    return text

# URLs of transcripts in scope
urls = ['http://feeds.nos.nl/nosnieuwstech',
        'http://feeds.nos.nl/nosnieuwsalgemeen']

transcripts = [url_to_transcript(u) for u in urls]
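
For reference, the AttributeError means `soup.find(class_="itemcontent")` returned `None`, i.e. no element with that class was found in the downloaded markup. This is a minimal sketch of guarding against that case, not a confirmed fix for these feeds. The helper name `extract_paragraphs`, the sample strings, and the use of the built-in `"html.parser"` in place of `"lxml"` are all assumptions made for illustration, and no network request is made.

```python
from bs4 import BeautifulSoup


def extract_paragraphs(markup, css_class="itemcontent"):
    """Return the text of every <p> inside the first element with css_class.

    Returns an empty list when no such element exists, instead of raising.
    (Hypothetical helper; the name and default are illustrative only.)
    """
    # "html.parser" ships with Python; the original code uses "lxml".
    soup = BeautifulSoup(markup, "html.parser")
    container = soup.find(class_=css_class)
    if container is None:
        # find() returns None when nothing matches; calling find_all on
        # that None is what raises the AttributeError in the traceback.
        return []
    return [p.get_text(strip=True) for p in container.find_all("p")]


# Hypothetical sample markup, not taken from the real feeds.
with_match = '<div class="itemcontent"><p>First</p><p>Second</p></div>'
without_match = "<rss><channel><item><title>T</title></item></channel></rss>"

print(extract_paragraphs(with_match))
print(extract_paragraphs(without_match))
```

If the second case is what happens with the feed URLs, it would be worth printing the first few hundred characters of `page` to see which tags and classes the response actually contains before choosing a selector.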
【Discussion】:
Tags: web-scraping beautifulsoup nonetype