【Posted】: 2017-09-17 05:31:25
【Question】:
This code searches a web page for a movie and prints the first title in the search results.
from urllib.request import urlopen
import urllib
from bs4 import BeautifulSoup
import requests
import pprint

def infopelicula(nombrepelicula):
    my_url = 'http://www.imdb.com/find?ref_=nv_sr_fn&q=' + nombrepelicula + '&s=tt'
    rprincipal = requests.get(my_url)
    soup = BeautifulSoup(rprincipal.content, 'html.parser')
    title = soup.findAll("td", class_="result_text")
    for name in title:
        titulo = name.parent.find("a", href=True)
        print (name.text)[0]
It does work, but an error is raised when the title is printed. For example:
>>> infopelicula("Harry Potter Chamber")
Harry Potter and the Chamber of Secrets (2002)
Traceback (most recent call last):
  File "<pyshell#49>", line 1, in <module>
    infopelicula("Harry Potter Chamber")
  File "xxxx", line 14, in infopelicula
    print (name.text)[0]
TypeError: 'NoneType' object is not subscriptable
【Discussion】:
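One likely reading of the traceback, offered as a sketch rather than a confirmed fix: in Python 3, `print` is a function that returns `None`, so `print (name.text)[0]` is evaluated as `(print(name.text))[0]`. The title is printed first, and then `None[0]` raises the `TypeError`. The example below reproduces that without any network access and shows one way to take the first result instead. The names `show_failure`, `first_result`, and `sample` are hypothetical; `sample` stands in for the `.text` values of the `result_text` cells.

```python
def show_failure(text):
    """Reproduce the error: print() returns None, and None cannot be indexed."""
    try:
        print(text)[0]  # evaluated as (print(text))[0]
    except TypeError as exc:
        return str(exc)
    return None


def first_result(texts):
    """Return the first entry with surrounding whitespace removed, or None if empty."""
    return texts[0].strip() if texts else None


# Hypothetical stand-in for:
#   [td.text for td in soup.find_all("td", class_="result_text")]
sample = [
    " Harry Potter and the Chamber of Secrets (2002) ",
    " Harry Potter and the Chamber of Secrets (2002) (Video Game) ",
]

if __name__ == "__main__":
    print(show_failure(sample[0]))  # the title is printed, then the error message
    print(first_result(sample))
```

Indexing the list of results (`texts[0]`) rather than the return value of `print` matches the stated goal of showing only the first title. If the first character of the text was what was wanted, `print(name.text[0])` would be the equivalent spelling.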
Tags: python web-scraping python-3.5