[Posted]: 2013-12-21 20:36:10
[Question]:
Here is my code:
import os
import sys
import time
from urllib import FancyURLopener
import urllib2
import simplejson

# Define search term
searchTerm = "parrot"
# Replace spaces ' ' in search term for '%20' in order to comply with request
searchTerm = searchTerm.replace(' ','%20')

# Start FancyURLopener with defined version
class MyOpener(FancyURLopener):
    version = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11)Gecko/20071127 Firefox/2.0.0.11'

myopener = MyOpener()

# Set count to 0
count = 0
for i in range(0,10):
    # Notice that the start changes for each iteration in order to request a new set of images for each loop
    url = ('https://ajax.googleapis.com/ajax/services/search/images?' + 'v=1.0&q='+searchTerm+'&start='+str(i*10)+'&userip=MyIP')
    print url
    request = urllib2.Request(url, None, {'Referer': 'testing'})
    response = urllib2.urlopen(request)
    # Get results using JSON
    results = simplejson.load(response)
    data = results['responseData']
    dataInfo = data['results']
    # Iterate for each result and get unescaped url
    for myUrl in dataInfo:
        count = count + 1
        my_url = myUrl['unescapedUrl']
        myopener.retrieve(myUrl['unescapedUrl'],str(count)+'.jpg')
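But after downloading some images, I get the following error:
Traceback (most recent call last):
  File "C:\Python27\img_google3.py", line 37, in <module>
    dataInfo = data['results']
TypeError: 'NoneType' object has no attribute '__getitem__'
What could be causing this?
I need to download images from Google as part of training a neural network for image classification.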
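The traceback says that `data` is `None` at the moment it is indexed, which means the JSON reply contained `"responseData": null` for that request. One plausible reading, offered as an assumption rather than a confirmed diagnosis, is that the service declined that particular request (for example because of rate limiting, or because `start` went past what it allows) and reported the reason elsewhere in the reply. The sketch below shows one way to guard against that case. `extract_results` and `describe_failure` are hypothetical helper names, and the `responseStatus` and `responseDetails` fields are assumed to be present in the reply; they do not appear in the code above.

```python
def extract_results(payload):
    """Return the list of result dicts, or [] when the reply carries no data."""
    data = payload.get('responseData')
    if data is None:
        # The reply had "responseData": null, so there is nothing to index.
        return []
    # Also tolerate a missing or null 'results' key.
    return data.get('results') or []


def describe_failure(payload):
    """Build a short diagnostic string from the reply.

    Assumption: the service puts its status code and message in
    'responseStatus' and 'responseDetails'.
    """
    return 'status=%s details=%s' % (
        payload.get('responseStatus'),
        payload.get('responseDetails'),
    )


# Example with hand-written replies (not real service output):
ok_reply = {'responseData': {'results': [{'unescapedUrl': 'http://example.com/a.jpg'}]}}
bad_reply = {'responseData': None, 'responseStatus': 403, 'responseDetails': 'rejected'}

for reply in (ok_reply, bad_reply):
    found = extract_results(reply)
    if not found:
        print(describe_failure(reply))
    for item in found:
        print(item['unescapedUrl'])
```

In the loop above, `dataInfo = extract_results(results)` would replace the two indexing lines, and printing `describe_failure(results)` when the list is empty should show why a given page returned nothing.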
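[Comments]:
- Also, I have to run this on my system to download at least 2000 images, so getting an error after a few iterations does not work for me. I have a few more questions, which I will ask in due course. Please help me. Thanks.
Tags: python web web-scraping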