【Posted】: 2014-04-16 23:13:20
【Question】:
I'm trying to run a spider from a Python script, following the Scrapy documentation: http://doc.scrapy.org/en/latest/topics/practices.html
from twisted.internet import reactor
from scrapy.crawler import Crawler
from scrapy import log, signals
from testspiders.spiders.followall import FollowAllSpider
from scrapy.utils.project import get_project_settings
spider = FollowAllSpider(domain='scrapinghub.com')
settings = get_project_settings()
crawler = Crawler(settings)
crawler.signals.connect(reactor.stop, signal=signals.spider_closed)
crawler.configure()
crawler.crawl(spider)
crawler.start()
log.start()
reactor.run()  # the script will block here until the spider_closed signal is sent
However, Python fails to import the module and raises the following error:
Traceback (most recent call last):
...
from scrapy.crawler import Crawler
File "aappp/scrapy.py", line 1, in <module>
ImportError: No module named crawler
The FAQ in the Scrapy documentation briefly mentions this problem, but it didn't help me much.
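The traceback shows `scrapy` being loaded from `aappp/scrapy.py`, which suggests the script's own file name may be shadowing the installed Scrapy package. One way to check for this kind of name clash is to ask Python which file it would load for a given module name. The following is only a minimal sketch, assuming Python 3.4 or later; `locate_module` is a hypothetical helper written for illustration, not part of Scrapy.

```python
import importlib.machinery
import sys


def locate_module(name, search_path=None):
    """Return the file Python would load for top-level module `name`.

    `search_path` is a list of directories to search; it defaults to
    sys.path. Returns None if nothing is found (or if the match is a
    namespace package, which has no single origin file).

    Hypothetical helper for illustration only, not a Scrapy API.
    """
    path = sys.path if search_path is None else search_path
    spec = importlib.machinery.PathFinder.find_spec(name, path)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # If this prints a path inside your own project directory rather than
    # inside site-packages, a local file is shadowing the installed package.
    print(locate_module("scrapy"))
```

If the reported path points at a local `scrapy.py`, renaming that script (and removing any leftover `scrapy.pyc` next to it) would likely let `from scrapy.crawler import Crawler` resolve to the real package.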
【Comments】: