【Question title】: cannot import scrapy modules as library
【Posted】: 2014-04-16 23:13:20
【Question description】:

I am trying to run a spider from a Python script, following the Scrapy documentation: http://doc.scrapy.org/en/latest/topics/practices.html

from twisted.internet import reactor
from scrapy.crawler import Crawler
from scrapy import log, signals
from testspiders.spiders.followall import FollowAllSpider
from scrapy.utils.project import get_project_settings

spider = FollowAllSpider(domain='scrapinghub.com')
settings = get_project_settings()
crawler = Crawler(settings)
crawler.signals.connect(reactor.stop, signal=signals.spider_closed)
crawler.configure()
crawler.crawl(spider)
crawler.start()
log.start()
reactor.run() # the script will block here until the spider_closed signal was sent

However, Python cannot import the module and fails with the following error:

Traceback (most recent call last):
...
    from scrapy.crawler import Crawler
  File "aappp/scrapy.py", line 1, in <module>
ImportError: No module named crawler

The FAQ in the Scrapy documentation briefly mentions this problem, but it did not help me much.

【Discussion】:

    Tags: python scrapy


    【Solution 1】:

    Have you tried this?

    from scrapy.project import crawler
    

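    (This is how it is done at http://doc.scrapy.org/en/latest/faq.html; it looks like your question has already been answered there.)

    The FAQ also describes a newer approach and marks the previous one as deprecated:

    "This way to access the crawler object is deprecated, the code should be ported to use the from_crawler class method, for example: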

    class SomeExtension(object):

        @classmethod
        def from_crawler(cls, crawler):
            # Build the extension and keep a reference to the crawler
            o = cls()
            o.crawler = crawler
            return o
    

    "
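To make the quoted pattern concrete, here is a minimal sketch of how a `from_crawler` class method hands the crawler to an extension. `FakeCrawler` and its `settings` dictionary are hypothetical stand-ins introduced only so the example runs without Scrapy installed; in a real project the framework would supply the crawler object.

```python
class FakeCrawler(object):
    """Hypothetical stand-in for a crawler; it only carries settings."""

    def __init__(self, settings):
        self.settings = settings


class SomeExtension(object):

    @classmethod
    def from_crawler(cls, crawler):
        # Build the extension and keep a reference to the crawler,
        # instead of importing a global crawler object.
        o = cls()
        o.crawler = crawler
        return o


# Assumed usage: the caller creates the crawler and passes it in.
crawler = FakeCrawler(settings={"BOT_NAME": "demo"})
ext = SomeExtension.from_crawler(crawler)
print(ext.crawler.settings["BOT_NAME"])
```

The point of the pattern is that the extension receives the crawler explicitly, so it does not depend on a module-level import such as `scrapy.project.crawler`.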

    【Discussion】:

    • It doesn't work:
      In [1]: from scrapy.project import crawler
      ImportError                               Traceback (most recent call last)
      /home/ubuntu/src/ in ()
      ----> 1 from scrapy.project import crawler
      ImportError: cannot import name crawler
      In [2]: import scrapy
      In [3]: scrapy.__version__
      Out[3]: '0.18.4'
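One thing worth checking, given that the traceback in the question shows the failing import being executed from a file named `aappp/scrapy.py`: a script called `scrapy.py` sits in the first entry of Python's module search path, so `import scrapy` may resolve to that script instead of the installed package, which would be consistent with `No module named crawler`. The sketch below looks for such a file. The helper name `find_shadowing_files` and its parameters are made up for illustration, and the check is an assumption about the cause, not a confirmed diagnosis.

```python
import os
import sys


def find_shadowing_files(package_name, search_dirs=None):
    """Return local files that could shadow the installed `package_name`."""
    if search_dirs is None:
        # sys.path[0] is the directory of the running script
        # (an empty string means the current directory).
        search_dirs = [sys.path[0] or os.getcwd()]
    hits = []
    for directory in search_dirs:
        # A stale compiled .pyc next to the script can shadow the package too.
        for candidate in (package_name + ".py", package_name + ".pyc"):
            path = os.path.join(directory, candidate)
            if os.path.isfile(path):
                hits.append(path)
    return hits
```

Calling `find_shadowing_files("scrapy")` from the directory that holds the script lists any candidates. If one turns up, renaming the script to something other than `scrapy.py` and deleting a leftover `scrapy.pyc` is the usual remedy.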