【Question title】: AttributeError: 'module' object has no attribute 'DATABASE' when using scrapy shell
【Posted】: 2015-12-13 23:30:24
【Question description】:

I'm trying to run scrapy shell from the root directory of my project, but I keep getting a vague error about some DATABASE setting. I'm not sure whether this is a SQLAlchemy issue, or a problem with my schema definition.

If I run scrapy shell http://some_website.com from any directory outside the project path, there is no problem.

Attempting to launch the shell:

me@me:~/my_spider$ scrapy shell http://some_website.com
2015-12-13 15:15:58-0800 [scrapy] INFO: Scrapy 0.24.4 started (bot: my_bot)
2015-12-13 15:15:58-0800 [scrapy] INFO: Optional features available: ssl, http11, boto, django
2015-12-13 15:15:58-0800 [scrapy] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'my_spider.spiders', 'DEPTH_LIMIT': 2, 'CONCURRENT_REQUESTS_PER_DOMAIN': 1, 'CONCURRENT_REQUESTS': 1, 'SPIDER_MODULES': ['my_spider.spiders'], 'BOT_NAME': 'my_bot', 'COOKIES_ENABLED': False, 'LOGSTATS_INTERVAL': 0, 'DOWNLOAD_DELAY': 5}
2015-12-13 15:15:58-0800 [scrapy] INFO: Enabled extensions: TelnetConsole, CloseSpider, WebService, CoreStats, SpiderState
2015-12-13 15:15:59-0800 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, RandomUserAgentMiddleware, ProxyMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, ChunkedTransferMiddleware, DownloaderStats
2015-12-13 15:15:59-0800 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware

Here is the traceback:

Traceback (most recent call last):
  File "/usr/local/bin/scrapy", line 11, in <module>
    sys.exit(execute())
  File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 143, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 89, in _run_print_help
    func(*a, **kw)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 150, in _run_command
    cmd.run(args, opts)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/commands/shell.py", line 46, in run
    self.crawler_process.start_crawling()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 124, in start_crawling
    return self._start_crawler() is not None
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 139, in _start_crawler
    crawler.configure()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 47, in configure
    self.engine = ExecutionEngine(self, self._spider_closed)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/core/engine.py", line 65, in __init__
    self.scraper = Scraper(crawler)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/core/scraper.py", line 66, in __init__
    self.itemproc = itemproc_cls.from_crawler(crawler)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/middleware.py", line 50, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/middleware.py", line 35, in from_settings
    mw = mwcls()
  File "~/my_spider/pipelines.py", line 14, in __init__
    engine = db_connect()
  File "~/my_spider/libs/database/__init__.py", line 14, in db_connect
    url = URL(**settings.DATABASE)
AttributeError: 'module' object has no attribute 'DATABASE'

Any suggestions would be greatly appreciated.

【Comments】:

    Tags: python scrapy scrapy-shell


    【Solution 1】:

    In addition to what @tristan pointed out, judging from the traceback: when you launch the shell, Scrapy picks up your project settings, including the pipelines, one of which executes the db_connect() function, which uses the DATABASE setting:

    url = URL(**settings.DATABASE)
    

    Make sure you have a DATABASE dictionary defined in your project settings.
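    For reference, a minimal sketch of what such a dictionary might look like in settings.py. All values below are hypothetical; the keys mirror the keyword arguments of SQLAlchemy's URL constructor, since the pipeline expands the dictionary with URL(**settings.DATABASE):

    ```python
    # settings.py -- hypothetical DATABASE dictionary; the keys must match
    # the keyword arguments accepted by sqlalchemy.engine.url.URL, because
    # db_connect() expands it with URL(**settings.DATABASE).
    DATABASE = {
        'drivername': 'postgres',   # hypothetical: any SQLAlchemy driver name
        'host': 'localhost',
        'port': '5432',
        'username': 'my_user',      # hypothetical credentials
        'password': 'my_password',
        'database': 'my_db',
    }
    ```

    With this in place, db_connect() can build the engine from the resulting URL without raising the AttributeError.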

    【Discussion】:

    • This was indeed my problem. Thank you!
    【Solution 2】:

    You have a settings variable defined that Scrapy is finding, rather than the one it expects (or needs) to find.

    Instead of the settings object that the from_settings() call in the scrapy/middleware.py module is working with, your code is importing its own settings object and expecting it to provide a .DATABASE attribute. Without seeing your code in my_bot, here are the traceback's relevant lines from the latest Python 2.7 scrapy:

     26     @classmethod
     27     def from_settings(cls, settings, crawler=None):
     28         mwlist = cls._get_mwlist_from_settings(settings)
     29         middlewares = []
     30         for clspath in mwlist:
     31             try:
     32                 mwcls = load_object(clspath)
     33                 if crawler and hasattr(mwcls, 'from_crawler'):
     34                     mw = mwcls.from_crawler(crawler)
     35                 elif hasattr(mwcls, 'from_settings'):
     36                     mw = mwcls.from_settings(settings)
    

    This suggests that attribute lookup is landing on a settings object you didn't intend, or that you followed a tutorial that provides from_settings() but never implemented the required attribute.
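    One way to sidestep this module-shadowing problem entirely is to stop importing a settings module by name in the pipeline and instead receive Scrapy's settings object through from_crawler. A minimal sketch (the class name and error handling are hypothetical, not the asker's actual pipeline):

    ```python
    # pipelines.py -- sketch: read DATABASE from the crawler's settings object
    # rather than importing a settings module by name, so the value comes from
    # the project Scrapy actually loaded, regardless of the working directory.
    class DatabasePipeline(object):
        def __init__(self, database_settings):
            if not database_settings:
                # Fail with a clear message instead of an AttributeError later
                raise ValueError("DATABASE is not defined in the project settings")
            # Stored for later use, e.g. URL(**self.database_settings)
            self.database_settings = database_settings

        @classmethod
        def from_crawler(cls, crawler):
            # crawler.settings is the Settings object Scrapy built for this
            # project; .get() returns None if the key is missing.
            return cls(crawler.settings.get('DATABASE'))
    ```

    Because the pipeline never does import settings itself, it cannot accidentally pick up a different settings module from sys.path.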

    【Discussion】:
