【Question title】: Can I run Scrapy on Windows with Python 3?
【Posted】: 2023-12-26 14:59:01
【Question】:

It seems that Scrapy 1.1.0rc3 does not work on Windows with Python 3.

When I run the scrapy crawl dmoz command given in the Scrapy tutorial, I get the following exception:

D:\Copy From 2\Python Project\ZhihuPlan\tutorial\tutorial>scrapy crawl dmoz
2016-04-26 14:40:36 [scrapy] INFO: Scrapy 1.1.0rc3 started (bot: tutorial)
2016-04-26 14:40:36 [scrapy] INFO: Overridden settings: {'ROBOTSTXT_OBEY': True, 'SPIDER_MODULES': ['tutorial.spiders'], 'BOT_NAME': 'tutorial', 'NEWSPIDER_MODULE': 'tutorial.spiders'}
2016-04-26 14:40:36 [scrapy] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats', 'scrapy.extensions.logstats.LogStats']
Unhandled error in Deferred:
2016-04-26 14:40:36 [twisted] CRITICAL: Unhandled error in Deferred:

Traceback (most recent call last):
  File "D:\Anaconda\Lib\site-packages\scrapy\commands\crawl.py", line 57, in run
    self.crawler_process.crawl(spname, **opts.spargs)
  File "D:\Anaconda\Lib\site-packages\scrapy\crawler.py", line 163, in crawl
    return self._crawl(crawler, *args, **kwargs)
  File "D:\Anaconda\Lib\site-packages\scrapy\crawler.py", line 167, in _crawl
    d = crawler.crawl(*args, **kwargs)
  File "D:\Anaconda\Lib\site-packages\twisted\internet\defer.py", line 1274, in unwindGenerator
    return _inlineCallbacks(None, gen, Deferred())
--- <exception caught here> ---
  File "D:\Anaconda\Lib\site-packages\twisted\internet\defer.py", line 1128, in _inlineCallbacks
    result = g.send(result)
  File "D:\Anaconda\Lib\site-packages\scrapy\crawler.py", line 72, in crawl
    self.engine = self._create_engine()
  File "D:\Anaconda\Lib\site-packages\scrapy\crawler.py", line 97, in _create_engine
    return ExecutionEngine(self, lambda _: self.stop())
  File "D:\Anaconda\Lib\site-packages\scrapy\core\engine.py", line 68, in __init__
    self.downloader = downloader_cls(crawler)
  File "D:\Anaconda\Lib\site-packages\scrapy\core\downloader\__init__.py", line 88, in __init__
    self.middleware = DownloaderMiddlewareManager.from_crawler(crawler)
  File "D:\Anaconda\Lib\site-packages\scrapy\middleware.py", line 58, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "D:\Anaconda\Lib\site-packages\scrapy\middleware.py", line 34, in from_settings
    mwcls = load_object(clspath)
  File "D:\Anaconda\Lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "d:\anaconda\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 986, in _gcd_import
  File "<frozen importlib._bootstrap>", line 969, in _find_and_load
  File "<frozen importlib._bootstrap>", line 958, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 673, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 662, in exec_module
  File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
  File "D:\Anaconda\Lib\site-packages\scrapy\downloadermiddlewares\retry.py", line 23, in <module>
    from scrapy.xlib.tx import ResponseFailed
  File "D:\Anaconda\Lib\site-packages\scrapy\xlib\tx\__init__.py", line 3, in <module>
    from twisted.web import client
  File "D:\Anaconda\Lib\site-packages\twisted\web\client.py", line 41, in <module>
    from twisted.internet.endpoints import TCP4ClientEndpoint, SSL4ClientEndpoint
  File "D:\Anaconda\Lib\site-packages\twisted\internet\endpoints.py", line 34, in <module>
    from twisted.internet.stdio import StandardIO, PipeAddress
  File "D:\Anaconda\Lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
    from twisted.internet import _win32stdio
builtins.ImportError: cannot import name '_win32stdio'
2016-04-26 14:40:36 [twisted] CRITICAL:

Is there any way to fix this error? Can I run Scrapy on Windows with Python 3?

【Comments on the question】:

Tags: python scrapy web-crawler twisted pywin32


【Solution 1】:

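Unfortunately, you cannot run Scrapy on Windows with Python 3.

Scrapy does not currently support Python 3 on Windows. See the release notes here (scroll to the limitations section): https://blog.scrapinghub.com/2016/02/04/python-3-support-with-scrapy-1-1rc1/ We are working on resolving this.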

In the meantime, you can try using a different operating system or a different Python version.
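The traceback in the question ends with an ImportError for twisted.internet._win32stdio. To check whether a given environment is in the same situation, something like the following may help. It is a minimal sketch using only the standard library; the helper names module_available and describe_environment are hypothetical, and a missing module is only a hint, not proof, that Scrapy will fail.

```python
import importlib.util
import platform
import sys


def module_available(name):
    """Return True if `name` can be located without executing the module itself."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing parent package (for example, Twisted not installed) ends up here.
        return False


def describe_environment():
    """Collect the facts that matter for the error shown in the question."""
    return {
        "system": platform.system(),      # "Windows" on Windows
        "python": sys.version_info[:3],   # (major, minor, micro)
        # The module whose import fails at the bottom of the traceback.
        "win32stdio": module_available("twisted.internet._win32stdio"),
    }
```

Calling print(describe_environment()) in the interpreter that runs Scrapy shows the platform, the Python version, and whether the module can be found.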

【Comments】:

【Solution 2】:

Scrapy now runs on Windows 10 with Python 3. I just tried it from the conda prompt.

First, open the conda prompt and activate the environment in which you installed Scrapy.

    conda activate <your environment>
    

Then run Scrapy:

    scrapy runspider <path.to.file.py>
    

...and Scrapy should run.
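If the scrapy launcher found on PATH is not the one from the activated environment, one option is to start Scrapy through the environment's own interpreter instead. The following is a minimal sketch, assuming Scrapy can be started as a module with python -m scrapy; the function names and the file name my_spider.py are hypothetical and only for illustration.

```python
import subprocess
import sys


def runspider_command(spider_file):
    """Build a command that runs a spider file with the current interpreter's Scrapy."""
    # sys.executable is the Python of the activated environment, so
    # "-m scrapy" picks up the Scrapy installed in that environment.
    return [sys.executable, "-m", "scrapy", "runspider", spider_file]


def run_spider(spider_file):
    """Run the spider in a child process and return Scrapy's exit code."""
    return subprocess.run(runspider_command(spider_file)).returncode
```

For example, run_spider("my_spider.py") does the same as typing scrapy runspider my_spider.py, but it cannot accidentally pick up a Scrapy from another Python installation.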

【Comments】:

【Solution 3】:

Scrapy works with Python 3 on Windows.

After installing Anaconda or Miniconda, install Scrapy with the following command:

    conda install -c conda-forge scrapy

Then run a spider from inside the Scrapy project directory:

    scrapy crawl spider_name


The installation procedure is described here: https://doc.scrapy.org/en/1.4/intro/install.html
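After installing, it can be useful to confirm which versions actually ended up in the environment, since the error in the question came from one specific Scrapy and Twisted combination. This is a minimal sketch, assuming Python 3.8 or later (for importlib.metadata); installed_version and report are hypothetical helper names.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def report(distributions=("Scrapy", "Twisted")):
    """Map each distribution name to its installed version (None when not installed)."""
    return {name: installed_version(name) for name in distributions}
```

Running print(report()) inside the activated conda environment lists the Scrapy and Twisted versions, with None for anything that is not installed.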

【Comments】:
