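【Posted】:2014-12-07 01:49:06
【Question】:
I want to apply certain settings and DOWNLOADER_MIDDLEWARES (a proxy) only when a file exists. The spider should run through the proxies only if a .txt file containing the proxy list is present; otherwise it should run without a proxy, using the default IP and port.
I tried the following, but it does not work for me: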
settings.py
import os.path
if os.path.isfile("../proxies.txt"):
    BOT_NAME = 'whatever'
    SPIDER_MODULES = ['whatever.spiders']
    NEWSPIDER_MODULE = 'whatever.spiders'
    RETRY_ENABLED = False
    REDIRECT_ENABLED = False
    DOWNLOAD_TIMEOUT = 15
    COOKIES_ENABLED = False
    LOG_ENABLED = True
    DOWNLOADER_MIDDLEWARES = {
        'scrapy.contrib.downloadermiddleware.httpproxy.HttpProxyMiddleware': 110,
        'whatever.middlewares.ProxyMiddleware': 100
    }
else:
    BOT_NAME = 'whatever'
    SPIDER_MODULES = ['whatever.spiders']
    NEWSPIDER_MODULE = 'whatever.spiders'
    DOWNLOADER_MIDDLEWARES = {
    }
Is there a workaround?
Thanks, guys!
【Comments】:
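One possible cause: os.path.isfile("../proxies.txt") resolves the relative path against the directory Scrapy is started from, not against the location of settings.py, so the check can fail even though the file exists. Below is a minimal sketch of a workaround, assuming proxies.txt sits one directory above settings.py (the question does not say where it actually lives). The helper names proxy_list_path and downloader_middlewares are hypothetical and not part of Scrapy.

```python
import os


def proxy_list_path(settings_file, name="proxies.txt"):
    """Absolute path to the proxy list, one directory above settings_file.

    Assumption: the list lives next to the package directory that
    contains settings.py. Adjust os.pardir if it lives elsewhere.
    """
    settings_dir = os.path.dirname(os.path.abspath(settings_file))
    return os.path.normpath(os.path.join(settings_dir, os.pardir, name))


def downloader_middlewares(proxy_file):
    """Return the proxy middlewares only when proxy_file exists."""
    if os.path.isfile(proxy_file):
        return {
            # Middleware paths copied from the question.
            'scrapy.contrib.downloadermiddleware.httpproxy.HttpProxyMiddleware': 110,
            'whatever.middlewares.ProxyMiddleware': 100,
        }
    # No proxy list: leave the dict empty so Scrapy uses its defaults.
    return {}
```

In settings.py the shared values (BOT_NAME, SPIDER_MODULES, NEWSPIDER_MODULE) could then stay at the top level, with only the conditional part computed: DOWNLOADER_MIDDLEWARES = downloader_middlewares(proxy_list_path(__file__)). On Scrapy 1.0 and later the built-in middleware is located at scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware.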
Tags: python scrapy settings conditional