[Posted]: 2020-06-05 08:07:48
[Problem description]:
Using pika, I fetch a URL from RabbitMQ and try to create a new request for my Scrapy spider.
When I start the spider with `scrapy crawl spider`, the spider stays open because of `raise DontCloseSpider()`, but no request is ever created for it.
My custom extension:
import pika
from scrapy import signals
from scrapy.http import Request
from scrapy.exceptions import DontCloseSpider


class AddRequestExample:
    def __init__(self, crawler):
        self.crawler = crawler

    @classmethod
    def from_crawler(cls, crawler):
        s = cls(crawler)
        crawler.signals.connect(s.spider_idle, signal=signals.spider_idle)
        return s

    def spider_idle(self, spider):
        connection = pika.BlockingConnection(pika.ConnectionParameters(host='localhost'))
        channel = connection.channel()
        try:
            # basic_get returns a (method, properties, body) tuple; body is the payload
            url = channel.basic_get(queue='hello')[2]
            url = url.decode()
            self.crawler.engine.crawl(Request(url), spider)
        except Exception:
            pass
        raise DontCloseSpider()
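Note that pika's `channel.basic_get` returns a `(method, properties, body)` tuple, and when the queue is empty `body` is `None`, so the subsequent `.decode()` raises `AttributeError`, which the bare `except Exception` silently swallows. A minimal stand-alone sketch of that pitfall (no broker required; `fake_basic_get` is a hypothetical stand-in for `channel.basic_get`):

```python
# Hypothetical stand-in for pika's channel.basic_get: it returns a
# (method, properties, body) tuple, with body=None when the queue is empty.
def fake_basic_get(queue_empty):
    if queue_empty:
        return (None, None, None)  # empty queue: body is None
    return ("method", "properties", b"http://example.com")

# Non-empty queue: body is bytes and decodes cleanly.
body = fake_basic_get(queue_empty=False)[2]
print(body.decode())  # http://example.com

# Empty queue: body is None, so .decode() raises AttributeError --
# a bare `except Exception` around this hides the failure entirely.
body = fake_basic_get(queue_empty=True)[2]
try:
    body.decode()
except AttributeError as exc:
    print("swallowed:", exc)
```

Guarding with `if body is not None:` (or logging inside the `except`) makes the failure visible instead of silent.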
My spider:
import scrapy


class QuotesSpider(scrapy.Spider):
    name = "spider"

    def parse(self, response):
        yield {
            'url': response.url
        }
[Discussion]:
Tags: python scrapy rabbitmq pika