【Posted】: 2016-01-28 21:09:14
【Question】:
I built a crawler. Splash itself works fine (I tested it in my browser), but Scrapy does not crawl the page or extract any items.
My actual code is:
# -*- coding: utf-8 -*-
import scrapy
import json
from scrapy.http.headers import Headers
from scrapy.spiders import CrawlSpider, Rule
from oddsportal.items import OddsportalItem


class OddbotSpider(CrawlSpider):
    name = "oddbot"
    allowed_domains = ["oddsportal.com"]
    start_urls = (
        'http://www.oddsportal.com/matches/tennis/',
    )

    def start_requests(self):
        for url in self.start_urls:
            yield scrapy.Request(url, self.parse, meta={
                'splash': {
                    'endpoint': 'render.html',
                    'args': {'wait': 5.5}
                }
            })

    def parse(self, response):
        item = OddsportalItem()
        print response.body
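For comparison, the browser test mentioned in the question can be reproduced by building the Splash render.html URL by hand. The following is only a minimal sketch: it assumes Splash is listening at http://localhost:8050 (the question does not give an address), the helper name splash_render_url is hypothetical, and it is written for Python 3, unlike the Python 2 print statement in the spider.

```python
from urllib.parse import urlencode

# Assumption: Splash runs locally on port 8050; adjust to your setup.
SPLASH_URL = "http://localhost:8050"


def splash_render_url(target_url, wait=5.5, splash_url=SPLASH_URL):
    """Return a render.html URL that asks Splash to render target_url.

    Hypothetical helper for illustration: 'url' and 'wait' mirror the
    endpoint and args used in the spider's meta['splash'] dictionary.
    """
    query = urlencode({"url": target_url, "wait": wait})
    return "{}/render.html?{}".format(splash_url, query)


if __name__ == "__main__":
    # Open the printed URL in a browser to see the HTML Splash renders.
    print(splash_render_url("http://www.oddsportal.com/matches/tennis/"))
```

If that URL returns the rendered page while the spider prints nothing, the difference probably lies in how Scrapy hands the request to Splash rather than in Splash itself.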
【Comments】:
- What is the output of response.body?
- print response.body?
- It prints nothing. I have edited the question with the actual code.
Tags: python scrapy web-crawler splash-screen