【Question title】: What does the Scrapy log mean?
【Posted】: 2016-01-07 06:14:38
【Question】:

For example:

2016-01-07 11:37:19 [scrapy] INFO: Crawled 61 pages (at 61 pages/min), scraped 0 items (at 0 items/min)
2016-01-07 11:38:19 [scrapy] INFO: Crawled 171 pages (at 110 pages/min), scraped 0 items (at 0 items/min)
2016-01-07 11:39:19 [scrapy] INFO: Crawled 299 pages (at 128 pages/min), scraped 0 items (at 0 items/min)
2016-01-07 11:40:19 [scrapy] INFO: Crawled 394 pages (at 95 pages/min), scraped 0 items (at 0 items/min)
2016-01-07 11:41:19 [scrapy] INFO: Crawled 487 pages (at 93 pages/min), scraped 0 items (at 0 items/min)
2016-01-07 11:42:19 [scrapy] INFO: Crawled 554 pages (at 67 pages/min), scraped 0 items (at 0 items/min)
2016-01-07 11:43:19 [scrapy] INFO: Crawled 616 pages (at 62 pages/min), scraped 0 items (at 0 items/min)
2016-01-07 11:44:19 [scrapy] INFO: Crawled 743 pages (at 127 pages/min), scraped 0 items (at 0 items/min)
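  1. What do the words "Crawled" and "scraped" mean?
  2. When does Scrapy print a log line such as "Crawled 743 pages (at 127 pages/min), scraped 0 items (at 0 items/min)", and which function is called at that point?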

【Comments】:

Tags: python scrapy


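【Answer 1】:
  1. A crawled page is a page that one of your spiders requested. The spider should also parse it, depending on how you programmed it. A scraped item is a set of data extracted during that parsing. Both are explained in the Scrapy tutorial: items, spiders.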

  2. I'm not entirely sure, but if I remember correctly, it is printed when the spider finishes its work.
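
On point 2, the timestamps in the question are exactly 60 seconds apart, which suggests the line is emitted periodically rather than once at the end. As far as I know, it comes from Scrapy's LogStats extension (`scrapy.extensions.logstats`), which logs every `LOGSTATS_INTERVAL` seconds (60 by default) and derives the per-minute rates from the change in the cumulative counters. Below is a minimal sketch of that arithmetic only. The function name `format_logstats` and its parameters are hypothetical, chosen for illustration, and are not Scrapy's API.

```python
# Sketch only: mimics the arithmetic behind Scrapy's periodic stats line.
# The function and variable names here are hypothetical, not Scrapy's API.


def format_logstats(pages, items, prev_pages, prev_items, interval=60.0):
    """Build a stats line from cumulative counters sampled every `interval` seconds."""
    # Converts a per-interval delta into a per-minute rate.
    multiplier = 60.0 / interval
    page_rate = (pages - prev_pages) * multiplier
    item_rate = (items - prev_items) * multiplier
    return (
        "Crawled %d pages (at %d pages/min), "
        "scraped %d items (at %d items/min)"
        % (pages, page_rate, items, item_rate)
    )


# Cumulative page counts taken from the first three log lines in the question.
samples = [61, 171, 299]
prev = 0
lines = []
for pages in samples:
    # No items were scraped in the question's run, so both item counters are 0.
    lines.append(format_logstats(pages, 0, prev, 0))
    prev = pages

for line in lines:
    print(line)
```

The rate is just the difference between two consecutive samples: 171 - 61 gives the 110 pages/min in the second line of the question's log. The "scraped 0 items" part stays at zero for as long as the spider's callbacks never yield an item.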

【Comments】: