【Question Title】: Test cases for a Scrapy spider (Python)
【Posted】: 2021-10-29 07:11:03
【Question】:

Is there a way to write test cases for the Scrapy spider I wrote, which produces output such as: [ {"company_name": "A + Communications and Security", "source_url": "/company/a--communications-and-security"}, {"company_name": "A&A Technology Group", "source_url": "/company/a-a-technology-group"}]

import scrapy


class CompanySpider(scrapy.Spider):
    name = 'company'
    start_urls = ['https://www.adapt.io/directory/industry/telecommunications/A-1']
    custom_settings = {"TELNETCONSOLE_ENABLED": False, "ROBOTSTXT_OBEY": False}

    def parse(self, response):
        # Each directory entry is a <div> whose class contains 'DirectoryList_link'.
        for company in response.xpath("//div[contains(@class,'DirectoryList_link')]"):
            yield {
                'company_name': company.xpath("./a/text()").get(),
                # default="" avoids an AttributeError when the <a> has no href;
                # the split strips the site prefix, leaving a relative path.
                'source_url': company.xpath("./a/@href").get(default="").split('https://www.adapt.io')[-1],
            }

【Comments】:

    Tags: python scrapy


    【Solution 1】:

    You can try Spider Contracts, or test against local fake responses that reflect the state of the website at the time you wrote the spider.

    How to work with the scrapy contracts?

    【Comments】:
