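【Posted】: 2019-01-25 18:15:08
【Question】:
I want to extract the values at specific XPath locations with Scrapy. For example, I have these paths and the text found at each one: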
/html/body/div[3]/ul[1]/li[1]/div/p
q1
/html/body/div[3]/ul[1]/li[3]/div/p
ans1
/html/body/div[3]/ul[2]/li[1]/div/p
q2
/html/body/div[3]/ul[2]/li[2]/div/p
ans2
Link: https://www.digikala.com/ajax/product/questions/980291
I want to yield them like this:
def parse(self, response):
    for quote in response.xpath('//html/body/main'):
        yield {
            # question or answer
            # question pattern: li/div/p or li[1]/div/p
            # answer pattern: ends with li[2 or higher number]/div/p
            # a related question and answer share the same ul, e.g. both are under ul[1]
            'type': quote.xpath('i dont know this part').extract_first(),
            'QAnumber': quote.xpath('?').extract(),
            'text': quote.xpath('/html/body/div[3]/*/*/div/p/text()').extract(),
        }
How can I extract these three parts?
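The pattern described in the code comments (first `li` is the question, later `li` items are answers, and the `ul` position is the question/answer number) can be sketched as follows. This is only a minimal illustration under stated assumptions: `SAMPLE` is a hypothetical, well-formed stand-in for the page, not the real response from the link, and `iter_qa` is a made-up helper name. It uses the standard library's `xml.etree.ElementTree` rather than Scrapy so that it runs on its own.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the page: the real markup was not inspected.
# The second <li> of the first <ul> has no div/p, mirroring the jump
# from li[1] to li[3] in the paths listed in the question.
SAMPLE = (
    "<html><body><div/><div/><div>"
    "<ul>"
    "<li><div><p>q1</p></div></li>"
    "<li><span>no div/p here</span></li>"
    "<li><div><p>ans1</p></div></li>"
    "</ul>"
    "<ul>"
    "<li><div><p>q2</p></div></li>"
    "<li><div><p>ans2</p></div></li>"
    "</ul>"
    "</div></body></html>"
)


def iter_qa(root):
    """Yield one dict per li/div/p found under body/div[3]/ul."""
    container = root.find("body/div[3]")  # third <div> child of <body>
    if container is None:
        return
    # The position of the <ul> identifies the question/answer pair.
    for ul_index, ul in enumerate(container.findall("ul"), start=1):
        # The position of the <li> decides question vs. answer.
        for li_index, li in enumerate(ul.findall("li"), start=1):
            p = li.find("div/p")
            if p is None:
                continue  # this <li> does not match li/div/p
            yield {
                "type": "question" if li_index == 1 else "answer",
                "QAnumber": ul_index,
                "text": "".join(p.itertext()).strip(),
            }


items = list(iter_qa(ET.fromstring(SAMPLE)))
for item in items:
    print(item)
```

In a Scrapy callback the same idea would presumably become a loop over `response.xpath('/html/body/div[3]/ul')`, using the relative paths `./li` and `./div/p/text()` on each selected node. A path that starts with `/` inside the loop is evaluated from the document root rather than from the current node, which is likely why the `'text'` expression in the code above returns every paragraph at once.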
【Discussion】:
Tags: python-2.7 web-scraping scrapy