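【Posted】: 2018-11-01 07:30:35
【Problem description】:
I want to scrape a website with Selenium. It has 10 pages in total. My code is below, but why do I only get the results from the first page?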
# -*- coding: utf-8 -*-
from selenium import webdriver
from scrapy.selector import Selector
MAX_PAGE_NUM = 10
MAX_PAGE_DIG = 3
driver = webdriver.Chrome(r'C:\Users\zhang\Downloads\chromedriver_win32\chromedriver.exe')
with open('results.csv', 'w') as f:
    f.write("Buyer, Price \n")
for i in range(1, MAX_PAGE_NUM + 1):
    page_num = (MAX_PAGE_DIG - len(str(i))) * "0" + str(i)
    url = "https://www.oilandgasnewsworldwide.com/Directory1/DREQ/Drilling_Equipment_Suppliers_?page=" + page_num
    driver.get(url)
    names = sel.xpath('//*[@class="fontsubsection nomarginpadding lmargin opensans"]/text()').extract()
    Countries = sel.xpath('//td[text()="Country:"]/following-sibling::td/text()').extract()
    websites = sel.xpath('//td[text()="Website:"]/following-sibling::td/a/@href').extract()
driver.close()
print(len(names), len(Countries), len(websites))
【Discussion】:
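One thing that can be checked without a browser is the list of URLs the loop requests. The sketch below rebuilds the same zero-padded page numbers with `str.zfill` instead of the manual padding. The helper name `build_page_urls` and the constant `BASE_URL` are hypothetical, introduced only for illustration. The post does not confirm whether the site treats `page=001` the same as `page=1`.

```python
# Base URL copied from the question; the page number is appended to it.
BASE_URL = ("https://www.oilandgasnewsworldwide.com/Directory1/DREQ/"
            "Drilling_Equipment_Suppliers_?page=")
MAX_PAGE_NUM = 10
MAX_PAGE_DIG = 3


def build_page_urls(max_page=MAX_PAGE_NUM, digits=MAX_PAGE_DIG):
    """Return one URL per page, with the page number left-padded with zeros.

    Hypothetical helper: it only reproduces the padding logic from the
    question and makes no claim about what the site accepts.
    """
    return [BASE_URL + str(i).zfill(digits) for i in range(1, max_page + 1)]


if __name__ == "__main__":
    # Print the URLs so they can be opened by hand and compared.
    for url in build_page_urls():
        print(url)
```

Opening two of the printed URLs by hand shows whether the site really serves different pages for them. If it only responds to unpadded numbers, `build_page_urls(digits=1)` produces `page=1` through `page=10`.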