【Posted】:2020-09-19 19:02:47
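【Problem description】:
I am scraping this website (http://rera.rajasthan.gov.in/ProjectSearch) with Python and Selenium. My code works, but it currently scrapes only the first page. I want to iterate through all the pages and scrape every VIEW link on each of them, but the site handles pagination in an unusual way. How can I step through the pages and scrape them one after another?
My source code: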
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException, WebDriverException
import time

opt = webdriver.ChromeOptions()
opt.add_argument("--ignore-certificate-errors")
opt.add_argument("--start-maximized")
driver = webdriver.Chrome(executable_path=r"C:\Users\fit foodie\PycharmProjects\Selenium\Browser\chromedriver.exe", options=opt)
driver.get(url="http://rera.rajasthan.gov.in/")

# Open the Search menu, pick its first entry, then submit the search form.
# click() returns None, so the results are not stored in variables.
driver.find_element_by_xpath("//*[@id='liSearch']/a").click()
driver.find_element_by_xpath("//*[@id='liSearch']/ul/li[1]/a").click()
driver.find_element_by_xpath('//*[@id="btn_SearchProjectSubmit"]').click()


def page():
    # Keep clicking the "next page" link until it can no longer be found or clicked.
    while True:
        try:
            driver.execute_script("return arguments[0].scrollIntoView(true);", WebDriverWait(driver, 20).until(
                EC.element_to_be_clickable((By.XPATH, "//*[@id='OuterProjectGrid']/div[4]/div[4]/a"))))
            driver.find_element_by_xpath("//*[@id='OuterProjectGrid']/div[4]/div[4]/a").click()
            print("Navigating to Next Page")
        except (TimeoutException, WebDriverException) as e:
            print("Last page reached")
            break
Pagination does not work.
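One possible way to structure this, offered only as a rough sketch: keep the "scrape the current page, then move to the next one" loop separate from the Selenium details. The names `scrape_all_pages` and `make_selenium_callbacks`, their parameters, and the idea of collecting each VIEW link's `href` are all assumptions made for illustration and are not taken from the post. The XPaths have to be supplied by the caller and have not been checked against the site. The staleness wait assumes the grid replaces its rows when the page changes; if it does not, the wait times out and the loop stops as though the last page had been reached.

```python
from typing import Callable, Iterable, List


def scrape_all_pages(
    read_page: Callable[[], Iterable[str]],
    go_to_next: Callable[[], bool],
    max_pages: int = 1000,
) -> List[str]:
    """Collect items from every page until go_to_next() reports no more pages."""
    items: List[str] = []
    for _ in range(max_pages):       # hard cap so a stuck pager cannot loop forever
        items.extend(read_page())    # scrape the current page BEFORE moving on
        if not go_to_next():
            break
    return items


def make_selenium_callbacks(driver, view_xpath: str, next_xpath: str, timeout: int = 20):
    """Hypothetical Selenium wiring; both XPaths are caller-supplied assumptions."""
    # Imported lazily so the loop above can be exercised without Selenium installed.
    from selenium.common.exceptions import TimeoutException, WebDriverException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    def read_page():
        # Assumes each VIEW entry is an <a> element whose href is worth keeping.
        links = driver.find_elements(By.XPATH, view_xpath)
        return [link.get_attribute("href") for link in links]

    def go_to_next():
        try:
            old_first = driver.find_element(By.XPATH, view_xpath)
            button = WebDriverWait(driver, timeout).until(
                EC.element_to_be_clickable((By.XPATH, next_xpath)))
            driver.execute_script("arguments[0].scrollIntoView(true);", button)
            button.click()
            # Wait for the old row to go stale so the next read sees the new page.
            WebDriverWait(driver, timeout).until(EC.staleness_of(old_first))
            return True
        except (TimeoutException, WebDriverException):
            return False             # treated as "last page reached"

    return read_page, go_to_next
```

With the existing `driver`, this could be called as `read_page, go_to_next = make_selenium_callbacks(driver, view_xpath, next_xpath)` followed by `links = scrape_all_pages(read_page, go_to_next)`. Because the loop only depends on two callables, it can be tried out with fake pages before pointing it at the real site.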
【Discussion】:
Tags: python selenium xpath pagination webdriverwait