【Posted】: 2020-03-24 05:33:09
【Question】:
I need some help with a data-scraping task: https://soilhealth.dac.gov.in/NewHomePage/NutriPage I managed to fill in the drop-down menus and click View with this code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import Select
from bs4 import BeautifulSoup
url = "https://soilhealth.dac.gov.in/NewHomePage/NutriPage"
driver = webdriver.Chrome(executable_path='./chromedriver.exe')
driver.get(url)
select = Select(driver.find_element_by_id('NutriCatId'))
select.select_by_visible_text('Sample Wise')
select = Select(driver.find_element_by_id('CycleId'))
select.select_by_visible_text('All Cycle')
select = Select(driver.find_element_by_id('State_Code'))
select.select_by_visible_text('Andaman And Nicobar Islands')
driver.implicitly_wait(5)
select = Select(driver.find_element_by_id('District_Code'))
select.select_by_visible_text('Nicobars')
driver.find_element_by_id('s').click()
driver.implicitly_wait(30)
soup_level1=BeautifulSoup(driver.page_source, 'lxml')
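I need to scrape the table data from the page source, but soup_level1 does not contain the table; all I get is JavaScript code. Any help in understanding whether it is possible to scrape this data with Selenium, and how I could do it, would be appreciated. Thanks for your help.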
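For reference, here is a minimal sketch of one direction that might apply. It rests on assumptions that have not been confirmed for this page: that the table is rendered by JavaScript some time after the click, and that it ends up in the main document rather than inside a frame. Note that `implicitly_wait` only sets a timeout for element lookups and does not pause the script, so `page_source` may be read before the table exists. `rendered_html`, `table_rows`, and the sample HTML are hypothetical names and made-up data for illustration only.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_rows(html):
    """Return all table rows found in an HTML string."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


def rendered_html(driver, timeout=30):
    """Hypothetical helper: block until a <table> exists, then return the HTML.

    Selenium is imported here so the parsing code above can be used without it.
    """
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located((By.TAG_NAME, "table"))
    )
    return driver.page_source


# Made-up HTML standing in for the rendered page.
sample = (
    "<table><tr><th>Name</th><th>Value</th></tr>"
    "<tr><td>A</td><td>1</td></tr></table>"
)
rows = table_rows(sample)
```

With a live session, the call would look like `table_rows(rendered_html(driver))` after the click. If the wait times out, the report may be embedded in a frame, in which case `driver.switch_to.frame(...)` would be needed before reading `page_source`; whether that is the case here is an open question.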
【Comments】:
- FYI, the word is scrape (and scraped, scraper), not scrap.
Tags: javascript python selenium web-scraping