【Posted】: 2020-06-19 02:21:25
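【Problem description】:
I want to get the title and URL of every article that shares the same class name. The problem is that my code prints the same single article over and over instead of all the articles.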
from selenium import webdriver
driver = webdriver.Chrome(r'C:\Users\muhammad.usman\Downloads\chromedriver_win32\chromedriver.exe')
driver.get('https://www.aljazeera.com/news/')
# to get the current location ...
driver.current_url
button = driver.find_element_by_id('btn_showmore_b1_418')
driver.execute_script("arguments[0].click();", button)
content = driver.find_element_by_class_name('topics-sec-block')
print(content)
container = content.find_elements_by_xpath('//div[@class="col-sm-7 topics-sec-item-cont"]')
print(container)
i=0
for i in range(0, 12):
    title = []
    url = []
    heading=container[i].find_element_by_xpath('//div[@class="col-sm-7 topics-sec-item-cont"]/a/h2').text
    link = container[i].find_element_by_xpath('//div[@class="col-sm-7 topics-sec-item-cont"]/a')
    title.append(heading)
    url.append(link.get_attribute('href'))
    print(title)
    print(url)
    i += 1
names = driver.find_elements_by_css_selector('div.topics-sec-item-cont')
for name in names:
    heading=name.find_element_by_xpath('//div[@class="col-sm-7 topics-sec-item-cont"]/a/h2').text
    link = name.find_element_by_xpath('//div[@class="col-sm-7 topics-sec-item-cont"]/a')
    print(heading)
    print(link.get_attribute('href'))
【Discussion】:
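A likely cause of the repetition: in Selenium, an XPath that begins with `//` is evaluated against the whole page even when it is called on an element, so every pass through the loop finds the first matching `div` again. Starting the expression with `.` (for example `name.find_element_by_xpath('./a/h2')`) should keep the search inside the current container. The result lists also need to be created once before the loop, not inside it. The sketch below illustrates the idea without a browser, using the standard library's `xml.etree.ElementTree` as a stand-in for the page; the `SAMPLE` markup and its article titles and links are made up for illustration and are not taken from the real site.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup shaped like the containers in the question.
SAMPLE = """
<div class="topics-sec-block">
  <div class="col-sm-7 topics-sec-item-cont"><a href="/news/a"><h2>First article</h2></a></div>
  <div class="col-sm-7 topics-sec-item-cont"><a href="/news/b"><h2>Second article</h2></a></div>
  <div class="col-sm-7 topics-sec-item-cont"><a href="/news/c"><h2>Third article</h2></a></div>
</div>
"""

ITEM = './/div[@class="col-sm-7 topics-sec-item-cont"]'

root = ET.fromstring(SAMPLE)
containers = root.findall(ITEM)

# Mimics the behaviour in the question: every iteration searches the
# whole document again, so the first match comes back each time.
repeated = [root.find(ITEM + '/a/h2').text for _ in containers]

# Relative lookup: search only inside the current container.
# The lists are created once, before the loop.
titles = []
urls = []
for item in containers:
    link = item.find('./a')
    titles.append(link.find('./h2').text)
    urls.append(link.get('href'))

print(repeated)
print(titles)
print(urls)
```

With the real page, the same pattern would loop over `names` directly (no fixed `range(0, 12)`), so the code does not depend on exactly twelve articles being present.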
Tags: python selenium selenium-webdriver web-scraping web-crawler