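【Posted】: 2021-08-21 02:49:51
【Question】:
I have an array of links. I am trying to visit each link, print some content from it, go back to the main page, visit the second link, and repeat until every link in the array has been processed.
What happens is that only the first link works, as if all the other links in the array had disappeared. I get the error: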
File "e:\work\MY CODE\scraping\learn.py", line 25, in theprint link.click()
from selenium import webdriver
from selenium.webdriver.common import keys
# lets us send keyboard keys such as Enter, Esc, etc.
from selenium.webdriver.common.keys import Keys
import time
# lets us wait for an event to happen before running the next line of code
from selenium.webdriver.common.by import By
from selenium.webdriver.remote import command
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# path to the Google Chrome driver (raw string so the backslashes stay literal)
PATH = r"E:\work\crom\chromedriver.exe"
# pass the path to the Selenium webdriver constructor
# note: recent Selenium 4 releases take a Service object instead of a positional path
driver = webdriver.Chrome(PATH)
# open the site we want
driver.get("https://app.dealroom.co/companies.startups/f/client_focus/anyof_business/company_status/not_closed/company_type/not_government%20nonprofit/employees/anyof_2-10_11-50_51-200/has_website_url/anyof_yes/slug_locations/anyof_france?sort=-revenue")
# wait for the page to load
time.sleep(5)
# collect the href of every link we want to get info from
links = []
the_links = driver.find_elements(By.CLASS_NAME, "table-list-item")
for link in the_links:
    links.append(link.get_attribute('href'))
# go to each link, print something, and return to the main page
for link in links:
    driver.get(link)
    website = driver.find_element(By.CLASS_NAME, "item-details-info__url")
    print(website.text)
    driver.back()
    time.sleep(3)
【Comments】:
-
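Are you getting a stale element reference? You cannot locate an element, navigate to another page, and then use that same element again. That looks like what you are trying to do, and it is what causes the stale element error.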
-
Yes, that is what I got. Can you show me another way?
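One possible way around the stale element error described in the comment above is to read every href into a list of plain strings while the list page is still loaded, and only then navigate. The sketch below is not a confirmed fix for this particular site. The helper names `collect_hrefs`, `scrape_details` and `run_with_chrome` are hypothetical, the locators are passed in as `(strategy, value)` tuples, and `run_with_chrome` assumes a recent Selenium 4 install that can find a Chrome driver on its own.

```python
from typing import List, Tuple

# A locator is a (strategy, value) pair, e.g. (By.CLASS_NAME, "table-list-item").
Locator = Tuple[str, str]


def collect_hrefs(driver, list_locator: Locator) -> List[str]:
    """Read every href while the list page is still loaded."""
    hrefs = []
    for element in driver.find_elements(*list_locator):
        href = element.get_attribute("href")
        if href:  # skip elements that carry no href
            hrefs.append(href)
    return hrefs


def scrape_details(driver, list_locator: Locator, detail_locator: Locator) -> List[str]:
    """Visit each collected URL and return the text of one element per page."""
    # Plain strings cannot go stale, unlike WebElement references.
    urls = collect_hrefs(driver, list_locator)
    results = []
    for url in urls:
        driver.get(url)  # navigate directly; no click() or back() is needed
        results.append(driver.find_element(*detail_locator).text)
    return results


def run_with_chrome(start_url: str) -> List[str]:
    """Hypothetical wiring for a real browser; nothing calls this automatically."""
    # Imported here so the helpers above stay usable without a browser.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    list_locator = (By.CLASS_NAME, "table-list-item")
    detail_locator = (By.CLASS_NAME, "item-details-info__url")

    driver = webdriver.Chrome()  # assumption: the driver binary is discoverable
    try:
        driver.get(start_url)
        # Wait for the list to appear instead of sleeping for a fixed time.
        WebDriverWait(driver, 10).until(
            EC.presence_of_element_located(list_locator)
        )
        return scrape_details(driver, list_locator, detail_locator)
    finally:
        driver.quit()
```

Calling `run_with_chrome` with the dealroom URL from the question should return one string per company page, provided the class names are still the ones used in the question. If a detail page renders its content late, a second `WebDriverWait` before `find_element` may be needed.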
Tags: python selenium loops web-scraping href