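【Posted】: 2021-01-02 11:23:10
【Problem description】:
An error occurs at random in my Python Selenium project, in which I scrape data from a website. It collects the date, temperature, wind, and rainfall. The script sometimes runs fine, but sometimes this error pops up:
selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: element is not attached to the page document (Session info: chrome=84.0.4147.141)
Full code: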
from selenium import webdriver
from selenium.webdriver.common.by import By
import pandas as pd
from datetime import datetime
import schedule
import time

def job():
    url="https://pent.no/60.19401,11.09936"
    dates = "forecast-day-view-date-bar__date"
    times = "forecast-hour-view-hour-label"
    temps = "forecast-hour-view-weather-widget__temperature"
    winder = "forecast-hour-view-weather-widget__wind-speed"
    rainfalls = "forecast-hour-view-weather-widget__precipitation"
    driver = webdriver.Chrome()
    driver.get(url)
    date = driver.find_elements_by_class_name(dates)
    i = 0
    for klikk in dates:
        date[i].click()
        i = i +1
        if i==len(date):
            break
    time = driver.find_elements_by_class_name(times)
    temp = driver.find_elements_by_class_name(temps)
    temp2 = temp[::2]
    temp3 = temp[1::2]
    wind = driver.find_elements_by_class_name(winder)
    wind2 = wind[::2]
    wind3 = wind[1::2]
    rainfall = driver.find_elements_by_class_name(rainfalls)
    rainfall2 = rainfall[::2]
    rainfall3 = rainfall[1::2]
    a = []
    b = []
    c = []
    d = []
    e = []
    f = []
    g = []
    h = []
    #
    for datoer in date:
        print(datoer.text)
        a.append(datoer.text)
        a.extend([""]*23)
    df1 = pd.DataFrame(a, columns= ["Date"])
    print(df1)
    #
    for tider in time:
        print(tider.text)
        b.append(tider.text)
    df2 = pd.DataFrame(b, columns= ["Time"])
    #
    for tempyr in temp2:
        print(tempyr.text)
        c.append(tempyr.text)
    df3 = pd.DataFrame(c, columns= ["Temp Yr"])
    for tempstorm in temp3:
        print(tempstorm.text)
        d.append(tempstorm.text)
    df4 = pd.DataFrame(d, columns= ["Temp Storm"])
    #
    for windyr in wind2:
        print(windyr.text)
        e.append(windyr.text)
    df5 = pd.DataFrame(e, columns= ["Wind Yr"])
    for windstorm in wind3:
        print(windstorm.text)
        f.append(windstorm.text)
    df6 = pd.DataFrame(f, columns= ["Wind Storm"])
    #
    for rainfallyr in rainfall2:
        g.append(rainfallyr.text)
        print(rainfallyr.text)
    df7 = pd.DataFrame(g, columns= ["Rainfall Yr"])
    df7 = df7.replace(r'^\s*$', "0.0 mm", regex=True)
    for rainfallstorm in rainfall3:
        h.append(rainfallstorm.text)
        print(rainfallstorm.text)
    df8 = pd.DataFrame(h, columns= ["Rainfall Storm"])
    df8 = df8.replace(r'^\s*$', "0.0 mm", regex=True)
    #
    tabell = [df1, df2, df3, df4, df5, df6, df7, df8]
    result = pd.concat(tabell, axis=1)
    result.to_excel("weather" + str(int(datetime.now().day)) + ".xlsx")
    driver.quit()

schedule.every().day.at("00:00").do(job)
while 1:
    schedule.run_pending()
    time.sleep(1)
【Discussion】:
-
Not sure why you are looping over a string to perform the clicks => for klikk in dates, where dates = "forecast-day-view-date-bar__date".
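To illustrate this comment: iterating over dates walks through the characters of the class-name string, so the loop in the question only stops because of the i == len(date) check. Below is a minimal sketch of iterating over the located elements directly. FakeElement is a made-up stand-in for a Selenium WebElement so the snippet runs without a browser; it is not part of Selenium. On the real page, a click may itself trigger a re-render, so this alone may not remove the stale-element error.

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (illustration only)."""

    def __init__(self, label):
        self.label = label
        self.clicked = False

    def click(self):
        self.clicked = True


dates = "forecast-day-view-date-bar__date"  # the class name: a plain string
date = [FakeElement("day 1"), FakeElement("day 2"), FakeElement("day 3")]

# Looping over the string yields one character per step, not one element.
print(len(dates), "characters vs", len(date), "elements")

# Looping over the list of elements needs no counter and no break.
for element in date:
    element.click()

clicked_labels = [element.label for element in date if element.clicked]
print(clicked_labels)
```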
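-
selenium.common.exceptions.StaleElementReferenceException occurs when your element has been removed from the DOM. This mostly happens on dynamic websites where the page refreshes its data. In your code, it can happen on any call to a WebElement made after the elements were collected (i.e. click, text).
-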
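There is no "easy" way to solve this. I have implemented a wrapper on top of selenium to avoid this and collect fresh elements. In your case, try to move the calls on the elements (i.e. text) closer to where the elements are retrieved (find_by..).
-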
@nic I have tried your tips, but the error persists. Do you have a link on how to implement a selenium wrapper in Python?
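The wrapper mentioned above is not shown in this thread, so the following is only a rough sketch of one common approach, not nic's implementation: bundle "locate the elements" and "read from them" into a single callable, and run that callable again if the page re-renders in between. The names call_with_retry, FakeStaleError and FlakyPage are all made up for illustration; the demo uses a toy page so it runs without a browser.

```python
import time


def call_with_retry(action, retry_on, attempts=3, delay=0.5):
    """Run action() and retry it when a retry_on exception is raised.

    action should both locate the elements and read from them, so that
    every attempt works on freshly located elements.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on:
            if attempt == attempts:
                raise          # out of attempts: surface the last error
            time.sleep(delay)  # give the page a moment to settle


# Toy demonstration without a browser (everything below is hypothetical).
class FakeStaleError(Exception):
    """Stand-in for selenium's StaleElementReferenceException."""


class FlakyPage:
    """Pretends the page re-rendered during the first read only."""

    def __init__(self, texts):
        self._texts = texts
        self.reads = 0

    def read_texts(self):
        self.reads += 1
        if self.reads == 1:
            raise FakeStaleError("stale element reference")
        return list(self._texts)


page = FlakyPage(["-3", "-4"])
texts = call_with_retry(page.read_texts, retry_on=FakeStaleError, delay=0)
print(texts)  # ['-3', '-4']
```

With Selenium, action could be something like lambda: [el.text for el in driver.find_elements(By.CLASS_NAME, temps)] with retry_on=StaleElementReferenceException (imported from selenium.common.exceptions). The helper returns plain strings rather than WebElement objects, so nothing that can go stale is kept around after the read.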