【Question title】: Python selenium error randomly occurs: element is not attached to the page document
【Posted】: 2021-01-02 11:23:10
【Question】:

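An error occurs randomly in my Python selenium project, in which I scrape data from a website. The script collects the date, temperature, wind and rainfall. It sometimes runs fine, but sometimes this error pops up: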

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: element is not attached to the page document (Session info: chrome=84.0.4147.141)

Full code:

from selenium import webdriver
from selenium.webdriver.common.by import By
import pandas as pd
from datetime import datetime
import schedule
import time

def job():
    url="https://pent.no/60.19401,11.09936"

    dates = "forecast-day-view-date-bar__date"
    times = "forecast-hour-view-hour-label"
    temps = "forecast-hour-view-weather-widget__temperature"
    winder = "forecast-hour-view-weather-widget__wind-speed"
    rainfalls = "forecast-hour-view-weather-widget__precipitation"

    driver = webdriver.Chrome()
    driver.get(url)

    date = driver.find_elements_by_class_name(dates)
    i = 0
    
    for klikk in dates:
        date[i].click()
        i = i +1
        if i==len(date):
            break

    time = driver.find_elements_by_class_name(times)

    temp = driver.find_elements_by_class_name(temps)
    temp2 = temp[::2]
    temp3 = temp[1::2]

    wind = driver.find_elements_by_class_name(winder)
    wind2 = wind[::2]
    wind3 = wind[1::2]

    rainfall = driver.find_elements_by_class_name(rainfalls)
    rainfall2 = rainfall[::2]
    rainfall3 = rainfall[1::2]

    a = []
    b = []
    c = []
    d = []
    e = []
    f = []
    g = []
    h = []

    #    
    for datoer in date:
        print(datoer.text)
        a.append(datoer.text)
        a.extend([""]*23)

    df1 = pd.DataFrame(a, columns= ["Date"])
    print(df1)
        
    #
    for tider in time:
        print(tider.text)
        b.append(tider.text)
        
    df2 = pd.DataFrame(b, columns= ["Time"])
    #  
    for tempyr in temp2:
        print(tempyr.text)
        c.append(tempyr.text)
        
    df3 = pd.DataFrame(c, columns= ["Temp Yr"])

    for tempstorm in temp3:
        print(tempstorm.text)
        d.append(tempstorm.text)
        
    df4 = pd.DataFrame(d, columns= ["Temp Storm"])
    #   
    for windyr in wind2:
        print(windyr.text)
        e.append(windyr.text)
        
    df5 = pd.DataFrame(e, columns= ["Wind Yr"])

    for windstorm in wind3:
        print(windstorm.text)
        f.append(windstorm.text)
        
    df6 = pd.DataFrame(f, columns= ["Wind Storm"])
    #   
    for rainfallyr in rainfall2:
        g.append(rainfallyr.text)
        print(rainfallyr.text)
        
    df7 = pd.DataFrame(g, columns= ["Rainfall Yr"])
    df7 = df7.replace(r'^\s*$', "0.0 mm", regex=True)
      
    for rainfallstorm in rainfall3:
        h.append(rainfallstorm.text)
        print(rainfallstorm.text)
        
    df8 = pd.DataFrame(h, columns= ["Rainfall Storm"])
    df8 = df8.replace(r'^\s*$', "0.0 mm", regex=True)
    #
    tabell = [df1, df2, df3, df4, df5, df6, df7, df8]
    result = pd.concat(tabell, axis=1)

    result.to_excel("weather" + str(int(datetime.now().day)) + ".xlsx")

            
    driver.quit()
    
schedule.every().day.at("00:00").do(job)

while 1:
    schedule.run_pending()
    time.sleep(1)

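【Comments】:

  • Not sure why you are looping over a string to perform the clicks => for klikk in dates, where dates = "forecast-day-view-date-bar__date".
  • selenium.common.exceptions.StaleElementReferenceException occurs when your element has been removed from the DOM. It mostly happens on dynamic websites where the page refreshes its data. In your code, it can happen on any call made on a WebElement after the elements have been collected (i.e. click, text).
  • There is no "easy" way to solve this. I have implemented a wrapper on top of selenium to avoid it and fetch fresh elements. In your case, try to move the calls on the elements (i.e. text) closer to where the elements are retrieved (find_by..).
  • @nic I have tried your tips, but the error persists. Do you have a link on how to implement a selenium wrapper in Python?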
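
The wrapper mentioned in the comments is not shown in this thread. As a rough illustration of the idea only, here is a minimal retry helper. The name retry_on_stale and its parameters are hypothetical and not part of selenium. It assumes the action passed in locates the elements again on every call, so each retry works with fresh references:

```python
import time


def retry_on_stale(action, stale_exceptions, attempts=3, delay=0.5):
    """Call action() and retry when one of stale_exceptions is raised.

    action should both locate the elements and read from them, so that
    every retry starts from a fresh lookup instead of a stale reference.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return action()
        except stale_exceptions as error:
            last_error = error
            time.sleep(delay)  # give the page a moment to settle
    raise last_error


# Possible use with the question's code (assumes selenium is installed):
#
# from selenium.common.exceptions import StaleElementReferenceException
#
# texts = retry_on_stale(
#     lambda: [el.text for el in driver.find_elements_by_class_name(temps)],
#     (StaleElementReferenceException,),
# )
```

Locating and reading inside the same callable is what matters here: retrying .text on an element that has already gone stale would keep failing.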

Tags: python selenium


【Answer 1】:

What confuses me is that you are looping through dates, which is a string in your code:

dates = "forecast-day-view-date-bar__date"
date = driver.find_elements_by_class_name(dates)

for klikk in dates:
    # 1st iteration -> klikk = 'f'
    # 2nd iteration -> klikk = 'o'
    date[i].click()
    i = i + 1
    if i == len(date):
        break

Since you want to toggle all the tabs on the page, I think you can do this instead:

dates = "forecast-day-view-date-bar__date"
dates = driver.find_elements_by_class_name(dates) # assign as dates not date

for date in dates:
    date.click()

As for your actual problem, it is exactly what @Nic Laforge said. Move these lines:

for datoer in date:
    print(datoer.text)
    a.append(datoer.text)
    a.extend([""]*23)

before your loop:

for klikk in dates:
    date[i].click()
    i = i +1
    if i==len(date):
        break
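
A variation on the same idea, as a sketch rather than a tested fix for this particular site: locate the tab elements again before every click and read each label before clicking it, so no reference is held across a click that may re-render the page. The function name click_each_by_index is hypothetical. find_elements_by_class_name is the Selenium 3 style call used in the question:

```python
def click_each_by_index(driver, class_name):
    """Click every element with class_name, re-locating before each click.

    Returns the text of each element, read just before it is clicked.
    """
    count = len(driver.find_elements_by_class_name(class_name))
    labels = []
    for index in range(count):
        # Fresh lookup: references from an earlier iteration may be stale.
        elements = driver.find_elements_by_class_name(class_name)
        if index >= len(elements):
            break  # the page now shows fewer elements than at the start
        element = elements[index]
        labels.append(element.text)  # read the text before clicking
        element.click()
    return labels


# Possible use with the question's variables:
# a = click_each_by_index(driver, "forecast-day-view-date-bar__date")
```

This costs one extra lookup per tab, which is usually negligible next to the page load.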

【Comments】:

  • The simple fix was to change the time.sleep command to 60 seconds instead of 1 second. Oddly, the error no longer occurs.