Selenium Python issue with getting elements in a loop

【Question title】: Selenium Python issue with getting elements in a loop
【Posted】: 2018-10-30 08:11:42
【Question】:

soup = BeautifulSoup(browser.page_source, "html.parser")
for h1 in soup.find_all('h2'):
    try:
        array.append("https://www.chamberofcommerce.com" + h1.find("a")['href'])
        print("https://www.chamberofcommerce.com" + h1.find("a")['href'])
    except:
        pass

input=browser.find_element_by_xpath('//a[@class="next"]')
while input:
    input.click()
    time.sleep(10)
    soup = BeautifulSoup(browser.page_source, "html.parser")

    for h1 in soup.find_all('h2'):
        try:
            array.append("https://www.chamberofcommerce.com" + h1.find("a")['href'])
            print("https://www.chamberofcommerce.com" + h1.find("a")['href'])
        except:
            pass

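This code scrapes the URLs of listings on Yellow Pages. It worked fine while I was only scraping URLs from the first page of the search results. Now I want it to click the Next button until the search pages run out. For example, if a search has 20 pages, the Selenium bot should keep clicking Next and scraping URLs until it reaches page 20.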

Please review the logic of the code. The search actually has 15 pages, but the bot crashes as soon as it reaches page 2 with the following error:

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: element is not attached to the page document

【Question comments】:

Tags: python-3.x selenium beautifulsoup

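【Solution 1】:

`while input` is not what you need. Note that once you click the Next button, a new page loads and every WebElement obtained on the previous page is no longer valid: you have to locate the elements again on each page. Try the following: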

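from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By

while True:
    try:
        # Look the link up again on each page; references from the old page are stale.
        browser.find_element(By.XPATH, '//a[@class="next"]').click()
    except NoSuchElementException:
        break  # no "next" link left: last page reached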

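With the code above you should be able to click the Next button on every page for as long as it is available. You may also need to apply an explicit wait so that the Next button is clickable before you click it: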

wait.until(EC.element_to_be_clickable((By.XPATH, '//a[@class="next"]'))).click()
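
The loop in this answer only clicks; the question also needs to scrape each page before moving on. A minimal sketch of that overall flow follows, written with plain callables so the control flow is visible on its own. The names `scrape_all_pages`, `get_page_source`, `extract_links`, `click_next` and `max_pages` are hypothetical and not part of Selenium; `extract_links` stands for the BeautifulSoup loop from the question.

```python
from typing import Callable, Iterable, List


def scrape_all_pages(
    get_page_source: Callable[[], str],
    extract_links: Callable[[str], Iterable[str]],
    click_next: Callable[[], bool],
    max_pages: int = 1000,
) -> List[str]:
    """Collect links from every result page, following "next" until it is gone.

    All three callables are hypothetical hooks, not Selenium APIs:
      get_page_source() returns the HTML of the page currently shown,
      extract_links(html) yields the listing URLs found in that HTML,
      click_next() re-locates and clicks the "next" link and returns False
      when there is no such link (that is, on the last page).
    """
    links: List[str] = []
    for _ in range(max_pages):  # safety cap so a misbehaving site cannot loop forever
        # Parse the page we are on before navigating away from it.
        links.extend(extract_links(get_page_source()))
        # Nothing located on this page is reused after the click.
        if not click_next():
            break
    return links


# Possible wiring with the names used in this thread (not executed here):
#
#   def click_next():
#       try:
#           browser.find_element_by_xpath('//a[@class="next"]').click()
#       except Exception:
#           return False
#       return True
#
#   array = scrape_all_pages(lambda: browser.page_source, extract_links, click_next)
```

Because the first page is handled by the same loop body as the later ones, the duplicated scraping block from the question is no longer needed.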

【Comments】:

【Solution 2】:

Use an explicit wait:

from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

...

t = 10  # timeout in seconds

try:
    element = WebDriverWait(driver, t).until(
        EC.element_to_be_clickable((By.XPATH, "//a[@class='next']"))
    )
except TimeoutException:
    # handle element not found or unclickable
    pass
else:
    # only click when the wait actually returned an element
    element.click()

...
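
An explicit wait can also take the place of the fixed `time.sleep(10)` from the question. The sketch below clicks Next and then waits for the old link to go stale, which is one way to tell that the old page is gone. It rests on an assumption: that the click replaces the document, which the `StaleElementReferenceException` in the question suggests but does not prove. `make_click_next` and `NEXT_XPATH` are hypothetical names; the Selenium calls used are `WebDriverWait`, `EC.element_to_be_clickable` and `EC.staleness_of`.

```python
NEXT_XPATH = '//a[@class="next"]'  # locator taken from the answers in this thread


def make_click_next(browser, timeout=10):
    """Return a function that clicks "next" and waits for the old page to go away.

    Selenium is imported inside this function only so that the sketch can be
    imported on a machine where Selenium is not installed.
    """
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    def click_next():
        wait = WebDriverWait(browser, timeout)
        try:
            # Look the link up again on every call: a reference kept from the
            # previous page would raise StaleElementReferenceException.
            link = wait.until(EC.element_to_be_clickable((By.XPATH, NEXT_XPATH)))
        except TimeoutException:
            return False  # no clickable "next" link: assume this is the last page
        link.click()
        try:
            # Assumption: the click replaces the document, so the old link
            # becomes stale once the next page has taken over.
            wait.until(EC.staleness_of(link))
        except TimeoutException:
            return False  # the page did not change; stop rather than re-scrape it
        return True

    return click_next


# Possible use (not executed here):
#
#   click_next = make_click_next(browser)
#   while click_next():
#       pass  # parse browser.page_source for the newly loaded page here
```

If the site updated the results in place without replacing the link, the staleness wait would time out and the loop would stop early, so the condition may need to be swapped for one that fits the page.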

【Comments】: