【Question title】: Looping through elements with Selenium and ChromeDriver
【Posted】: 2022-02-12 07:21:10
【Question】:

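I am stuck on the following problem. I am trying to collect data from this web page: https://localhelp.healthcare.gov/#/results?q=UTAH&lat=0&lng=0&city=&state=UT&zip_code=&mp=FFM

My approach is to use the Selenium Chrome driver to collect the data for each healthcare agent listed on the page, but I can't work out how to loop through each record and append its data to the lists I created. So far I can collect the data for one record; the problem is my loop. How do I identify each record as an agent and add it to my data frame for output? Here is my code: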

from selenium import webdriver  # connect python with webbrowser-chrome
import time
import pandas as pd
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome('C:/Users/picka/Documents/chromedriver.exe')
driver.maximize_window()

url = 'https://localhelp.healthcare.gov/#/results?q=UTAH&lat=0&lng=0&city=&state=UT&zip_code=&mp=FFM'

name = []
phone = []
email = []

def go_to_network():
    driver.get(url)

    for agent in driver.find_elements_by_xpath('class.qa-flh-results-list'):
        
        get_name = (WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "span.qa-flh-resource-name"))).text)
        get_phone = (WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "a.qa-flh-resource-phone"))).text)
        get_email = (WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "a.ds-u-overflow--hidden.ds-u-truncate.ds-u-display--inline-block"))).text)

        name.append(get_name)
        phone.append(get_phone)
        email.append(get_email)


go_to_network()


record_output = {'Agent Name': name, 'Phone': phone, 'Email':  email}
df = pd.DataFrame(record_output)
df.to_csv(r'C:\Users\picka\Documents\Dev\Reports\Agent-data.csv', header=True, index=False)
print(df)

【Comments】:

    Tags: python-3.x selenium loops list-comprehension webdriverwait


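    【Solution 1】:

    To extract and print the name, phone number, and email of every agent with Selenium, use a list comprehension for each field and induce WebDriverWait for visibility_of_all_elements_located(), with the following locator strategies:

    • Code block: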

      driver.get('https://localhelp.healthcare.gov/#/results?q=UTAH&lat=0&lng=0&city=&state=UT&zip_code=&mp=FFM')
      get_name = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "span.qa-flh-resource-name")))]
      get_phone = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "a.qa-flh-resource-phone")))]
      get_email = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "a.ds-u-overflow--hidden.ds-u-truncate.ds-u-display--inline-block")))]
      for i,j,k in zip(get_name, get_phone, get_email):
        print(f"{i}'s' phone number is {j} and email is {k}")
      driver.quit()
      
    • Console output:

      Wesley Elton's' phone number is (801) 404 - 2424 and email is wes@wallbrokers.com
      Raquel Bell's' phone number is (801) 842 - 2870 and email is raquel.bell@enroll365.org
      Brandon Berglund's' phone number is (801) 981 - 9414 and email is Brandon@BerglundIns.com
      Steven Cochran's' phone number is (801) 800 - 8360 and email is steve.cochran@gbsbenefits.com
      victoria dang's' phone number is (801) 462 - 5190 and email is victoriawfg@yahoo.com
      Dan Jessop's' phone number is (435) 232 - 8833 and email is dejessop@hotmail.com
      Billy Gerdts's' phone number is (801) 280 - 1162 and email is bgerdts@gginsurancegroup.com
      Michael Saldana's' phone number is (801) 879 - 1032 and email is saldana.michael25@gmail.com
      Brandon Johnson's' phone number is (435) 249 - 0725 and email is brandon@msiagency.com
      Matthew Selph's' phone number is (801) 918 - 3945 and email is selph7@gmail.com
      
    • Note: you have to add the following imports:

      from selenium.webdriver.support.ui import WebDriverWait
      from selenium.webdriver.common.by import By
      from selenium.webdriver.support import expected_conditions as EC
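
    The list comprehensions above build three independent lists, so an agent with no email link would leave the lists out of step with each other. A per-record variant, closer to the loop in the question, is sketched here: locate each result card first, then look up the fields inside that card only. Treat it as a sketch under assumptions: `CARD_SELECTOR` is a hypothetical placeholder (the real container selector has to be read off the page), and `first_text`, `scrape_cards`, and `main` are illustrative names. The field selectors are the ones already used above.

```python
# "css selector" is the string value of selenium's By.CSS_SELECTOR; using the
# literal keeps the helpers below importable without Selenium installed.
CSS = "css selector"

URL = ("https://localhelp.healthcare.gov/#/results"
       "?q=UTAH&lat=0&lng=0&city=&state=UT&zip_code=&mp=FFM")

# HYPOTHETICAL: selector for one agent's result card. Inspect the page and
# replace it with the element that wraps a single record.
CARD_SELECTOR = "li.qa-flh-result"

# Field selectors taken from the answer above.
FIELDS = {
    "Agent Name": "span.qa-flh-resource-name",
    "Phone": "a.qa-flh-resource-phone",
    "Email": "a.ds-u-overflow--hidden.ds-u-truncate.ds-u-display--inline-block",
}


def first_text(card, selector):
    """Return the text of the first match inside `card`, or "" if none."""
    # find_elements (plural) returns an empty list instead of raising.
    matches = card.find_elements(CSS, selector)
    return matches[0].text if matches else ""


def scrape_cards(cards):
    """Build one dict per card, searching only within that card."""
    return [
        {label: first_text(card, selector) for label, selector in FIELDS.items()}
        for card in cards
    ]


def main():
    # Imported here so the helpers above can be tried without a browser.
    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    # Assumes a chromedriver that Selenium can find on this machine.
    driver = webdriver.Chrome()
    try:
        driver.get(URL)
        cards = WebDriverWait(driver, 20).until(
            EC.visibility_of_all_elements_located((CSS, CARD_SELECTOR))
        )
        for row in scrape_cards(cards):
            print(row)
    finally:
        driver.quit()
```

    Calling `main()` runs the sketch against the live page. Because each lookup is scoped to a single card, a missing field becomes an empty string in that agent's record rather than shifting every later value up by one.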
      

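    【Comments】:

    • Awesome, that at least gets each record properly. But how do I add each agent record as its own row in my data frame for the CSV output?
    • That sounds like a different question. Could you please raise a new question with your new requirement?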
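
    On the follow-up about one row per agent: a minimal sketch, assuming the three lists have equal length, is to zip them into one dict per agent and write those rows out. It uses the standard-library `csv` module so that it runs without extra packages; the same list of dicts can be passed to `pd.DataFrame(rows)` if a data frame is wanted. `build_rows` and `write_csv` are illustrative names, not part of Selenium or pandas.

```python
import csv

COLUMNS = ["Agent Name", "Phone", "Email"]


def build_rows(names, phones, emails):
    """Combine three parallel lists into one dict per agent."""
    if not (len(names) == len(phones) == len(emails)):
        # Unequal lengths mean some agent is missing a field, so pairing by
        # position would attach values to the wrong agent.
        raise ValueError("name, phone and email lists differ in length")
    return [
        dict(zip(COLUMNS, record))
        for record in zip(names, phones, emails)
    ]


def write_csv(rows, path):
    """Write the rows to `path` as CSV with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```

    With the lists from the answer, usage would be `write_csv(build_rows(get_name, get_phone, get_email), "Agent-data.csv")`. The length check is a deliberate choice: `zip` on its own silently stops at the shortest list, which would hide a misaligned record.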