【Question title】: How to get contents for elements that have the same class
【Posted】: 2020-12-23 06:22:19
【Question description】:

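I am trying to extract product information using Selenium. This is the URL of the page: https://www.dell.com/en-us/shop/dell-laptops/sr/laptops/11th-gen-intel-core?appliedRefinements=23775

First, I found the parent class of the elements I am trying to scrape (computer model, CPU, and so on), which are contained in cards.

The parent class of the cards is "stack-system ps-stack", but when I try to find the list of elements with that class, the list is empty.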

from selenium import webdriver

driver = webdriver.Chrome()
url = "https://www.dell.com/en-us/shop/dell-laptops/sr/laptops/11th-gen-intel-core?appliedRefinements=23775"
classname_main = "stack-system ps-stack"
driver.get(url)
driver.implicitly_wait(50)
products = driver.find_elements_by_class_name("stack-system ps-stack")
print(products)

I would also like to get the contents of the cards.

【Question discussion】:

    Tags: python selenium xpath css-selectors webdriverwait


    【Solution 1】:

    The class_name locator does not accept a value containing spaces, i.e. multiple classes. Use css_selector instead:

    driver.find_elements_by_css_selector(".stack-system.ps-stack")
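
The conversion used above (each space-separated class becomes a `.name` segment with no spaces between segments) can be sketched as a small helper. `class_attr_to_css` is a hypothetical name introduced only for illustration, not part of Selenium, and it assumes plain class names that need no CSS escaping.

```python
def class_attr_to_css(class_attr, tag=""):
    """Turn a space-separated class attribute value into a compound CSS selector.

    Hypothetical helper for illustration; assumes the class names contain
    no characters that would need escaping in CSS.
    """
    classes = class_attr.split()
    if not classes:
        raise ValueError("class_attr must contain at least one class name")
    # ".a.b" (no spaces) means "one element that has both classes";
    # ".a .b" (with a space) would mean "a .b element inside an .a element".
    return tag + "".join("." + name for name in classes)


selector = class_attr_to_css("stack-system ps-stack")
print(selector)  # .stack-system.ps-stack
```

The resulting string can be passed to a CSS-selector lookup such as the one shown above.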
    

    【Discussion】:

      【Solution 2】:

      To extract the product names (for example, New Inspiron 14 5000 Laptop) with Selenium, you can use either of the following Locator Strategies:

      • Using css_selector and get_attribute("innerHTML"):

        print([my_elem.get_attribute("innerHTML") for my_elem in driver.find_elements_by_css_selector("article.stack-system.ps-stack h3 > a")])
        
      • Using xpath and the text attribute:

        print([my_elem.text for my_elem in driver.find_elements_by_xpath("//article[@class='stack-system ps-stack']//h3/a")])
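
One difference between the two strategies is worth a sketch: an XPath predicate of the form `[@class='...']` compares the whole attribute string, so it stops matching if the page adds or reorders classes, whereas a CSS class selector matches individual class tokens. A common XPath idiom for token matching is shown below; `xpath_has_class` is a hypothetical helper name used only for illustration.

```python
def xpath_has_class(tag, class_name):
    """Build an XPath matching `tag` elements whose class list contains `class_name`.

    Hypothetical helper for illustration. Padding the normalized class
    attribute with spaces lets contains() match whole class tokens only.
    """
    if not class_name or " " in class_name:
        raise ValueError("pass exactly one class name")
    return (
        f"//{tag}[contains(concat(' ', normalize-space(@class), ' '), "
        f"' {class_name} ')]"
    )


print(xpath_has_class("article", "ps-stack"))
```

Whether the extra robustness is needed depends on how stable the page's class attributes are; for a fixed attribute value the exact-match form is sufficient.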
        

      Ideally, you should induce WebDriverWait for visibility_of_all_elements_located(), using either of the following Locator Strategies:

      • Using CSS_SELECTOR and get_attribute("innerHTML"):

        print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "article.stack-system.ps-stack h3 > a")))])
        
      • Using XPATH and the text attribute:

        print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//article[@class='stack-system ps-stack']//h3/a")))])
        
      • Console output:

        ['New Inspiron 14 5000 Laptop', 'New Inspiron 14 5000 2-in-1 Laptop (Dune)', 'New Inspiron 15 5000 Laptop', 'New Inspiron 14 5000 Laptop', 'New Inspiron 14 5000 Laptop', 'New Inspiron 15 5000 Laptop', 'New Inspiron 14 5000 2-in-1 Laptop (Dune)', 'New Inspiron 14 5000 2-in-1 Laptop (Titan Grey)', 'New Inspiron 14 5000 2-in-1 Laptop (Titan Grey)', 'New Inspiron 13 7000 2-in-1 Laptop', 'New Inspiron 15 5000 Laptop', 'New Inspiron 15 7000 2-in-1 Laptop']
        
      • Note: You have to add the following imports:

        from selenium.webdriver.support.ui import WebDriverWait
        from selenium.webdriver.common.by import By
        from selenium.webdriver.support import expected_conditions as EC
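
Putting the wait, the locator and the imports together, a minimal end-to-end sketch might look like the following. The function names `clean_names` and `scrape_product_names` are hypothetical, the selector is the one from this answer, and the Selenium imports are placed inside the function only so that the small text helper can be used on its own. It assumes Chrome and a matching chromedriver are available.

```python
def clean_names(raw_texts):
    """Strip whitespace and drop empty entries, keeping order and duplicates."""
    return [t.strip() for t in raw_texts if t and t.strip()]


def scrape_product_names(url, selector="article.stack-system.ps-stack h3 > a", timeout=20):
    """Open `url`, wait until the product links are visible, return their text.

    Hypothetical wrapper for illustration; assumes a local Chrome setup.
    """
    # Imported here so clean_names() stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Raises TimeoutException if the links are not all visible in time.
        links = WebDriverWait(driver, timeout).until(
            EC.visibility_of_all_elements_located((By.CSS_SELECTOR, selector))
        )
        return clean_names(link.text for link in links)
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()


# Example call (launches a real browser, so it is left commented out):
# names = scrape_product_names("https://www.dell.com/en-us/shop/dell-laptops/sr/laptops/11th-gen-intel-core?appliedRefinements=23775")
```

An explicit wait like this replaces the `implicitly_wait(50)` call from the question, since it returns as soon as the condition holds.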
        

      【Discussion】:

        【Solution 3】:

        Use this CSS selector to get all of the text:

        driver.get(url)
            
        print(WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.CSS_SELECTOR,"div.no-div-lines-layout"))).text)
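
The `.text` of a container element comes back as one newline-separated string. If the individual lines are needed, a small post-processing step along these lines may help. `card_lines` is a hypothetical helper, and the sample string is made up for illustration rather than copied from the page.

```python
def card_lines(container_text):
    """Split the text of a container element into stripped, non-empty lines."""
    return [line.strip() for line in container_text.splitlines() if line.strip()]


# Illustrative input only; real text would come from the element's .text
sample = "New Inspiron 14 5000 Laptop\n\n   11th Gen Intel Core\n"
print(card_lines(sample))
```

Note that `visibility_of_element_located` returns a single element, so this approach yields the text of the whole results container rather than one entry per card.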
        

        【Discussion】:
