【Question title】: Grabbing all of the data inside a div with Selenium
【Posted】: 2023-03-07 07:49:01
【Question】:

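I want to programmatically get the names and prices of all the games in the top deals section of the GOG website.

I have selected the top deals section, but I'm not sure how to iterate over its contents (if that is even possible) to find each div that advertises a game and put the names and prices into lists.

Here is what I have so far. Is this possible? How would I go about it?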

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import time

driver = webdriver.Chrome(service=Service(r"../Downloads/chromedriver.exe"))

driver.get('https://gog.com')
wait = WebDriverWait(driver, 30)
time.sleep(30)  # give it a while to make sure it loads
# this is the top deals section
top_deals_section = driver.find_element(By.ID, "f0a67846-5310-11ea-ba0a-fa163eee4696")
names = []
prices = []
# look at every div inside the section and keep the ones that hold a title
for div in top_deals_section.find_elements(By.TAG_NAME, "div"):
    if div.get_attribute("class") == 'title-product_title_title':
        names.append(div.text)
    ## same for prices here

Where I got 'title-product_title_title' from

【Comments】:

    Tags: python html selenium


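    【Solution 1】:

    To get all the product names and prices under Hot Deals, use WebDriverWait() with visibility_of_element_located() to wait until the section has loaded, then use the XPaths below to collect the product names and prices.

    Note: some of the elements are not visible on the page, so use element.get_attribute("textContent") to read their values.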
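The note about hidden elements can be sketched as a small helper. This is only an illustration: `visible_or_raw_text` and `FakeElement` are hypothetical names, and the helper assumes the object it receives exposes `.text` and `.get_attribute()` the way a Selenium WebElement does. It prefers the rendered text and falls back to `textContent` when `.text` comes back empty, which is what happens for elements that are not displayed.

```python
def visible_or_raw_text(element):
    """Return the rendered text, falling back to the DOM textContent."""
    rendered = (element.text or "").strip()
    if rendered:
        return rendered
    # .text is empty for elements that are not displayed; textContent is not
    raw = element.get_attribute("textContent")
    return (raw or "").strip()


class FakeElement:
    """Stand-in for a WebElement so the sketch runs without a browser."""

    def __init__(self, text, text_content):
        self.text = text
        self._text_content = text_content

    def get_attribute(self, name):
        return self._text_content if name == "textContent" else None


# a hidden price tile: no rendered text, but the DOM still holds the value
hidden_price = FakeElement(text="", text_content="\n  10.00 \n")
print(visible_or_raw_text(hidden_price))
```

With a real driver, the same function could be called on each element returned by `find_elements`.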

    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC

    driver = webdriver.Chrome(service=Service(r"../Downloads/chromedriver.exe"))
    driver.get('https://gog.com')

    # XPath of the container that holds the "Hot Deals" section
    section = "//div[@class='container' and contains(.,'Hot Deals')]"
    # wait until the section is visible before reading from it
    WebDriverWait(driver, 30).until(EC.visibility_of_element_located((By.XPATH, section)))
    names = []
    prices = []

    # collect the title and discounted-price elements inside the section
    titles = driver.find_elements(By.XPATH, section + "//div[@class='product-tile__title']")
    discounted = driver.find_elements(By.XPATH, section + "//span[@class='product-tile__price-discounted _price']")
    for name, price in zip(titles, discounted):
        # textContent also returns the text of elements that are not visible
        names.append(name.get_attribute("textContent"))
        prices.append(price.get_attribute("textContent").strip())

    print(names)
    print(prices)
    

    Output

    ['Nova Drift', 'Shadow Tactics: Blades of the Shogun', "Baldur's Gate: Enhanced Edition", 'Fallout: New Vegas Ultimate Edition', 'Frostpunk', 'XCOM® 2', 'Neverwinter Nights 2 Complete', 'Diablo + Hellfire', 'Stardew Valley', 'Unforeseen Incidents', 'Crypt of the NecroDancer', 'BATTLETECH - Mercenary Collection', 'Blade Runner', 'The Surge', 'The Witcher 3: Wild Hunt - Game of the Year Edition', 'SWAT 4: Gold Edition', 'The Bureau: XCOM® Declassified™', 'Styx: Master of Shadows', 'Iratus: Lord of the Dead', 'Divinity: Original Sin 2 - Definitive Edition', "Heaven's Vault", 'Dishonored: Complete Collection', 'Thronebreaker: The Witcher Tales', 'Vampire®: The Masquerade - Bloodlines™', 'Whispers of a Machine', 'Grim Dawn', 'Children of Morta', 'Through the Ages', 'Kingdom Come: Deliverance Royal Edition', 'Imperator: Rome', 'Outward', 'Crying Suns', 'Age of Wonders: Planetfall', 'GreedFall', 'Heroes of Might and Magic® 3: Complete', 'Deus Ex™ GOTY Edition']
    ['7.69', '8.79', '7.69', '7.49', '10.00', '8.79', '7.69', '6.59', '8.79', '10.39', '2.29', '23.79', '6.99', '6.19', '10.49', '4.00', '2.99', '5.00', '12.59', '15.00', '11.99', '21.99', '8.49', '7.69', '4.59', '4.00', '12.99', '6.19', '26.29', '23.49', '17.49', '15.59', '20.99', '29.49', '2.19', '0.69']
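The two printed lists line up only by position, and `zip` silently stops at the shorter one, so a tile without a discounted price would go unnoticed. A minimal sketch of a safer pairing step follows. It assumes the scraped prices are plain decimal strings like the ones in the output, and `pair_deals` is a hypothetical helper name. The sample values are taken from the output above.

```python
from decimal import Decimal


def pair_deals(names, prices):
    """Pair each name with its price, refusing lists of different length."""
    if len(names) != len(prices):
        raise ValueError(
            f"got {len(names)} names but {len(prices)} prices; "
            "the lists no longer line up"
        )
    # Decimal keeps the prices exact, unlike float
    return [(name, Decimal(price)) for name, price in zip(names, prices)]


# three name/price pairs copied from the printed output
names = ['Nova Drift', 'Shadow Tactics: Blades of the Shogun', 'Crypt of the NecroDancer']
prices = ['7.69', '8.79', '2.29']

deals = pair_deals(names, prices)
cheapest = min(deals, key=lambda deal: deal[1])
print(f"{cheapest[0]}: {cheapest[1]}")
```

Raising on a length mismatch is a design choice for this sketch; logging the mismatch and continuing would be an equally reasonable alternative.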
    

    【Comments】:

      【Solution 2】:

      Please refer to the following solution:

      elements = WebDriverWait(driver, 20).until(
          EC.presence_of_all_elements_located((By.XPATH, "//div[contains(@class,'custom-section custom-section--triplet')]//div[contains(@class,'product-tile__title')]")))
      # print the title of every product tile found in the section
      for product in elements:
          print(product.text)
      

      Note: you need to add the imports below.

      from selenium.webdriver.support.ui import WebDriverWait
      from selenium.webdriver.support import expected_conditions as EC
      from selenium.webdriver.common.by import By
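The XPath in this solution relies on `contains(@class, ...)`, which is a substring match and therefore also matches longer class names that merely start with the same text. A possible variation, not part of the original answer, is the common XPath 1.0 idiom that matches one whole class token. The sketch below only builds the expression string; `has_class` and `titles_xpath` are hypothetical helper names, and matching the section on the single class `custom-section--triplet` is an assumption about the page.

```python
def has_class(name):
    """XPath 1.0 predicate that matches one whole class token."""
    return f"contains(concat(' ', normalize-space(@class), ' '), ' {name} ')"


def titles_xpath(section_class="custom-section--triplet",
                 title_class="product-tile__title"):
    """Build the XPath for the title divs inside the section."""
    return f"//div[{has_class(section_class)}]//div[{has_class(title_class)}]"


print(titles_xpath())
```

The resulting string can be passed as the second item of the `(By.XPATH, ...)` tuple in place of the literal expression.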
      

      【Comments】:
