【Question title】: Reading elements from HTML with Selenium
【Posted】: 2015-07-06 14:00:49
【Question】:

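I am trying to use Selenium to record every item on the TF2 market. I want to write the name of each item for sale to a file. This is the link to the page. I think this is the relevant tag; I just don't know how to reference it and write the names to a text file, one name per line.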

<span id="result_0_name" class="market_listing_item_name" style="color: #7D6D00;">

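Edit 1:

I used alecxe's solution and it works for the first page. Now I am trying to make it click the Next button and then run again, but to no avail. Here is what I am trying.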

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time

from selenium import webdriver
url="http://steamcommunity.com/market/search?appid=440#p1_popular_desc"
driver = webdriver.Firefox()
driver.get(url)

x=1
while x==1:
    WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.market_listing_row")))
    time.sleep(5)
    results = [item.text for item in driver.find_elements_by_css_selector("div.market_listing_row .market_listing_item_name")]
    time.sleep(5)
    driver.find_element_by_id('searchResults_btn_next').click()
    with open("output.dat", "a") as f:
        for item in results:
            f.write(item + "\n")

This produces the following error:

Traceback (most recent call last):
  File "name.py", line 14, in <module>
    results = [item.text for item in driver.find_elements_by_css_selector("div.market_listing_row .market_listing_item_name")]
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webelement.py", line 61, in text
    return self._execute(Command.GET_ELEMENT_TEXT)['value']
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webelement.py", line 402, in _execute
    return self._parent.execute(command, params)
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 175, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/errorhandler.py", line 166, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.StaleElementReferenceException: Message: Element is no longer attached to the DOM
Stacktrace:
    at fxdriver.cache.getElementAt (resource://fxdriver/modules/web-element-cache.js:8956)
    at Utils.getElementAt (file:///tmp/tmpUpLsV7/extensions/fxdriver@googlecode.com/components/command-processor.js:8546)
    at WebElement.getElementText (file:///tmp/tmpUpLsV7/extensions/fxdriver@googlecode.com/components/command-processor.js:11704)
    at DelayedCommand.prototype.executeInternal_/h (file:///tmp/tmpUpLsV7/extensions/fxdriver@googlecode.com/components/command-processor.js:12274)
    at DelayedCommand.prototype.executeInternal_ (file:///tmp/tmpUpLsV7/extensions/fxdriver@googlecode.com/components/command-processor.js:12279)
    at DelayedCommand.prototype.execute/< (file:///tmp/tmpUpLsV7/extensions/fxdriver@googlecode.com/components/command-processor.js:12221)

Any help would be much appreciated, even if it is just a link to a guide.

【Comments】:

    Tags: python file selenium selenium-webdriver web-scraping


    【Solution 1】:

    You can get the names from the elements with the market_listing_item_name class, which are located inside the div elements with the market_listing_row class:

    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    
    from selenium import webdriver
    
    url = "http://steamcommunity.com/market/search?appid=440"
    driver = webdriver.Chrome()
    driver.get(url)
    
    # wait for results
    WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.market_listing_row")))
    
    results = [item.text for item in driver.find_elements(By.CSS_SELECTOR, "div.market_listing_row .market_listing_item_name")]
    
    driver.quit()
    
    # dump results to a file
    with open("output.dat", "w") as f:
        for item in results:
            f.write(item + "\n")
    

    Here are the contents of the output.dat file after running the script:

    Mann Co. Supply Crate Key
    The Powerhouse Weapons Case
    The Concealed Killer Weapons Case
    Earbuds
    Bill's Hat
    Gun Mettle Campaign Pass
    Tour of Duty Ticket
    Genuine AWPer Hand
    Specialized Killstreak Kit
    Gun Mettle Key
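
The file-dump step can also be sketched as a small reusable helper. This is a minimal sketch rather than the answerer's code: the `write_names` name, the `append` flag, and the UTF-8 encoding are assumptions, and it targets Python 3. Opening the file in text mode with an explicit encoding should keep item names with non-ASCII characters intact.

```python
def write_names(names, path, append=False):
    """Write one item name per line and return how many lines were written."""
    # "a" keeps earlier pages in the file; "w" starts the file from scratch.
    mode = "a" if append else "w"
    with open(path, mode, encoding="utf-8") as f:
        for name in names:
            f.write(name + "\n")
    return len(names)
```

With the list from the script, a call such as `write_names(results, "output.dat")` would replace the final `with` block, and `append=True` would suit a loop that writes one page at a time.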
    

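    【Discussion】:

    • @DanielPrinsloo I used the browser developer tools and the "Inspect Element" feature, and also applied some CSS knowledge.
    • Do you know of any good online guides for learning how CSS works?
    • Please take a look at the edit, as I have tried to modify it further.
    • Never mind, I just needed to add a sleep so that the page could load fully.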
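
The stale-element error in the question's edit usually means the result rows were replaced while the script was still reading them, which is probably why a fixed sleep helped. An alternative to sleeping is to click Next, wait until an old row goes stale, and only then read the new rows. The sketch below is hedged: `collect_names`, `selenium_steps`, and `max_pages` are hypothetical names, the Selenium part is untested against the live site, and the `searchResults_btn_next` id is taken from the question rather than verified.

```python
def collect_names(read_names, go_to_next_page, max_pages):
    """Gather names page by page using two injected callables.

    read_names() returns the names on the current page;
    go_to_next_page() returns False when it could not advance.
    """
    names = []
    for _ in range(max_pages):
        names.extend(read_names())
        if not go_to_next_page():
            break
    return names


def selenium_steps(driver, timeout=10):
    """Build the two callables for a live Selenium driver (untested sketch)."""
    # Imported lazily so collect_names can be used without Selenium installed.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    row = (By.CSS_SELECTOR, "div.market_listing_row")
    name = (By.CSS_SELECTOR, "div.market_listing_row .market_listing_item_name")

    def read_names():
        # Wait for at least one visible row before reading the names.
        WebDriverWait(driver, timeout).until(EC.visibility_of_element_located(row))
        return [el.text for el in driver.find_elements(*name)]

    def go_to_next_page():
        old_row = driver.find_element(*row)
        driver.find_element(By.ID, "searchResults_btn_next").click()
        try:
            # The old row going stale signals that the results were re-rendered.
            WebDriverWait(driver, timeout).until(EC.staleness_of(old_row))
        except TimeoutException:
            return False  # nothing changed; assume this was the last page
        return True

    return read_names, go_to_next_page
```

A possible call would be `read, advance = selenium_steps(driver)` followed by `names = collect_names(read, advance, max_pages=5)`. Keeping the loop separate from the browser calls means the paging logic can be checked with plain functions, and the `max_pages` bound replaces the question's endless `while x==1` loop.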