【发布时间】:2019-06-04 01:02:46
【问题描述】:
我正在运行 chromedriver 来尝试从网站上抓取一些数据。没有无头选项,一切正常。但是,当我添加选项时,webdriver 需要很长时间才能加载 url,并且当我尝试查找一个元素(在没有 --headless 的情况下运行时找到)时,我收到一个错误。
使用打印语句并在url“加载”后获取html,我发现没有html,它是空的(见下面的输出)。
class Fidelity:
def __init__(self):
self.url = 'https://eresearch.fidelity.com/eresearch/gotoBL/fidelityTopOrders.jhtml'
self.options = Options()
self.options.add_argument("--headless")
self.options.add_argument("--window-size=1500,1000")
self.driver = webdriver.Chrome(executable_path='.\\dependencies\\chromedriver.exe', options = self.options)
print("init")
def initiate_browser(self):
self.driver.get(self.url)
time.sleep(5)
script = self.driver.execute_script("return document.documentElement.outerHTML")
print(script)
print("got url")
def find_orders(self):
wait = WebDriverWait(self.driver, 15)
data= wait.until(ec.visibility_of_element_located((By.CSS_SELECTOR, '[id*="t_trigger_TSLA"]'))) #ERROR ON THIS LINE
这是整个输出:
init
<html><head></head><body></body></html>
url
Traceback (most recent call last):
File "C:\Users\Zachary\Documents\Python\Tesla Stock Info\Scraper.py", line 102, in <module>
orders = scrape.find_tesla_orders()
File "C:\Users\Zachary\Documents\Python\Tesla Stock Info\Scraper.py", line 75, in find_tesla_orders
tesla = self.driver.find_element_by_xpath("//a[@href='https://qr.fidelity.com/embeddedquotes/redirect/research?symbol=TSLA']")
File "C:\Program Files (x86)\Python37-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 394, in find_element_by_xpath
return self.find_element(by=By.XPATH, value=xpath)
File "C:\Program Files (x86)\Python37-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 978, in find_element
'value': value})['value']
File "C:\Program Files (x86)\Python37-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "C:\Program Files (x86)\Python37-32\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//a[@href='https://qr.fidelity.com/embeddedquotes/redirect/research?symbol=TSLA']"}
(Session info: headless chrome=74.0.3729.169)
(Driver info: chromedriver=74.0.3729.6 (255758eccf3d244491b8a1317aa76e1ce10d57e9-refs/branch-heads/3729@{#29}),platform=Windows NT 10.0.17763 x86_64)
更新代码的新错误:
init
<html><head></head><body></body></html>
url
Traceback (most recent call last):
File "C:\Users\Zachary\Documents\Python\Tesla Stock Info\Scraper.py", line 104, in <module>
orders = scrape.find_tesla_orders()
File "C:\Users\Zachary\Documents\Python\Tesla Stock Info\Scraper.py", line 76, in find_tesla_orders
tesla = wait.until(ec.visibility_of_element_located((By.CSS_SELECTOR, '[id*="t_trigger_TSLA"]')))
File "C:\Program Files (x86)\Python37-32\lib\site-packages\selenium\webdriver\support\wait.py", line 80, in until
raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
我尝试通过谷歌找到答案,但没有任何建议有效。其他人对某些网站有这个问题吗?任何帮助表示赞赏。
更新
不幸的是,这个脚本仍然无法运行,webdriver 在无头时由于某种原因无法正确加载页面,即使在没有使用无头选项运行此脚本的情况下一切正常。
【问题讨论】:
-
试试firefox,我也从start bage开始得到chrome-vebdriver,忽略url和--headless
-
非常感谢。我不知道为什么我没有考虑尝试不同的浏览器。我一直用铬。不知道为什么有些网站不能使用 chrome headless。无论如何,谢谢。
标签: python selenium-webdriver selenium-chromedriver