【Posted】: 2022-09-23 22:55:48
【Problem description】:
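Even after calling enable_download_headless(driver, path), as suggested in the thread below, the file is still not downloaded correctly. The non-headless version always downloads the site's file correctly, but the headless version downloads an excerpt named "chargeinfo.xhtml", which is the last path segment of the download page's URL, "https://www.xxxxx.de/xxx/chargeinfo.xhtml". Interestingly, when I call the suggested enable_download_headless(driver, path) in non-headless mode, it also downloads "chargeinfo.xhtml".
In addition, a screenshot taken just before clicking Download shows the same page layout as in non-headless mode.
Any help is greatly appreciated.
Here is my driver setup: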
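To confirm the symptom described above, it can help to check whether the saved file is really the page markup rather than the expected export. The following is a minimal sketch, not part of the original post: `looks_like_html` is a hypothetical helper name, and the byte markers are a heuristic assumption (a legitimate XML export would also start with `<?xml`).

```python
from pathlib import Path

# Hypothetical helper (not from the original post): a heuristic check of the
# first bytes of a downloaded file to see whether it is an (X)HTML/XML page.
UTF8_BOM = b"\xef\xbb\xbf"
MARKUP_PREFIXES = (b"<?xml", b"<!doctype html", b"<html")


def looks_like_html(path, sniff_bytes=1024):
    """Return True if the file starts like an (X)HTML or XML document."""
    head = Path(path).read_bytes()[:sniff_bytes]
    # Drop an optional UTF-8 byte order mark, then leading whitespace.
    if head.startswith(UTF8_BOM):
        head = head[len(UTF8_BOM):]
    head = head.lstrip().lower()
    return head.startswith(MARKUP_PREFIXES)
```

If this returns True for the file saved in headless mode, the browser most likely stored the page response itself instead of the file produced by the Download button.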
import time

from selenium import webdriver


def cd_excerpt_from_uc():
    # Declare the driver options
    options = webdriver.ChromeOptions()
    # Run Chrome in headless mode
    options.add_argument("--headless")
    user_agent = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.50 Safari/537.36'
    options.add_argument(f'user-agent={user_agent}')
    options.add_argument('--ignore-certificate-errors')
    options.add_argument('--allow-running-insecure-content')
    options.add_argument("--window-size=1920,1080")
    driver_path = "path/to/chromedriver"
    driver = webdriver.Chrome(driver_path, options=options)
    # This call causes the non-headless version to also download "chargeinfo.xhtml"
    enable_download_headless(driver, "/Download/Path/")
    driver.get("https://www.xxxxx.de/xxx/chargeinfo.xhtml")
    # Give the page time to load before clicking the Download button
    time.sleep(10)
    driver.find_element('xpath', "//span[@class='ui-button-text ui-c' and contains(text(), 'Download')]").click()


def enable_download_headless(browser, download_dir):
    # Register ChromeDriver's "send_command" endpoint, then use it to send the
    # DevTools command Page.setDownloadBehavior so downloads go to download_dir
    browser.command_executor._commands["send_command"] = ("POST", '/session/$sessionId/chromium/send_command')
    params = {'cmd': 'Page.setDownloadBehavior', 'params': {'behavior': 'allow', 'downloadPath': download_dir}}
    browser.execute("send_command", params)
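For comparison, the same DevTools command that enable_download_headless sends can also be issued through Selenium's `execute_cdp_cmd` method on Chromium-based drivers, without patching `command_executor._commands`. The sketch below is an assumption-laden alternative, not the original poster's code: `build_download_params` and `allow_downloads` are hypothetical names, and turning the download directory into an absolute path is a precaution added here. The post does not confirm whether this changes the "chargeinfo.xhtml" behaviour.

```python
import os


def build_download_params(download_dir):
    """Build the arguments for the DevTools Page.setDownloadBehavior command."""
    # Assumption: an absolute path is used so the browser does not resolve
    # the directory relative to its own working directory.
    return {"behavior": "allow", "downloadPath": os.path.abspath(download_dir)}


def allow_downloads(driver, download_dir):
    """Send Page.setDownloadBehavior via the driver's execute_cdp_cmd method.

    `driver` is expected to be a Chromium-based Selenium WebDriver
    (for example webdriver.Chrome), which provides execute_cdp_cmd.
    """
    params = build_download_params(download_dir)
    # Create the target directory up front so the download has somewhere to go.
    os.makedirs(params["downloadPath"], exist_ok=True)
    driver.execute_cdp_cmd("Page.setDownloadBehavior", params)
    return params
```

In the setup above, `allow_downloads(driver, "/Download/Path/")` would take the place of the enable_download_headless call, before `driver.get(...)`.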
Tags: python selenium download web-crawler