Download a PDF after clicking a button whose href has no URL or .pdf extension

【Question title】: Download a PDF after the click button with href without url or .pdf
【Posted】: 2017-09-16 19:26:50
【Question description】:

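I am trying to save the content I get after simulating a click on the "PDF" link on a web page. When I do this, the PDF is downloaded, but I want to save it to a specific file. I read a bit about using urlretrieve from the urllib library, but I cannot get the URL of the PDF. Let me explain: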

<a class="at-actionDownloadPdfLink" href="/candidates/downloadSeekerDocument.aspx?sPath=private_0/resumes/4ykqgejxuh95ib6r">PDF</a>

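When I click the button, the PDF downloads without trouble, but I am having a lot of difficulty saving it to the right location. The code that triggers the click: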

submit3 = driver.find_element_by_id("linkResumeTitle")  
submit3.click()

Thanks

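【Comments】:

Can you add the code that downloads the PDF?
I edited my code. I use the beautifulsoup package to get to the right place (the element with ID LinkResumeTitle) and then click.
This looks like selenium code. How are python-requests and beautifulsoup related to this question?
Well, I could also use those packages for my task... if you have any idea how to do that.

Tags: python html pdf selenium-webdriver beautifulsoup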

【Solution 1】:

If you want files to be downloaded automatically to a folder of your choice, you can use Firefox preferences, as shown below:

from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile

my_folder = "/I/Want/to/save/file/here"

profile = FirefoxProfile()
# 2 = save downloads to the custom folder given in browser.download.dir
profile.set_preference("browser.download.folderList", 2)
profile.set_preference("browser.download.manager.showWhenStarting", False)
profile.set_preference("browser.download.dir", my_folder)
# Save PDFs directly instead of showing the "open or save" dialog
profile.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/pdf")
driver = webdriver.Firefox(firefox_profile=profile)
driver.get(URL)  # URL is the address of the page that contains the link
submit3 = driver.find_element_by_id("linkResumeTitle")
submit3.click()
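
The preferences above are specific to Firefox. For Chrome, a rough counterpart might look like the sketch below. It assumes a Selenium version that accepts the `options` keyword (3.8 or later) and a chromedriver on the PATH; `chrome_download_prefs` and `make_chrome_driver` are hypothetical helper names, not part of Selenium.

```python
import os


def chrome_download_prefs(download_dir):
    """Build Chrome preferences that save PDFs to download_dir without prompting."""
    return {
        # Folder Chrome saves downloads to; an absolute path is the safe choice.
        "download.default_directory": os.path.abspath(download_dir),
        # Do not show the "Save as" dialog.
        "download.prompt_for_download": False,
        # Download PDFs instead of opening them in the built-in viewer.
        "plugins.always_open_pdf_externally": True,
    }


def make_chrome_driver(download_dir):
    """Start Chrome with the preferences above (needs selenium and chromedriver)."""
    # Imported lazily so the helper above can be used without selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_experimental_option("prefs", chrome_download_prefs(download_dir))
    return webdriver.Chrome(options=options)
```

With this in place, `driver = make_chrome_driver(my_folder)` would stand in for the `webdriver.Firefox(...)` line, and the rest of the click code stays the same.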

Alternatively, you can get the required URL

link = driver.find_element_by_id("linkResumeTitle").get_attribute('href')

and then try

import os
import urllib.request  # a plain "import urllib" does not load the request submodule

urllib.request.urlretrieve(link, os.path.join(my_folder, "file.pdf"))

to download the file.
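
One caveat: the href in the question points to a private path, so the server may only serve the PDF to a logged-in session. If that is the case (an assumption, since the question does not say), a plain `urlretrieve` call would fetch a login page instead of the file. A minimal sketch that forwards the browser's cookies follows; `driver.get_cookies()` is Selenium's own method, while `cookie_header`, `build_request` and `download_with_cookies` are hypothetical helpers.

```python
import shutil
import urllib.request


def cookie_header(cookies):
    """Turn the list of dicts from driver.get_cookies() into a Cookie header value."""
    return "; ".join("{}={}".format(c["name"], c["value"]) for c in cookies)


def build_request(url, cookies):
    """Create a GET request for url that carries the browser session's cookies."""
    return urllib.request.Request(url, headers={"Cookie": cookie_header(cookies)})


def download_with_cookies(url, cookies, target_path):
    """Fetch url with the given cookies and stream the body to target_path."""
    with urllib.request.urlopen(build_request(url, cookies)) as response, \
            open(target_path, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
```

It could be called as `download_with_cookies(link, driver.get_cookies(), os.path.join(my_folder, "file.pdf"))` once the driver has logged in and `link` has been read from the element.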

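【Comments】:

Thanks for your answer. Is it the same with webdriver.Chrome() instead of Firefox?
No, it looks different for Chrome.
I don't understand. I use Chrome to get as far as "submit3.click". From that point on, can't I use the rest of your code? Well, I tried it and it doesn't work..
If I leave the filename empty, does that mean it will use the file's original name?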
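
On the last question: when `urllib.request.urlretrieve` is called without a filename, it writes to a temporary file and returns that path together with the response headers, so the original name is not used automatically. If the server sends a `Content-Disposition` header (an assumption for this site), the name can be read from those headers. A small sketch, with `suggested_filename` as a hypothetical helper:

```python
import os
from email.message import Message


def suggested_filename(headers, default="file.pdf"):
    """Return the filename from a Content-Disposition header, or default if absent."""
    # urlretrieve() returns (local_path, headers); headers is a subclass of
    # email.message.Message, so get_filename() is available on it.
    name = headers.get_filename()
    # Keep only the last path component so the header cannot pick the folder.
    return os.path.basename(name) if name else default


# Demo with a hand-built header, so no network access is needed.
demo = Message()
demo["Content-Disposition"] = 'attachment; filename="resume.pdf"'
print(suggested_filename(demo))
```

After `path, headers = urllib.request.urlretrieve(link)`, the temporary file at `path` could then be moved to `os.path.join(my_folder, suggested_filename(headers))`.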