【Title】: Downloading all the pdf files from a website with Python 3, part 2
【Posted】: 2026-02-18 23:20:03
【Question】:

I rewrote the program so that it works after the URL is redirected, but I cannot save the file, i.e. see it in my download folder. This is the website: https://fraser.stlouisfed.org/title/1339#518552

    from bs4 import BeautifulSoup
    import urllib3
    urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
    import re
    from urllib import request
    import requests
    import time

    # access the website
    http = urllib3.PoolManager()
    url = 'https://fraser.stlouisfed.org/title/1339#518552/title/1339/item/558539'
    response = http.request('GET', url)
    soup = BeautifulSoup(response.data)

    download_links = []
    # I found the path fragment the file links share and collect matching links
    for link in soup.find_all('a', attrs={'href': re.compile("/title/1339/item/5")}):
        download_links.append('https://fraser.stlouisfed.org/' + link.get('href'))

    # this part deals with the redirected page;
    # I am trying to make it work for only one link first
    response_two = http.request('GET', download_links[1])
    soup = BeautifulSoup(response_two.data)

    for link in soup.find_all('a', attrs={'href': re.compile("/files/docs/publications/cfc/")}):
        urlfin = "https://fraser.stlouisfed.org/" + link['href']
        request.urlretrieve(urlfin)

The program runs, but nothing is downloaded. Can anyone help me find the problem?

【Discussion】:

Tags: python-3.x pdf web-scraping download


【Solution 1】:

`urlretrieve` needs a destination filename; without one it saves to a temporary file that you never see. Just pass a filename as the second argument:

    import urllib.request
    urllib.request.urlretrieve(urlfin, "test.pdf")
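
A fixed filename like `"test.pdf"` overwrites itself on every iteration of the loop, so to save all the PDFs you would want a distinct name per URL. A minimal sketch (the helper name `filename_from_url` is my own, and it assumes the PDF URLs end in a usable filename, as the `/files/docs/publications/cfc/` links on this site appear to):

```python
from urllib.parse import urlparse
from pathlib import PurePosixPath

def filename_from_url(url):
    """Take the last path segment of the URL as the local filename."""
    name = PurePosixPath(urlparse(url).path).name
    return name or "download.pdf"  # fallback if the URL ends in "/"

# In the question's final loop, this would become (urlfin holding one
# resolved PDF URL):
#     urllib.request.urlretrieve(urlfin, filename_from_url(urlfin))
```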
    

【Discussion】:
