【Question title】: How to download .torrent files with no ['content-disposition']?
【Posted】: 2021-04-08 07:59:57
【Question】:

I'm trying to quickly download the kali-linux live-amd64.iso.torrent file from kali.org/downloads using requests and bs4, but the headers of the torrent response do not contain response.headers['content-disposition']. My reference is: How to download a file with .torrent extension from link with Python. I inspected the headers of torrent_response, and this is the result:

{'Server': 'nginx/1.14.2', 'Date': 'Thu, 08 Apr 2021 07:46:17 GMT', 'Content-Type': 'application/octet-stream', 'Content-Length': '274612', 'Connection': 'keep-alive', 'Last-Modified': 'Wed, 24 Feb 2021 17:39:18 GMT', 'ETag': '"60368f46-430b4"', 'X-Cache-Status': 'HIT', 'Accept-Ranges': 'bytes'}

import re
import traceback

import requests


class Download_Kali:

    def __init__(self, locator):
        # locator: the parsed bs4 document for the downloads page
        self.locator = locator

    def locate_torrent(self):
        # Collect the href of every link whose text is exactly 'Torrent'
        torrent_links = [torrent['href'] for torrent in self.locator.find_all('a', string='Torrent')]
        for live_iso in torrent_links:
            if 'live-amd64' in live_iso:
                print('[*] Downloading iso Torrent')
                return live_iso

    def install(self):
        try:
            # Reuse this instance instead of constructing a second Download_Kali
            with requests.get(self.locate_torrent()) as torrent_response:
                torrent_response.raise_for_status()

                # Prints every header; 'content-disposition' is not among them
                disposition = torrent_response.headers
                print(disposition)
                #torrent_file = re.findall('filename="(.+)"', disposition)
                #if torrent_file:
                #    with open(torrent_file[0], 'wb') as f_torrent:
                #        f_torrent.write(torrent_response.content)
        except requests.HTTPError:
            print(traceback.format_exc())

【Comments】:

  • Why not just use the .iso image directly?

Tags: python python-3.x web-scraping


【Solution 1】:

If you'd rather use the .iso image, here's my take on it. I'm using the Net Installer because it's a smaller download. I also added a progress bar.

import requests
from bs4 import BeautifulSoup
from tqdm import tqdm

CHUNK_SIZE = 32 * 1024  # bytes read per iteration

with requests.Session() as connection:
    kali_page = connection.get("https://kali.org/downloads/").content
    # Keep every table link that is not a .torrent file
    iso_images = [
        t["href"] for t in
        BeautifulSoup(kali_page, "lxml").select("table a[href]")
        if not t["href"].endswith(".torrent")
    ]
    for iso_image in iso_images:
        if "netinst-arm64" in iso_image:
            print(f"Downloading {iso_image}")
            # Use the last path segment of the URL as the local file name
            file_name = iso_image.rsplit("/", 1)[-1]
            with connection.get(iso_image, stream=True) as response, \
                    open(file_name, "wb") as output:
                response.raise_for_status()
                total_size = int(response.headers.get('content-length', 0))
                with tqdm(
                        total=total_size,  # in bytes, matching unit='B'
                        unit='B',
                        unit_scale=True,
                        unit_divisor=1024,
                ) as progress_bar:
                    # Write each chunk as it arrives; iterating first and
                    # copying response.raw afterwards would leave the file empty
                    for data in response.iter_content(CHUNK_SIZE):
                        output.write(data)
                        progress_bar.update(len(data))

While it runs, this should display a tqdm progress bar for the download.
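For the original .torrent case, where the response has no Content-Disposition header, one option is to fall back to the last path segment of the URL. The following is a minimal sketch of that idea, not a complete header parser: `filename_from_response`, its `default` argument, and the example URL are hypothetical names introduced here for illustration, and the `filename*=` form of the header is deliberately ignored.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit


def filename_from_response(url, headers, default="download.torrent"):
    """Pick a local file name for a downloaded response (hypothetical helper).

    Prefers a plain filename= value in Content-Disposition and falls back
    to the last path segment of the URL when the header is absent.
    """
    # Header names are case-insensitive, so normalise the keys first.
    lowered = {key.lower(): value for key, value in headers.items()}
    disposition = lowered.get("content-disposition", "")

    # Simplified match: handles filename="x" and filename=x only.
    match = re.search(r'filename="?([^";]+)"?', disposition)
    if match:
        candidate = match.group(1).strip()
    else:
        # No usable header: take the (percent-decoded) URL path instead.
        candidate = unquote(urlsplit(url).path)

    # Keep only the final component so the name stays in the current directory.
    name = PurePosixPath(candidate).name
    return name or default


if __name__ == "__main__":
    # Headers shaped like the ones in the question (no Content-Disposition).
    example_headers = {
        "Content-Type": "application/octet-stream",
        "Content-Length": "274612",
    }
    # Hypothetical URL, used only to show the fallback.
    print(filename_from_response(
        "https://example.org/images/kali-linux-live-amd64.iso.torrent",
        example_headers,
    ))
```

With requests, the result of `filename_from_response(torrent_response.url, torrent_response.headers)` could then be passed to `open(..., 'wb')` and `torrent_response.content` written to it, in place of the commented-out `re.findall` lines in the question.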

【Comments】:

  • Thanks, it works great. This is all fairly new to me, but it will be good for me to learn it.