【Question title】: Download a large file from Drive to Colab
【Posted】: 2020-11-17 07:45:49
【Question】:

I have a link to a publicly shared file hosted on Google Drive:

https://drive.google.com/uc?id=19VsarMcYRNPLTDr6b6ABJyY8JUeBueL8&export=download

Below is a .sh script that works for a different set of files and links:

#!/usr/bin/env bash
function gdrive_download () { # credit to https://github.com/ethanjperez/convince
  CONFIRM=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate "https://docs.google.com/uc?export=download&id=$1" -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')
  wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$CONFIRM&id=$1" -O $2
  rm -rf /tmp/cookies.txt
}

mkdir -p Models/real-fixed-cam Models/real-hand-held
gdrive_download 1yiNsSkPYoBZ55fSQ1iwb1io9QL_PcR2i Models/real-fixed-cam/netG_epoch_12.pth
gdrive_download 13HckO9fPAKYocdB_CAC5n8uyM3xQ2MpG Models/real-hand-held/netG_epoch_12.pth
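
To see what the `sed` expression inside `gdrive_download` pulls out of the first response, it can be mirrored in Python. This is only a sketch: `extract_confirm_tokens` is a hypothetical helper name, and `sample_page` is a made-up fragment, not the real markup Google Drive returns.

```python
import re
from typing import List

# Same character class as the sed expression in gdrive_download.
# The leading ".*" is greedy, so on a line with several "confirm="
# occurrences the last one wins, as it does with sed.
CONFIRM_RE = re.compile(r".*confirm=([0-9A-Za-z_]+)")

def extract_confirm_tokens(html: str) -> List[str]:
    """Return one token for every line that contains 'confirm=<token>'."""
    tokens = []
    for line in html.splitlines():
        match = CONFIRM_RE.match(line)
        if match:
            tokens.append(match.group(1))
    return tokens

# Made-up sample of a confirmation page, for illustration only.
sample_page = (
    '<a href="/uc?export=download&amp;confirm=umyj&amp;id=FILE_ID">'
    "Download anyway</a>"
)

print(extract_confirm_tokens(sample_page))                   # ['umyj']
print(extract_confirm_tokens("<html>no token here</html>"))  # []
```

If the page contains no `confirm=` token, the list is empty; in the shell script that corresponds to an empty `$CONFIRM`, so the second `wget` would request the same page again.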

The .sh script above is invoked in Colab like this:

!wget https://gist.githubusercontent.com/andreyryabtsev/458f7450c630952d1e75e195f94845a0/raw/0b4336ac2a2140ac2313f9966316467e8cd3002a/download.sh
!chmod +x download.sh
!./download.sh

I have adapted it to my needs as follows:

#!/usr/bin/env bash
function gdrive_download () { # credit to https://github.com/ethanjperez/convince
  CONFIRM=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate "https://docs.google.com/uc?export=download&id=$1" -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')
  wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$CONFIRM&id=$1" -O $2
  rm -rf /tmp/cookies.txt
}

mkdir -p pix2pix/checkpoint
gdrive_download 19VsarMcYRNPLTDr6b6ABJyY8JUeBueL8 pix2pix/checkpoint/weights.zip

The adapted script is invoked in Colab like this:

!wget https://gist.githubusercontent.com/Daryl149/070397c9cb3539f5cd01173f6c44200d/raw/207a76e94e70e6c9334f48c25b4998f4fd1b95e3/download.sh
!chmod +x download.sh
!./download.sh

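The folder is created correctly. However, instead of downloading the 500 MB+ zip file into the checkpoint folder, the script saves the HTML of the download confirmation page. According to the log, the script does seem to obtain a fresh download confirmation string on every run, which would normally force Google Drive to serve the file without the virus scan: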

--2020-07-27 21:55:21--  https://drive.google.com/uc?export=download&confirm=umyj&id=19VsarMcYRNPLTDr6b6ABJyY8JUeBueL8
Resolving drive.google.com (drive.google.com)... 74.125.142.138, 74.125.142.101, 74.125.142.100, ...
Connecting to drive.google.com (drive.google.com)|74.125.142.138|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘pix2pix/checkpoint/weights.zip’
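
A quick way to confirm from inside the notebook that the saved file is HTML rather than an archive is to inspect it with the standard library. The sketch below is an assumption-laden illustration: `describe_download` is a hypothetical helper, and the demo files are created in a temporary directory instead of touching `pix2pix/checkpoint/weights.zip`.

```python
import os
import tempfile
import zipfile

def describe_download(path: str) -> str:
    """Classify a downloaded file as 'zip', 'html', or 'unknown'."""
    if zipfile.is_zipfile(path):
        return "zip"
    with open(path, "rb") as fh:
        head = fh.read(512).lstrip().lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    return "unknown"

# Demo with stand-in files; on Colab you would pass the real path instead.
with tempfile.TemporaryDirectory() as tmp:
    fake = os.path.join(tmp, "weights.zip")  # HTML saved under a .zip name
    with open(fake, "wb") as fh:
        fh.write(b"<!DOCTYPE html><html><body>confirm page</body></html>")

    real = os.path.join(tmp, "real.zip")     # a genuine (tiny) archive
    with zipfile.ZipFile(real, "w") as zf:
        zf.writestr("placeholder.txt", "placeholder")

    print(describe_download(fake))  # html
    print(describe_download(real))  # zip
```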

【Comments】:

    Tags: shell google-drive-api google-colaboratory


    【Answer 1】:

    Try this:

    !gdown --id 19VsarMcYRNPLTDr6b6ABJyY8JUeBueL8
    

    You can then create a new directory with !mkdir and move weights.zip there.

    【Discussion】:

      【Answer 2】:

      Based on @korakot's answer, the complete working code to achieve the result in Colab is:

      !gdown https://drive.google.com/uc?id=19VsarMcYRNPLTDr6b6ABJyY8JUeBueL8
      !mkdir /content/Person_remover/pix2pix/checkpoint
      import shutil
      shutil.move("/content/Person_remover/weights.zip", "/content/Person_remover/pix2pix/checkpoint")
      

      【Discussion】:

      • You may need !mkdir -p
      • It works as is because /content/Person_remover/pix2pix already exists in my default project. But including -p would probably be cleaner anyway?
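
      The same "create the directory if needed, then move" step can also be written in pure Python, which sidesteps the `-p` question. This is a minimal sketch, not the answerer's code: `move_into_dir` is a hypothetical helper, and the demo runs in a temporary directory rather than under `/content/Person_remover`.

```python
import os
import shutil
import tempfile

def move_into_dir(src: str, dest_dir: str) -> str:
    """Create dest_dir (and any missing parents), then move src into it."""
    os.makedirs(dest_dir, exist_ok=True)  # same effect as `mkdir -p`
    return shutil.move(src, dest_dir)     # returns the file's new path

# Demo with stand-in paths. In the answer above they would be
# "/content/Person_remover/weights.zip" and
# "/content/Person_remover/pix2pix/checkpoint".
with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "weights.zip")
    with open(src, "wb") as fh:
        fh.write(b"placeholder")

    dest_dir = os.path.join(root, "pix2pix", "checkpoint")
    new_path = move_into_dir(src, dest_dir)
    print(os.path.relpath(new_path, root))
```

      Note that `shutil.move` raises an error if a file with the same name already exists in the destination directory, so a rerun of the cell would need to remove the old copy first.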