【Question title】: Download multiple files from Google Cloud Storage using Python
【Posted】: 2018-07-09 16:46:09
【Question description】:

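I am trying to download multiple files from a Google Cloud Storage folder. I can download a single file, but not multiple files. I took this reference from this link, but it does not seem to work. The code is below: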

# [download multiple files]
bucket_name = 'bigquery-hive-load'
# The "folder" where the files you want to download are
folder="/projects/bigquery/download/shakespeare/"

# Create this folder locally
if not os.path.exists(folder):
    os.makedirs(folder)

# Retrieve all blobs with a prefix matching the folder
    bucket=storage_client.get_bucket(bucket_name)
    print(bucket)
    blobs=list(bucket.list_blobs(prefix=folder))
    print(blobs)
    for blob in blobs:
        if(not blob.name.endswith("/")):
            blob.download_to_filename(blob.name)

# [End download to multiple files]

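Is there a way to download multiple files that match a pattern (name) or something similar? Since I am exporting the files from BigQuery, the file names will look like this: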

shakespeare-000000000000.csv.gz
shakespeare-000000000001.csv.gz
shakespeare-000000000002.csv.gz
shakespeare-000000000003.csv.gz

For reference, here is working code that downloads a single file:

# [download to single files]

edgenode_destination_uri = '/projects/bigquery/download/shakespeare-000000000000.csv.gz'
bucket_name = 'bigquery-hive-load'
gcs_bucket = storage_client.get_bucket(bucket_name)
blob = gcs_bucket.blob("shakespeare.csv.gz")
blob.download_to_filename(edgenode_destination_uri)
logging.info('Downloaded {} to {}'.format(
    gcs_bucket, edgenode_destination_uri))

# [end download to single files]

【Question discussion】:

    Tags: python python-3.x google-cloud-platform google-cloud-storage


    【Solution 1】:

    After some trial and error I solved the problem, and I could not stop myself from posting the result here.

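    import os
    from google.cloud import storage

    storage_client = storage.Client()
    bucket_name = 'mybucket'
    folder = '/projects/bigquery/download/shakespeare/'  # local destination directory
    delimiter = '/'
    prefix = 'shakespeare'  # object-name prefix shared by the exported files

    # Make sure the local destination directory exists.
    os.makedirs(folder, exist_ok=True)

    # Retrieve all blobs whose names start with the prefix.
    bucket = storage_client.get_bucket(bucket_name)
    # The delimiter excludes objects nested in "folders" inside the bucket.
    blobs = bucket.list_blobs(prefix=prefix, delimiter=delimiter)
    for blob in blobs:
        print(blob.name)
        destination_uri = os.path.join(folder, blob.name)
        blob.download_to_filename(destination_uri)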
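
The prefix approach above can be extended to shell-style patterns such as `shakespeare-*.csv.gz`. What follows is a minimal sketch, not the answerer's code: `download_matching` is a hypothetical helper, and it assumes `bucket` behaves like a `google.cloud.storage` bucket, that is, it offers `list_blobs(prefix=...)` and yields blobs with a `name` attribute and a `download_to_filename` method. The pattern is applied client-side with the standard-library `fnmatch` module.

```python
import fnmatch
import os


def download_matching(bucket, pattern, dest_dir, prefix=None):
    """Download every blob whose name matches a shell-style pattern.

    `bucket` is assumed to behave like google.cloud.storage.Bucket:
    list_blobs(prefix=...) must yield objects with a `name` attribute
    and a download_to_filename(path) method.
    Returns the list of local paths that were written.
    """
    os.makedirs(dest_dir, exist_ok=True)
    downloaded = []
    for blob in bucket.list_blobs(prefix=prefix):
        # Skip "folder" placeholder objects and names that do not match.
        if blob.name.endswith("/"):
            continue
        if not fnmatch.fnmatchcase(blob.name, pattern):
            continue
        # Keep only the last path component so files land directly in dest_dir.
        local_path = os.path.join(dest_dir, os.path.basename(blob.name))
        blob.download_to_filename(local_path)
        downloaded.append(local_path)
    return downloaded


# Hypothetical usage with the real client (requires google-cloud-storage):
#   from google.cloud import storage
#   bucket = storage.Client().get_bucket("bigquery-hive-load")
#   download_matching(bucket, "shakespeare-*.csv.gz",
#                     "/projects/bigquery/download/shakespeare/",
#                     prefix="shakespeare-")
```

Passing a `prefix` as well as a pattern is optional, but it lets the listing request narrow the results before the pattern is applied locally.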
    

    【Discussion】:

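      【Solution 2】:

      It looks like the indentation level in your Python code may be wrong. The block starting with # Retrieve all blobs with a prefix matching the folder is inside the scope of the if above it, so it is never executed if the folder already exists.

      Try this: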

      # [download multiple files]
      bucket_name = 'bigquery-hive-load'
      # The "folder" where the files you want to download are
      folder="/projects/bigquery/download/shakespeare/"
      
      # Create this folder locally
      if not os.path.exists(folder):
          os.makedirs(folder)
      
      # Retrieve all blobs with a prefix matching the folder
      bucket=storage_client.get_bucket(bucket_name)
      print(bucket)
      blobs=list(bucket.list_blobs(prefix=folder))
      print(blobs)
      for blob in blobs:
          if(not blob.name.endswith("/")):
              blob.download_to_filename(blob.name)
      
      # [End download to multiple files]
      

      【Discussion】:

      • It has the same problem. When I print blobs, it returns an empty list [].
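
One possible explanation for the empty list, not confirmed anywhere in this thread: the `prefix` passed to `list_blobs` is compared literally against object names in the bucket, while `folder` here is the local download directory. The sketch below assumes the exported objects sit at the bucket root under the names listed in the question. `names_with_prefix` is a hypothetical helper that only imitates the filter and does not call Cloud Storage.

```python
def names_with_prefix(object_names, prefix):
    """Imitate a prefix filter: keep names that literally start with `prefix`."""
    return [name for name in object_names if name.startswith(prefix)]


# Object names as they might appear in the bucket (taken from the question).
object_names = [
    "shakespeare-000000000000.csv.gz",
    "shakespeare-000000000001.csv.gz",
]

# The local download directory, used as a prefix, matches nothing.
local_folder = "/projects/bigquery/download/shakespeare/"
print(names_with_prefix(object_names, local_folder))  # prints []

# A prefix taken from the object names themselves matches both files.
print(names_with_prefix(object_names, "shakespeare-"))
```

If this is the cause, listing with a prefix such as `shakespeare-` and joining the local folder onto each blob name only when building the download path should return the expected blobs.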