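【Posted】: 2020-07-22 13:15:39
【Question】:
I am trying to copy files from GCS to a different location, and I need to do it in real time with a Cloud Function. I created a function and it works. The problem is that the file gets copied multiple times, into multiple folders.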
EG:
source file path: gs://logbucket/mylog/2020/07/22/log.csv
Expected Target: gs://logbucket/hivelog/2020/07/22/log.csv
My code:
from google.cloud import storage


def hello_gcs_generic(data, context):
    # Bucket and object name come from the Cloud Storage event payload.
    sourcebucket = format(data['bucket'])
    source_file = format(data['name'])

    # Object names are expected to look like mylog/<year>/<month>/<day>/<filename>.
    year = source_file.split("/")[1]
    month = source_file.split("/")[2]
    day = source_file.split("/")[3]
    filename = source_file.split("/")[4]
    print(year)
    print(month)
    print(day)
    print(filename)
    print(sourcebucket)
    print(source_file)

    storage_client = storage.Client()
    source_bucket = storage_client.bucket(sourcebucket)
    source_blob = source_bucket.blob(source_file)
    # The destination is the same bucket, under a different prefix.
    destination_bucket = storage_client.bucket(sourcebucket)
    destination_blob_name = 'hivelog/year='+year+'/month='+month+'/day='+day+'/'+filename

    blob_copy = source_bucket.copy_blob(
        source_blob, destination_bucket, destination_blob_name
    )
    # Delete the source object after the copy (was blob.delete(), but no
    # variable named blob is defined in this function).
    source_blob.delete()

    print(
        "Blob {} in bucket {} copied to blob {} in bucket {}.".format(
            source_blob.name,
            source_bucket.name,
            blob_copy.name,
            destination_bucket.name,
        )
    )
Output:
You can see year=year=2020 here. Where does that come from? Inside it I also have folders like year=year=2020/month=month=07/.
I have not been able to solve this.
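One way to reason about the year=year=2020 folders, as a minimal sketch: assuming the function is triggered by object-creation events on the whole bucket, the copy it writes under hivelog/ lands in the same bucket and would trigger the function again, and the second run would then treat "year=2020" as the year. The helper names and the source_prefix parameter below are hypothetical, introduced only for illustration; no Cloud Storage calls are made.

```python
def unguarded_destination(object_name):
    """Mirror the naming logic from the question, with no check on the prefix."""
    parts = object_name.split("/")
    year, month, day, filename = parts[1], parts[2], parts[3], parts[4]
    return "hivelog/year=" + year + "/month=" + month + "/day=" + day + "/" + filename


def guarded_destination(object_name, source_prefix="mylog/"):
    """Return the target name, or None for objects outside source_prefix.

    source_prefix is a hypothetical parameter introduced for this sketch.
    """
    if not object_name.startswith(source_prefix):
        return None  # e.g. the function's own output under hivelog/
    parts = object_name.split("/")
    if len(parts) != 5:
        return None  # unexpected layout; skip rather than guess
    _, year, month, day, filename = parts
    return f"hivelog/year={year}/month={month}/day={day}/{filename}"


first = unguarded_destination("mylog/2020/07/22/log.csv")
second = unguarded_destination(first)  # what a re-triggered run would compute

print(first)                       # hivelog/year=2020/month=07/day=22/log.csv
print(second)                      # hivelog/year=year=2020/month=month=07/day=day=22/log.csv
print(guarded_destination(first))  # None
```

If this is what is happening, a check like guarded_destination could go at the top of hello_gcs_generic, returning early when the result is None so that objects under hivelog/ are never copied again.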
【Comments】:
- Did these answers help you?
- Yes, @dustin-ingram's answer helped me.
Tags: python google-cloud-platform google-cloud-functions