要添加到@Maxim 的答案,您可以考虑在调用gsutil 时使用-m 参数以允许并行复制。
gsutil -m cp -r gs://bucket-name/folder1/folder_to_copy gs://bucket-name/folder1/new_folder
-m arg 启用并行性。
正如gsutil 文档中所建议的那样,-m 参数可能不会在网络速度较慢的情况下产生更好的性能(即,在家里)。但是对于跨桶复制(数据中心中的机器)的情况,性能可能会“显着提高”,以引用 gsutil 手册。见下文
-m Causes supported operations (acl ch, acl set, cp, mv, rm, rsync,
and setmeta) to run in parallel. This can significantly improve
performance if you are performing operations on a large number of
files over a reasonably fast network connection.
gsutil performs the specified operation using a combination of
multi-threading and multi-processing, using a number of threads
and processors determined by the parallel_thread_count and
parallel_process_count values set in the boto configuration
file. You might want to experiment with these values, as the
best values can vary based on a number of factors, including
network speed, number of CPUs, and available memory.
Using the -m option may make your performance worse if you
are using a slower network, such as the typical network speeds
offered by non-business home network plans. It can also make
your performance worse for cases that perform all operations
locally (e.g., gsutil rsync, where both source and destination
URLs are on the local disk), because it can "thrash" your local
disk.
If a download or upload operation using parallel transfer fails
before the entire transfer is complete (e.g. failing after 300 of
1000 files have been transferred), you will need to restart the
entire transfer.
Also, although most commands will normally fail upon encountering
an error when the -m flag is disabled, all commands will
continue to try all operations when -m is enabled with multiple
threads or processes, and the number of failed operations (if any)
will be reported as an exception at the end of the command's
execution.
注意:在撰写本文时,python3.8 似乎导致-m 标志出现问题。使用python3.7。更多信息Github Issue