List of AWS files to retrieve from Deep Archive/Glacier: answers

【Question title】: List of AWS files to retrieve from Deep Archive/Glacier
【Posted】: 2021-07-10 16:51:49
【Question description】:

So I have a file containing a list of AWS files that I want to retrieve from AWS Deep Archive/Glacier. Something like this:

bucket-name/path/file1.ext
bucket-name/path/file2.ext
bucket-name/path/file3.ext
bucket-name/path/file4.ext
bucket-name/path/file5.ext
bucket-name/path/path/file1.ext
bucket-name/path/file6.ext
bucket-name/path/file7.ext

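I want to feed this file to a script that parses out the bucket and then the file's location on AWS, so that both can be passed to a command like the following, where $y is the bucket and $x is the file's location within that bucket: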

aws s3api restore-object --restore-request Days=7,GlacierJobParameters={Tier=Bulk} --bucket "$y" --key "$x"

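I have a working script that copies these files from AWS using the file list, so I would like to use the same list for the retrieval request, which has to complete before the files become available.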

I have figured out that I can get the bucket with this: awk -F'/' '{print $1}'

and the path to the file with this: cut -f2- -d '/' $ndir

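I am not a strong coder and would appreciate some help. I think there may be other commands that could be used in a loop, or perhaps just a single awk line, but I just can't get it right.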
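
As a quick check of the two commands mentioned in the question, here is a minimal sketch that runs them on one sample line from the list. The variable names line, bucket and key are only illustrative, and the cut call uses the standard -d and -f options.

```shell
#!/bin/bash
# Sample line taken from the list in the question.
line='bucket-name/path/path/file1.ext'

# Field 1 (everything before the first "/") is the bucket.
bucket=$(printf '%s\n' "$line" | awk -F'/' '{print $1}')

# Fields 2 onward, still joined by "/", form the key inside the bucket.
key=$(printf '%s\n' "$line" | cut -d'/' -f2-)

printf 'bucket=%s key=%s\n' "$bucket" "$key"
```

For this line the sketch should print bucket=bucket-name key=path/path/file1.ext, so nested paths stay intact in the key.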

【Comments】:

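You can pass the output of the awk command into a bash array. Say, myarray=$(awk -F'/' '{print $1}' myfile.txt) will put each bucket value into myarray. You can assign y with y=${myarray[1]}, changing the number to pick the nth element you want. Or loop over it in your script. For a better understanding, please add some examples of the desired output.
For each line in the file, it would run a command like the following, shown here for the first line of our sample file: aws s3api restore-object --restore-request Days=7,GlacierJobParameters={Tier=Bulk} --bucket "bucket-name" --key "/path/file1.ext"
Sorry for misleading you. In my first comment I wrote myarray=$(awk -F'/' '{print $1}' myfile.txt), which stores only a single element. To write the output into an array, note the outer parentheses: myarray=($(awk -F'/' '{print $1}' myfile.txt))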
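
The difference between the two assignments in these comments can be sketched as below. This is only an illustration: the sample list is hypothetical and written to a temp file, and mapfile is swapped in as an alternative to the ($(...)) form. It assumes bash 4 or newer and avoids word splitting on the awk output.

```shell
#!/bin/bash
# Hypothetical sample list, written to a temp file so the sketch is self-contained.
list=$(mktemp)
printf '%s\n' 'bucket-a/path/file1.ext' 'bucket-b/path/file2.ext' > "$list"

# Without the outer parentheses this is one string holding both lines, not an array.
as_string=$(awk -F'/' '{print $1}' "$list")

# mapfile (bash 4+) stores one line of output per array element.
mapfile -t myarray < <(awk -F'/' '{print $1}' "$list")

echo "elements: ${#myarray[@]}"
echo "first: ${myarray[0]}"    # bash arrays are zero-indexed
rm -f "$list"
```

Note that array elements are read with ${myarray[n]} (curly braces), and that the first element has index 0.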

Tags: bash amazon-s3 awk

【Solution 1】:

For better readability I changed the names in the sample directories:

$ cat awsfile.txt 
bucket-name_1/path_1/file1.ext
bucket-name_2/path_2/file2.ext
bucket-name_3/path_3/file3.ext
bucket-name_4/path_4/file4.ext
bucket-name_5/path_5/file5.ext
bucket-name_6/path_6/path_9/file1.ext
bucket-name_7/path_7/file6.ext
bucket-name_8/path_8/file7.ext

Below is the script file. Since I can't help you integrate it into your own script, I tried to keep the code very clean so that you can follow it:

$ cat awsscript.sh
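#!/bin/bash

declare -a buckarr
declare -a extarr

# sub() turns the first "/" into a space, so awk re-splits the line:
# $1 becomes the bucket and $2 the rest of the path.
# This relies on word splitting, so names must not contain whitespace.
buckarr=($(awk 'sub(/\//," ") {print $1"\t"}' awsfile.txt))
extarr=($(awk 'sub(/\//," ") {print $2"\t"}' awsfile.txt))

buckarrlength=${#buckarr[@]}

# Walk both arrays in step; element i of each comes from the same input line.
for ((i=0; i<buckarrlength; i++))
do
        y=${buckarr[i]}
        x=${extarr[i]}
        echo "Bucket info:" "$y" "++++ Path to file:" "$x" "///replace this command line with any command you wish..."
done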

Output:

  $ bash awsscript.sh
    Bucket info: bucket-name_1 ++++ Path to file: path_1/file1.ext ///replace this command line with any command you wish...
    Bucket info: bucket-name_2 ++++ Path to file: path_2/file2.ext ///replace this command line with any command you wish...
    Bucket info: bucket-name_3 ++++ Path to file: path_3/file3.ext ///replace this command line with any command you wish...
    Bucket info: bucket-name_4 ++++ Path to file: path_4/file4.ext ///replace this command line with any command you wish...
    Bucket info: bucket-name_5 ++++ Path to file: path_5/file5.ext ///replace this command line with any command you wish...
    Bucket info: bucket-name_6 ++++ Path to file: path_6/path_9/file1.ext ///replace this command line with any command you wish...
    Bucket info: bucket-name_7 ++++ Path to file: path_7/file6.ext ///replace this command line with any command you wish...
    Bucket info: bucket-name_8 ++++ Path to file: path_8/file7.ext ///replace this command line with any command you wish...

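I have no AWS experience, but IMO you only need to replace the echo line with the command you want to run, using the variables $x and $y the same way the echo line does.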
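
A minimal sketch of that substitution, using the restore-object command from the question. It swaps the two arrays for a while/read loop with bash parameter expansion, which is a different technique from the script above. The sample list, the restore_one helper and the DRY_RUN switch are assumptions made for illustration; by default the command is only printed, not executed.

```shell
#!/bin/bash
# Hypothetical sample list; in practice point "list" at your own file.
list=$(mktemp)
printf '%s\n' 'bucket-name_1/path_1/file1.ext' \
              'bucket-name_6/path_6/path_9/file1.ext' > "$list"

# DRY_RUN is an assumption of this sketch: 1 (default) prints the command only.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: build the command from the question for one bucket/key pair.
restore_one() {
    local y=$1 x=$2
    local cmd=(aws s3api restore-object
               --restore-request 'Days=7,GlacierJobParameters={Tier=Bulk}'
               --bucket "$y" --key "$x")
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}

while IFS= read -r line || [ -n "$line" ]; do
    [ -z "$line" ] && continue   # skip blank lines
    y=${line%%/*}                # bucket: text before the first "/"
    x=${line#*/}                 # key: text after the first "/"
    restore_one "$y" "$x"
done < "$list"
rm -f "$list"
```

Run it once as is to review the printed commands, then set DRY_RUN=0 to actually send the requests. Reading line by line also keeps keys that contain spaces in one piece, which the array version does not.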

Does this answer help you?

Extra

List the array elements:

$ echo "${extarr[@]}"

Show the number of elements in the array:

$ echo "${#buckarr[@]}"
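
The two inspection commands above, plus two related forms, can be tried on a small hand-made array. This is just a sketch; the array contents are sample values, not output from the real file.

```shell
#!/bin/bash
# Sample values standing in for the array built by the script.
extarr=('path_1/file1.ext' 'path_6/path_9/file1.ext')

echo "${extarr[@]}"            # all elements on one line
echo "${#extarr[@]}"           # number of elements in the array
echo "${#extarr[0]}"           # character length of the first element
printf '%s\n' "${extarr[@]}"   # one element per line
```

Note the difference between ${#extarr[@]} (how many elements) and ${#extarr[0]} (how long one element is).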

【Comments】: