【Posted】:2019-03-07 11:37:55
【Question】:
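The awk below creates a sub-directory inside a directory (the directory name is always the last line of a block in file1, and blocks are separated by an empty line) if the number on the second line of a file2 entry (always the first 6 digits, in the format xx-xxxx) is found in $2 of file1. That is the current awk output.
If there is a match and the sub-directory is created in the directory, the corresponding first line of that file2 entry (the https line) is always the link to a zip file to download. I cannot seem to get that link downloaded into the sub-folder and the .zip extracted there. The download command does execute and fetches the zip, but only when I enter it in the terminal by hand. I apologize for the long post; I wanted to include all the details needed to solve this.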
file1
xxx_006 19-0000_xxx-yyy-aaa
xxx_007 19-0001_zzz-bbb-ccc
FolderName_001_001

yyyy_0287 19-0v02-xxx
yyyy_0289 19-0v31-xxxx
yyyy_0293 19-0v05-xxxx
FolderName_002_002
file2
https://xx.yy.zz/path/to/file.zip
19-0v05-xxx_000_001
cc112233
https://xx.yy.zz/path/to/download/file.zip
19-0v31-xxx-001-000
bb4456784
https://xx.yy.zz/path/to/file.zip
19-0v02-xxx_000_001
aaa331232
awk (edited)
# bash command lines to make sub-folders and download files
# create the format text used in sprintf() to run the desired shell commands
cmd_fmt='mkdir -p "%s/%s"
cd "%s/%s" && curl -O -v -k -X GET %s -H "Content-Type:application/x-www-form-urlencoded" -H "Authorization:xxxx" && { filename="%s"; unzip "${filename##*/}"; }'
# run the awk command
awk -v cmd_fmt="$cmd_fmt" '
# create an associative array (key/value pairs) based on the file1
NR==FNR { for(i=2; i<NF; i+=2) a[substr($i,1,7)] = $NF; next }
# retrieve the first 7-char of each line in file2 as the key to test against the above hash
{ k = substr($0, 1, 7) }
# if find k, then print
k in a { print a[k] "\t" $0 "\t" l }
# save prev line to l which is supposed to be the URL
{ l = $0 }
' RS= file1 RS='\n' file2 | while IFS=$'\t' read -r base_dir sub_dir link;
do
  echo "download [$link] to '$base_dir/$sub_dir'"
done
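One possible way to make the loop body do the work itself, instead of only echoing it, is sketched below. It assumes bash with curl and unzip on the PATH; fetch_into is a hypothetical helper name that does not appear in the post, and the header values are the placeholders from the post. The cd runs in a subshell, so the working directory of the calling loop stays the same between iterations.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the original post): create base_dir/sub_dir,
# download the zip behind link into it, then extract it in place.
fetch_into() {
    local base_dir=$1 sub_dir=$2 link=$3
    local target="$base_dir/$sub_dir"

    mkdir -p "$target" || return 1
    (
        # subshell: the cd below does not leak into the calling loop
        cd "$target" || exit 1
        # same curl options as in the post; stdin is redirected so that curl
        # and unzip cannot consume lines meant for a surrounding while-read loop
        curl -O -v -k -X GET "$link" \
             -H "Content-Type:application/x-www-form-urlencoded" \
             -H "Authorization:xxxx" </dev/null || exit 1
        # -o overwrites without prompting; the archive name is the URL basename
        unzip -o "${link##*/}" </dev/null
    )
}
```

Inside the loop from the post, the echo line could then be followed by: fetch_into "$base_dir" "$sub_dir" "$link"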
desired awk output
FolderName_002_002 --- directory
19-0v02-xxx_000_001 --- sub-folder
https://xx.yy.zz/path/to/file.zip --- zip downloaded to sub-folder and extracted
19-0v05-xxx_000_001 --- sub-folder
https://xx.yy.zz/path/to/file.zip --- zip downloaded to sub-folder and extracted
19-0v31-xxx-001-000 --- sub-folder
https://xx.yy.zz/path/to/file.zip --- zip downloaded to sub-folder and extracted
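To compare against the layout above without downloading anything, the matching step could be run on its own as a dry run. The sketch below reuses the awk logic from the post inside a hypothetical function list_matches (the function name and the variable names dir, key and prev are assumptions) and prints one tab-separated line per match: directory, sub-folder, link. It assumes that file1 blocks are separated by empty lines and that each link in file2 sits directly above its ID line.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the original post).
# Usage: list_matches file1 file2
# Prints "directory<TAB>sub-folder<TAB>link" for every file2 ID whose first
# 7 characters also start an ID in column 2 of a file1 block.
list_matches() {
    awk '
        # file1 is read in paragraph mode (RS is empty): every even field
        # before the last one is an ID, and the last field names the directory
        NR == FNR {
            for (i = 2; i < NF; i += 2) dir[substr($i, 1, 7)] = $NF
            next
        }
        # file2 is read line by line: take the first 7 characters as the key
        { key = substr($0, 1, 7) }
        # on a match, the previous line is taken to be the download link
        key in dir { print dir[key] "\t" $0 "\t" prev }
        # remember the current line for the next record
        { prev = $0 }
    ' RS= "$1" RS='\n' "$2"
}
```

Called as list_matches file1 file2, it prints matches in file2 order, so the lines may need sorting before they are compared with the listing above.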
【Comments】:
Tags: awk