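Doing this robustly in shell is harder than you might expect. (The existing answer works in the common case, but an archive containing surprising filenames will confuse it.) A better option is to use a language with native zip-file support, such as Python. (This also has the advantage of not needing to open the input file more than once!)
If the individual files are small enough that a few copies of each fit in memory, the following works well: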
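read_files() {
  python3 -c '
import sys, zipfile, zlib

zf = zipfile.ZipFile(sys.argv[1], "r")
out = sys.stdout.buffer  # binary stream: content and delimiters are bytes
for content_file in zf.infolist():
    # wbits = MAX_WBITS|32 auto-detects a gzip or zlib header
    content = zlib.decompress(zf.read(content_file), zlib.MAX_WBITS | 32)
    # [:-1] drops the empty string after the final newline
    for line in content.split(b"\n")[:-1]:
        out.write(content_file.filename.encode("utf-8") + b"\0" + line + b"\0")
' "$@"
}
# Each record is a NUL-terminated filename followed by a NUL-terminated line
while IFS= read -r -d '' filename && IFS= read -r -d '' line; do
  printf 'From file %q, read line: %s\n' "$filename" "$line"
done < <(read_files yourfile.zip)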
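If the members are too large to hold in memory, a streaming variant is possible. The following is a minimal sketch rather than part of the approach above: `iter_gz_lines` is a hypothetical helper name, and it assumes Python 3 and that every member of the archive is gzip-compressed. It emits the same NUL-delimited filename/line pairs, so the same kind of `while read` loop can consume its output.

```python
import gzip
import sys
import zipfile


def iter_gz_lines(zip_path):
    """Yield (member_name, line_bytes) for every line of every gzip member."""
    with zipfile.ZipFile(zip_path, "r") as zf:
        for info in zf.infolist():
            # zf.open() gives a file-like object; GzipFile decompresses it
            # incrementally, so whole members are never loaded at once.
            with zf.open(info) as raw, gzip.GzipFile(fileobj=raw) as gz:
                for line in gz:
                    yield info.filename, line.rstrip(b"\n")


def main(argv):
    out = sys.stdout.buffer
    for name, line in iter_gz_lines(argv[1]):
        out.write(name.encode("utf-8") + b"\0" + line + b"\0")


# Only run when a zip path is actually supplied on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv)
```

Saved under a name of your choosing (say, `stream_zip.py`, a placeholder), it could feed the loop via `done < <(python3 stream_zip.py yourfile.zip)`. Unlike the in-memory version, this sketch also yields a final line that lacks a trailing newline.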
If you really want to keep listing the files and reading them as separate operations, a robust way to do that might look like this:
### Function: Extract a zip's content list in NUL-delimited form
list_files() {
  python3 -c '
import sys, zipfile

zf = zipfile.ZipFile(sys.argv[1], "r")
for content_file in zf.infolist():
    # Shell variables cannot hold NUL, so it is a safe delimiter
    sys.stdout.write("%s\0" % (content_file.filename,))
' "$@"
}
### Function: Extract a single file's contents from a zip file
read_file() {
  python3 -c '
import sys, zipfile

zf = zipfile.ZipFile(sys.argv[1], "r")
# zf.read() returns bytes, so write to the underlying binary stream
sys.stdout.buffer.write(zf.read(sys.argv[2]))
' "$@"
}
### Main loop
process_zip_contents() {
  local zipfile=$1
  while IFS= read -r -d '' filename; do
    printf 'Started file: %q\n' "$filename"
    while IFS= read -r line; do
      printf ' Read line: %s\n' "$line"
    done < <(read_file "$zipfile" "$filename" | gunzip -c)
  done < <(list_files "$zipfile")
}
To smoke-test the above, suppose the input file is created as follows:
printf '%s\n' '1: line one' '1: line two' '1: line three' | gzip > one.gz
printf '%s\n' '2: line one' '2: line two' '2: line three' | gzip > two.gz
cp one.gz 'name
with
newline.gz'
zip test.zip one.gz two.gz $'name\nwith\nnewline.gz'
process_zip_contents test.zip
...then we get the following output:
Started file: $'name\nwith\nnewline.gz'
 Read line: 1: line one
 Read line: 1: line two
 Read line: 1: line three
Started file: one.gz
 Read line: 1: line one
 Read line: 1: line two
 Read line: 1: line three
Started file: two.gz
 Read line: 2: line one
 Read line: 2: line two
 Read line: 2: line three
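If the `zip` command-line tool is not available, an equivalent fixture can be built from Python. This is a sketch under assumptions: `build_test_zip` and `read_back` are hypothetical helper names, Python 3 is assumed, and members are written in the order given here, which may differ from the order `zip` itself chooses.

```python
import gzip
import zipfile


def build_test_zip(zip_path):
    """Create a zip with three gzip members, one with newlines in its name."""
    one = gzip.compress(b"1: line one\n1: line two\n1: line three\n")
    two = gzip.compress(b"2: line one\n2: line two\n2: line three\n")
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr("one.gz", one)
        zf.writestr("two.gz", two)
        # Same payload as one.gz, stored under a name containing newlines
        zf.writestr("name\nwith\nnewline.gz", one)


def read_back(zip_path):
    """Return {member name: list of text lines} for each gzip member."""
    with zipfile.ZipFile(zip_path, "r") as zf:
        return {
            info.filename: gzip.decompress(zf.read(info)).decode().splitlines()
            for info in zf.infolist()
        }
```

Calling `build_test_zip("test.zip")` and then `process_zip_contents test.zip` should exercise the same filename and content cases as the shell-built archive.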