【Posted】: 2020-08-05 15:01:03
【Question】:
I need to parse rsync statistics such as:
Number of files: 265 (reg: 189, dir: 10, link: 66)
Number of created files: 18
Number of deleted files: 4
Number of regular files transferred: 24
Total file size: 121.67K bytes
Total transferred file size: 0 bytes
Literal data: 0 bytes
Matched data: 0 bytes
File list size: 0
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 9.15K
Total bytes received: 33
sent 9.15K bytes received 33 bytes 18.37K bytes/sec
total size is 121.67K speedup is 13.24
Parsing each individual line is fairly easy with a command like this:
$(echo "$rawstats" | grep -Po '(?<=Number of files: ).*')
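As a minimal sketch of that per-line approach, the lookbehind can be wrapped in a small helper. The function name `get_stat` is hypothetical, `rawstats` is assumed to hold the rsync statistics text, and `grep -P` assumes a GNU grep built with PCRE support.

```shell
#!/usr/bin/env bash
# Sample input: a subset of the statistics shown above.
rawstats='Number of files: 265 (reg: 189, dir: 10, link: 66)
Number of created files: 18
Number of deleted files: 4
Total file size: 121.67K bytes'

# get_stat LABEL: print whatever follows "LABEL: " on the matching line.
# Hypothetical helper; LABEL is assumed to contain no regex metacharacters.
get_stat() {
    grep -Po "(?<=$1: ).*" <<<"$rawstats"
}

created=$(get_stat 'Number of created files')
deleted=$(get_stat 'Number of deleted files')
echo "created=$created deleted=$deleted"
```

With the sample input this prints `created=18 deleted=4`.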
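Now I need to parse the first line. I found a Perl solution here: Perl Parse rsync Output
However, I don't want to depend on Perl, and Dan Lowe's answer does not work in every case, because the content inside the () can be any combination of reg:, dir:, link: (or even other labels I have overlooked).
For example: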
265 (reg: 189, dir: 10, link: 66)
265 (dir: 10, link: 66)
265 (link: 66)
So I am trying to build the right regular expression to pass to grep -P. This is what I have so far:
(\d+) \((?:([a-z]+): (\d+)(?:, )?)*\)?
It matches as follows:
[0] is a null string
[1]=265
[2]=link
[3]=66
The result I expected:
[1]=265
[2]=reg
[3]=189
[4]=dir
[5]=10
[6]=link
[7]=66
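One likely reason the pattern reports only link and 66: in PCRE a quantified capture group keeps just the text of its final iteration, so the earlier reg and dir pairs are overwritten. A possible way around this, sketched here under the assumption that labels are lowercase words and counts are plain digits, is to let `grep -o` emit every token on its own line and collect the tokens into an indexed array. The variable names `line` and `fields` are illustrative, and `mapfile` requires bash 4 or later.

```shell
#!/usr/bin/env bash
# Value part of the "Number of files" line (sample from the question).
line='265 (reg: 189, dir: 10, link: 66)'

# -o prints each match on its own line: lowercase words and digit runs.
mapfile -t fields < <(grep -oE '[a-z]+|[0-9]+' <<<"$line")

# Prepend an empty element so the total sits at index 1, as in the expected list.
fields=("" "${fields[@]}")

for i in "${!fields[@]}"; do
    if (( i > 0 )); then
        echo "[$i]=${fields[i]}"
    fi
done
```

For the sample line this prints `[1]=265` through `[7]=66` in the order listed above. Shorter variants such as `265 (link: 66)` simply yield fewer elements.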
I don't know how to improve on this result. The ideal outcome would be a bash associative array, for example:
[reg]=189
[dir]=10
[link]=66
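A minimal sketch of how such an associative array might be filled without Perl or `grep -P`, using bash's own `[[ =~ ]]` operator and `BASH_REMATCH`. The function name `parse_number_of_files` and the globals `total` and `counts` are hypothetical. The sketch assumes bash 4 or later, lowercase labels, and plain-digit counts with no thousands separators or unit suffixes.

```shell
#!/usr/bin/env bash
declare -A counts   # label -> count, e.g. counts[reg]=189
total=''            # the number before the parentheses

# Hypothetical helper: parse one "Number of files: ..." line into total/counts.
parse_number_of_files() {
    local line=$1 rest
    local line_re='^Number of files: ([0-9]+)( \((.*)\))?$'
    local pair_re='([a-z]+): ([0-9]+)'
    counts=()
    total=''
    [[ $line =~ $line_re ]] || return 1
    total=${BASH_REMATCH[1]}
    rest=${BASH_REMATCH[3]}      # text inside the parentheses, may be empty
    # Consume one "label: count" pair per iteration, whatever the labels are.
    while [[ $rest =~ $pair_re ]]; do
        counts[${BASH_REMATCH[1]}]=${BASH_REMATCH[2]}
        rest=${rest#*"${BASH_REMATCH[0]}"}   # drop everything up to this pair
    done
}

parse_number_of_files 'Number of files: 265 (reg: 189, dir: 10, link: 66)'
echo "total=$total reg=${counts[reg]} dir=${counts[dir]} link=${counts[link]}"
```

This prints `total=265 reg=189 dir=10 link=66`. Because the loop takes whatever pairs are present, a line such as `Number of files: 265 (link: 66)` leaves only `counts[link]` set.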
Thanks for your help.
【Comments】:
Tags: regex bash parsing grep rsync