【Posted】: 2018-12-19 18:07:37
【Question】:
I need your advice on comparing two CSV files in bash:
file1.csv
300000493|300000323|300000323|300000000|2|0|12619|0|0|+000000000000043.000|15|0|49300|1|42|4
300315830|300315830|300000419|300000000|2|0|12619|0|0|+000000000004020.000|18|0|31583000|89|43|4
300000493|300000323|300000323|300000000|10|0|12619|0|0|+000000000000210.000|14|0|49300|1|43|4
300000493|300000323|300000323|300000000|16|0|12619|0|0|+000000000000014.000|16|0|49300|89|42|4
300146897|300146897|300000394|300000000|609|1|12619|0|0|+000000000000020.000|1|0|14689700|7|36|4
file2.csv
300000493|300000323|300000323|300000000|2|0|12619|0|0|+000000000000053.000|1|0|49300|1|42|4
300315830|300315830|300000419|300000000|2|0|12619|0|0|+000000000004020.000|18|0|49300|89|43|4
300000493|300000323|300000323|300000000|10|0|12619|0|0|+000000000000219.000|14|0|49300|1|43|5
The command diff -y file1.csv file2.csv shows output similar to what I am looking for:
300000493|300000323|300000323|300000000|2|0|12619|0|0|+000000000000043.000|15|0|49300|1|42|4 | 300000493|300000323|300000323|300000000|2|0|12619|0|0|+000000000000053.000|1|0|49300|1|42|4
300315830|300315830|300000419|300000000|2|0|12619|0|0|+000000000004020.000|18|0|31583000|89|43|4 | 300315830|300315830|300000419|300000000|2|0|12619|0|0|+000000000004020.000|18|0|49300|89|43|4
300000493|300000323|300000323|300000000|10|0|12619|0|0|+000000000000210.000|14|0|49300|1|43|4 | 300000493|300000323|300000323|300000000|10|0|12619|0|0|+000000000000219.000|14|0|49300|1|43|5
300000493|300000323|300000323|300000000|16|0|12619|0|0|+000000000000014.000|16|0|49300|89|42|4 <
300146897|300146897|300000394|300000000|609|1|12619|0|0|+000000000000020.000|1|0|14689700|7|36|4 <
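However, I am trying to get a more advanced output that marks the differences between cells with an asterisk * and adds dashes - where an entire row is missing on one side. In the end I want one output file per side (because afterwards I will convert each output CSV to HTML so that I can embed them in an HTML file), for example: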
file1.out.csv
300000493|300000323|300000323|300000000|2|0|12619|0|0|+000000000000043.000*|15|0|49300|1|42|4
300315830|300315830|300000419|300000000|2|0|12619|0|0|+000000000004020.000|18|0|31583000*|89|43|4
300000493|300000323|300000323|300000000|10|0|12619|0|0|+000000000000210.000*|14|0|49300|1|43|4*
300000493|300000323|300000323|300000000|16|0|12619|0|0|+000000000000014.000|16|0|49300|89|42|4
300146897|300146897|300000394|300000000|609|1|12619|0|0|+000000000000020.000|1|0|14689700|7|36|4
file2.out.csv
300000493|300000323|300000323|300000000|2|0|12619|0|0|+000000000000053.000*|1|0|49300|1|42|4
300315830|300315830|300000419|300000000|2|0|12619|0|0|+000000000004020.000|18|0|49300*|89|43|4
300000493|300000323|300000323|300000000|10|0|12619|0|0|+000000000000219.000*|14|0|49300|1|43|5*
-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-
-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-
I hope you can help me here. Thanks!
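For illustration, the behaviour described above could be sketched in bash with awk roughly as follows. This is only a minimal sketch under stated assumptions, not a tested solution: the function name mark_diff is hypothetical, rows are paired strictly by line number (diff -y may align rows differently on other data), and cells are compared as strings. Note that it stars every differing cell, so with the sample data the eleventh cell of the first row (15 vs 1) would also get an asterisk, although the example output above does not star it.

```shell
#!/usr/bin/env bash
# mark_diff FILE1 FILE2 OUT1 OUT2   (hypothetical helper name)
# Assumption: rows are paired by line number, not realigned the way diff does.
mark_diff() {
  awk '
    # Build a placeholder row of n dashes joined by the separator.
    function dashes(n,    i, s) {
      s = "-"
      for (i = 2; i <= n; i++) s = s sep "-"
      return s
    }
    BEGIN {
      sep = "|"
      f1 = ARGV[1]; f2 = ARGV[2]; out1 = ARGV[3]; out2 = ARGV[4]
      printf "" > out1; printf "" > out2     # create or truncate both outputs
      while (1) {
        r1 = (getline a < f1)                # > 0 while FILE1 still has rows
        r2 = (getline b < f2)                # > 0 while FILE2 still has rows
        if (r1 <= 0 && r2 <= 0) break
        if (r1 > 0 && r2 > 0) {
          n1 = split(a, x, sep); n2 = split(b, y, sep)
          n = (n1 > n2) ? n1 : n2
          la = ""; lb = ""
          for (i = 1; i <= n; i++) {
            # Appending "" forces a string comparison, so 1 and 01 differ.
            m = ((x[i] "") != (y[i] "")) ? "*" : ""
            if (i <= n1) la = la (i > 1 ? sep : "") x[i] m
            if (i <= n2) lb = lb (i > 1 ? sep : "") y[i] m
          }
          print la > out1; print lb > out2
        } else if (r1 > 0) {
          # Row exists only in FILE1: copy it, pad the other side with dashes.
          print a > out1
          print dashes(split(a, x, sep)) > out2
        } else {
          # Row exists only in FILE2.
          print dashes(split(b, y, sep)) > out1
          print b > out2
        }
      }
    }
  ' "$1" "$2" "$3" "$4"
}
```

Calling mark_diff file1.csv file2.csv file1.out.csv file2.out.csv would then write one marked file per side. The whole program runs in the BEGIN block, so awk reads the two inputs only through getline and never treats the arguments as regular input files.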
【Comments】:
- Maybe take a look at other tools, such as meld or tkdiff.
Tags: bash csv difference