【Posted】: 2018-12-16 12:16:16
【Question】:
I have two CSV files in the /tmp/ directory.
One CSV file is generated by a Python script, and the second is the master file to match against.
>>> import json
>>> resp = { "status":"success", "msg":"", "data":[ { "website":"https://www.blahblah.com", "severity":"low", "location":"unknown", "asn_number":"AS4134 Chinanet", "longitude":121.3997000000, "epoch_timestamp":1530868957, "id":"c1e15eccdd1f31395506fb85" }, { "website":"https://www.jhonedoe.co.uk/sample.pdf", "severity":"low", "location":"unknown", "asn_number":"AS4134 Chinanet", "longitude":120.1613998413, "epoch_timestamp":1530868957, "id":"933bf229e3e95a78d38223b2" } ] }
>>> response = json.loads(json.dumps(resp))
>>> KEYS = 'website', 'asn_number' , 'severity'
>>> x = []
>>> for attribute in response['data']:
...     csv_response = ','.join(attribute[key] for key in KEYS)
...     with open('/tmp/processed_results.csv', 'a') as score:
...         score.write(csv_response + '\n')
...
$cat processed_results.csv
https://www.blahblah.com,AS4134 Chinanet,low
https://www.jhonedoe.co.uk/sample.pdf,AS4134 Chinanet,low
The master file to match against:
$cat master_meta.csv
http://download2.freefiles-10.de,AS24940 Hetzner Online GmbH,high
https://www.jhonedoe.co.uk/sample.pdf,AS4134 Chinanet,low
http://download2.freefiles-11.de,AS24940 Hetzner Online GmbH,high
www.solener.com,AS20718 ARSYS INTERNET S.L.,low
https://www.blahblah.com,AS4134 Chinanet,low
www.telewizjairadio.pl,AS29522 Krakowskie e-Centrum Informatyczne JUMP Dziedzic,high
I know how to use grep to compare the two files and get the matching lines.
$grep -Ff processed_results.csv master_meta.csv
https://www.jhonedoe.co.uk/sample.pdf,AS4134 Chinanet,low
https://www.blahblah.com,AS4134 Chinanet,low
Any suggestions on how to run grep/sed/awk commands through a Python subprocess call to compare the two files and capture the matching lines in a variable?
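One possible way to do what is asked here, sketched under a few assumptions: grep is on the PATH, both files are readable, and the helper name grep_matching_lines is made up for illustration. It runs the same grep -F -f command through subprocess.run and returns the matched lines as a list.

```python
import subprocess


def grep_matching_lines(patterns_path, master_path):
    """Return lines of master_path that contain any line of patterns_path.

    Hypothetical helper: it shells out to `grep -F -f`, so matching is
    fixed-string substring matching, exactly as in the shell command.
    """
    result = subprocess.run(
        # Passing a list avoids shell quoting issues with the file names.
        ["grep", "-F", "-f", patterns_path, master_path],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output to str instead of bytes
    )
    # grep exits with 0 when lines matched, 1 when nothing matched,
    # and a value greater than 1 on errors such as a missing file.
    if result.returncode > 1:
        raise RuntimeError(result.stderr.strip())
    return result.stdout.splitlines()
```

With the two files from the question, grep_matching_lines('/tmp/processed_results.csv', '/tmp/master_meta.csv') should give the same two lines that the grep command prints, in the order they appear in master_meta.csv. Note that a blank line in the pattern file makes grep -F match every line of the master file.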
【Comments】:
- Why call bash commands from Python when you can use the re module?
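In the spirit of this comment, the comparison can also be done without leaving Python. The sketch below is only one option, and it swaps in plain set membership rather than the re module, since the rows here are compared as fixed strings. The function name matching_lines is hypothetical, and the match is on whole lines (closer to grep -Fx than to grep -F).

```python
def matching_lines(patterns_path, master_path):
    """Return lines of master_path that also appear in patterns_path.

    Hypothetical helper: compares whole lines for equality, ignoring
    the trailing newline and skipping blank pattern lines.
    """
    with open(patterns_path) as patterns:
        wanted = {line.rstrip("\n") for line in patterns if line.strip()}
    with open(master_path) as master:
        # Keep the master file's order, as grep does.
        return [
            line.rstrip("\n")
            for line in master
            if line.rstrip("\n") in wanted
        ]
```

For the files shown in the question, matching_lines('/tmp/processed_results.csv', '/tmp/master_meta.csv') should return the same two rows as the grep command. If substring matching is needed instead, the membership test could be replaced with a check such as any(p in line for p in wanted).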
Tags: python bash awk sed subprocess