【Posted】: 2017-05-24 20:21:39
【Question】:
I have 2 .csv datasets from the same source. I am trying to check whether any item from the first dataset still exists in the second dataset.
#!/usr/bin/python
import csv
import json

import click


@click.group()
def cli(*args, **kwargs):
    """Command line tool to compare and generate a report of items that still persist from one report to the next."""
    pass


@click.command(help='Compare the keysets and return a list of old keys still active in the new keyset.')
@click.option('--inone', '-i', default='keys.csv', help='Specify the file of the old keyset')
@click.option('--intwo', '-i2', default='keys2.csv', help='Specify the file of the new keyset')
@click.option('--output', '-o', default='results.json', help='--output, -o, Sets the name of the output.')
def compare(inone, intwo, output):
    csvfile = open(inone, 'r')
    csvfile2 = open(intwo, 'r')
    jsonfile = open(output, 'w')
    reader = csv.DictReader(csvfile)
    comparator = csv.DictReader(csvfile2)
    for line in comparator:
        for row in reader:
            if row == line:
                # Match found: write the row to the report as one JSON line.
                print('#', end='')
                json.dump(row, jsonfile)
                jsonfile.write('\n')
            print('|', end='')
        print('-', end='')


cli.add_command(compare)

if __name__ == '__main__':
    cli()
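Suppose each csv file contains 20 items. The script currently iterates 40 times and then stops, whereas I expect it to iterate 400 times (every item in one file compared against every item in the other) and produce a report of the items that remain.
Everything except the iteration seems to work. Does anyone have an idea for a better approach?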
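One likely explanation for the 40 iterations: a csv.DictReader is a one-shot iterator over the file, so the inner loop reads all 20 rows during the first pass of the outer loop and yields nothing on the remaining 19 passes (20 inner steps plus 20 outer steps). Below is a minimal sketch of one possible alternative that avoids the nested loop by loading the new file's rows into a set and then making a single pass over the old file. The helper names row_key, persisting_rows and write_report are hypothetical, not part of the original script, and the sketch assumes every row has the same number of fields as the header.

```python
import csv
import json


def row_key(row):
    # Dicts are not hashable, so build a hashable key from the row's
    # (column, value) pairs. Assumes well-formed rows (no extra fields).
    return tuple(sorted(row.items()))


def persisting_rows(old_path, new_path):
    """Return the rows of old_path that also appear, field for field, in new_path."""
    with open(new_path, newline='') as new_file:
        new_keys = {row_key(row) for row in csv.DictReader(new_file)}
    with open(old_path, newline='') as old_file:
        return [dict(row) for row in csv.DictReader(old_file)
                if row_key(row) in new_keys]


def write_report(rows, output_path):
    # One JSON object per line, matching the output format of the original script.
    with open(output_path, 'w') as out:
        for row in rows:
            json.dump(row, out)
            out.write('\n')
```

Inside compare, this could be called as write_report(persisting_rows(inone, intwo), output). A smaller change to the original code would be to materialize the inner reader once with rows = list(reader) before the loops and iterate over rows instead, which gives the expected 400 comparisons at the cost of the quadratic loop.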
【Discussion】:
Tags: python-3.x loops csv iterator iterate