Writing a csv and then checking values in columns and writing additional data: answers

【Question title】: Writing a csv and then checking values in columns and writing additional data
【Posted】: 2019-11-11 08:40:20
【Question description】:

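So far I have written a long list of ID numbers (about 45,000 rows), each with an additional reference value, to a csv file. The data is structured as follows: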

12345678 | 2
56789012 | 10
90123456 | 46
...

The code I have written for this so far looks like this:

import csv
import os

def list_writer():
    # Write one (ID, reference) pair per row, separated by semicolons.
    with open(os.path.join(csv_dir, csv_filename), mode="w", newline="") as csvfile:
        writer = csv.writer(csvfile, lineterminator="\n", delimiter=";")
        for row in ID_list:
            writer.writerow(row)

list_writer()

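Each ID number (left column) is associated with a reference number in the range 1-100 (right column). I have several additional lists that associate each reference number with further information (price, quantity, and so on).

My goal now is to go through all the reference numbers in the second column of the long csv file I have written and write the additional attributes into the following columns. I have done some digging on StackExchange, but nothing has worked so far. Thanks in advance!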

【Question comments】:

The lists that associate each of the 100 reference numbers with the other attributes I mentioned are already stored as Python lists.

Tags: python list csv

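【Solution 1】:

This sounds like something I would do in a relational (i.e. SQL) database, where there are plenty of tools for validating your data and making sure it stays consistent.

If you want to do this in Python, you could do something like the following: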

import csv

# Put each list of attributes into a dictionary keyed by the reference number.
# Assumes PRICE_list has the form [(ref1, price1), (ref2, price2), ...]
ref_prices = {}
for ref, price in PRICE_list:
    ref_prices[ref] = price

# Do the same for each additional list,
# using a dict comprehension as a shorter syntax for the loop above
ref_quantity = {ref: qty for ref, qty in QTY_list}

# Combine all of the above and write the result into a file
with open(filename, 'w', newline='') as fd:
    out = csv.writer(fd, delimiter=';')
    for id_number, ref in ID_list:
        out.writerow((id_number, ref, ref_prices[ref], ref_quantity[ref]))
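If the csv file from the question already exists on disk, the same dictionary lookup can be applied while reading it back, which is closer to what the question literally asks for. The following is only a minimal sketch under assumptions: the function name `append_columns` and its parameters `lookups` and `missing` are hypothetical, the file is assumed to be semicolon-delimited as in the question, and the reference numbers are assumed to be integers in the lookup dictionaries. It uses `dict.get` so that a reference number without an entry produces an empty cell rather than a `KeyError`.

```python
import csv

def append_columns(src_path, dst_path, lookups, missing=""):
    """Copy src_path to dst_path, appending one column per lookup dict.

    `lookups` is a list of dictionaries keyed by reference number (hypothetical
    helper, not part of the original answer).
    """
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src, delimiter=";")
        writer = csv.writer(dst, delimiter=";", lineterminator="\n")
        for row in reader:
            if not row:
                continue  # skip blank lines
            # csv.reader yields strings, so convert the reference before the lookup
            ref = int(row[1])
            extras = [lookup.get(ref, missing) for lookup in lookups]
            writer.writerow(row + extras)
```

It could be called as `append_columns("ids.csv", "ids_extended.csv", [ref_prices, ref_quantity])`, where the file names are placeholders. Writing to a second file instead of overwriting the original keeps the input intact if something goes wrong halfway through.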

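【Comments】:

【Solution 2】:

This is a perfect use case for SQL. If you want SQL-like functionality in Python, using pandas is usually a good idea. It is convenient, easy to write and read, and fast. For your case, assuming the additional values are stored in a list of tuples or in a dictionary: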

import pandas as pd


csv = [
    (1, 10),
    (2, 20),
    (3, 30),
]

csv_df = pd.DataFrame(csv, columns=["id", "reference"])

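# This would be the data you have in your csv. To actually load it from the
# csv located at `filepath`, use
#
#      pd.read_csv(filepath, sep=";", header=None, names=["id", "reference"])
#
# (`sep` and `header` match the semicolon-delimited, header-less file from the question)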

additional_data = [
    (1, "a"),
    (2, "b"),
    (3, "c"),
]  # This could also be a dictionary

additional_df = pd.DataFrame(additional_data, columns=["id", "name"])

final_df = csv_df.merge(additional_df, on="id")

Then we get

>>> final_df
   id  reference name
0   1         10    a
1   2         20    b
2   3         30    c
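Where pandas cannot be installed, the SQL-style join that both answers allude to could also be sketched with the `sqlite3` module from the Python standard library. This is an illustration under assumptions, not a tested recipe for any particular environment: the table names `ids` and `prices`, the variable names, and the sample prices are made up, and whether `sqlite3` is available depends on the Python build in use. A `LEFT JOIN` keeps every ID row even when its reference number has no matching price.

```python
import sqlite3

# Sample data in the shape described in the question (prices are made up)
id_list = [(12345678, 2), (56789012, 10), (90123456, 46)]
price_list = [(2, 9.5), (10, 3.0)]

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE ids (id INTEGER, ref INTEGER)")
conn.execute("CREATE TABLE prices (ref INTEGER PRIMARY KEY, price REAL)")
conn.executemany("INSERT INTO ids VALUES (?, ?)", id_list)
conn.executemany("INSERT INTO prices VALUES (?, ?)", price_list)

# LEFT JOIN keeps IDs whose reference has no price; the price is then None
rows = conn.execute(
    "SELECT ids.id, ids.ref, prices.price "
    "FROM ids LEFT JOIN prices ON prices.ref = ids.ref "
    "ORDER BY ids.rowid"
).fetchall()
conn.close()
```

The resulting `rows` list can be passed directly to `csv.writer(...).writerows(rows)`; note that the csv module writes `None` as an empty field.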

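【Comments】:

Thanks for the reply. Unfortunately, I am using Python inside ANSA, and the pandas module is not available there, so I need to do this in plain Python.
@najusten If you want to use pandas, you have to install it first. I don't work with ANSA, but this post suggests that you can install third-party packages there as well: qd-eng.de/index.php/2018/03/02/…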