Merge two CSV files in Python iteratively: answers

【Question title】: Merge two CSV files in Python iteratively
【Posted】: 2015-10-09 22:15:11
【Question description】:

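I have a set of data saved in multiple .csv files with a fixed number of columns. Each column corresponds to a different measurement.

I want to add a header to each file. The header is the same for all files and consists of three lines, two of which identify the columns they correspond to.

I thought I could save the header in a separate .csv file and then merge it with each data file iteratively using a for loop.

How can I do this in Python? I am new to the language.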

【Question discussion】:

Tags: python csv merge header

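【Solution 1】:

Yes, you can do this easily with pandas. It will be faster and easier than your current idea, which may cause problems.

Three simple commands read the files, merge them, and write the result to a new file:

pandas.read_csv()
pandas.merge()
DataFrame.to_csv()

You can read about the parameters you need to use and more details about them here.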
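
A minimal sketch of the read, combine, write sequence for the header case in the question. It assumes the header file and the data file have the same number of columns, and it stacks the rows with `pandas.concat` rather than `pandas.merge`, which performs a key-based join. The function name `prepend_header` and the file names in the usage line are hypothetical.

```python
import pandas as pd


def prepend_header(header_path, data_path, out_path):
    """Write the rows of header_path followed by the rows of data_path to out_path."""
    # header=None treats every line as data, so no line is consumed as column names.
    # dtype=str and keep_default_na=False keep the cell text unchanged.
    read_options = dict(header=None, dtype=str, keep_default_na=False)
    header = pd.read_csv(header_path, **read_options)
    data = pd.read_csv(data_path, **read_options)

    # Stack the header rows on top of the data rows.
    combined = pd.concat([header, data], ignore_index=True)

    # header=False and index=False avoid writing pandas' own labels.
    combined.to_csv(out_path, header=False, index=False)
```

Usage, with placeholder file names: `prepend_header("header.csv", "data_1.csv", "data_1_with_header.csv")`.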

【Discussion】:

【Solution 2】:

For your case, you may first need to create new files that contain
the header. Then you would run another loop to
add the rows, skipping the header line.

with open("data_out.csv", "a") as fout:
    # First file: the header file, copied in full
    with open("data.csv") as f:
        for line in f:
            fout.write(line)

    # Second file: copied without its first line
    with open("data_2.csv") as f:
        next(f)  # skip the first line
        for line in f:
            fout.write(line)
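
The same idea can be extended to the loop over many data files that the question asks about. The following is only a sketch using the standard library: the function name `add_header_to_files`, its parameters, and the folder names in the usage line are hypothetical, and it assumes the output folder is different from the data folder so that no input file is overwritten. Unlike the code above, it keeps every line of each data file, since the data files in the question have no header yet.

```python
import shutil
from pathlib import Path


def add_header_to_files(header_path, data_dir, out_dir, pattern="*.csv"):
    """Copy each data file in data_dir to out_dir with the header file prepended."""
    header_path = Path(header_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Read the header once and make sure it ends with a line break,
    # so the first data row does not get glued to the last header line.
    header_text = header_path.read_text()
    if not header_text.endswith("\n"):
        header_text += "\n"

    written = []
    for data_path in sorted(Path(data_dir).glob(pattern)):
        if data_path.resolve() == header_path.resolve():
            continue  # the header file itself is not a data file
        out_path = out_dir / data_path.name
        with open(out_path, "w") as fout, open(data_path) as fin:
            fout.write(header_text)
            shutil.copyfileobj(fin, fout)  # stream the data rows unchanged
        written.append(out_path)
    return written
```

Usage, with placeholder paths: `add_header_to_files("header.csv", "data", "data_with_header")`.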

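【Discussion】:

【Solution 3】:

Rather than running a for loop that appends two files at a time across many files, a simpler solution is to put all the CSV files you want to merge into one folder and give the program the path to it. This merges all the CSV files into a single CSV file. (Note: every file must have the same attributes, that is, the same columns.)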

import os
import pandas as pd

# Path to the folder containing the CSV files (placeholder, set it to your own folder)
path = "csv_folder"
# Name of the merged output file (placeholder, kept outside the input folder)
csvfile = "merged.csv"

# Collect the names of all CSV files in the folder
filenames = [item for item in sorted(os.listdir(path)) if item.endswith(".csv")]

# Read each file into a DataFrame, joining the folder path to the file name
frames = [pd.read_csv(os.path.join(path, f)) for f in filenames]

# Stack all DataFrames into one (DataFrame.append was removed in pandas 2.0)
df1 = pd.concat(frames, ignore_index=True)

# Write the combined DataFrame to a single CSV file
df1.to_csv(csvfile, encoding='utf-8', index=False)
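
The note above that every file must have the same attributes can be checked before merging. This is a sketch under that reading of the note (identical column names in identical order). The helper name `concat_same_columns` is hypothetical.

```python
import pandas as pd


def concat_same_columns(paths):
    """Concatenate CSV files, raising ValueError if their columns differ."""
    frames = []
    expected = None
    for p in paths:
        df = pd.read_csv(p)
        columns = list(df.columns)
        if expected is None:
            expected = columns  # the first file defines the expected columns
        elif columns != expected:
            raise ValueError(f"{p}: columns {columns} differ from {expected}")
        frames.append(df)
    if not frames:
        raise ValueError("no input files given")
    return pd.concat(frames, ignore_index=True)
```

Failing early in this way avoids a merged file in which mismatched columns are silently filled with empty values.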

【Discussion】: