【Question title】: Python for loop enumerate
【Posted】: 2019-06-26 12:39:19
【Question】:

I am reading multiple csv files and combining them into a single csv file. The expected result of the combined data looks like this:

0   4   6   8   10  12
1   2   5   4   2   1  
5   3   0   1   5   10
....

In the code below, I intended the column names to be 0, 4, 6, 8, 10, 12.

for indx, file in enumerate(files_File1):
    if file.endswith('csv'):  # only process CSV files in the designated folder
        filepath = os.path.join(folder_File1, file)  # full path to the file
        current = pd.read_csv(filepath, header=None)  # no header row, so columns are labelled 0, 1, ...
        if indx == 0:
            mydata_File1 = current.copy()
            mydata_File1.columns.values[1] = 4
            print(mydata_File1.columns.values)
        else:
            mydata_File1[2*indx+4] = current.iloc[:,1]
            print(mydata_File1.columns.values)

Instead, the result contains the columns 0, 2, 4, 6, 8, 10, 12, as shown below.

0   4   2   6   8   10  12
1   2       5   4   2   1  
5   3       0   1   5   10
....

I am not sure what is causing the column named "2".

Any ideas?

【Comments】:

Tags: python python-3.x pandas


【Solution 1】:

If you really just want to merge the .csv files, you don't need pandas.

#! python3
import glob

folder_File1 = r"C:\Users\Public\Documents\Python\CombineCSVFiles"
csv_only = r"\*.csv"
files_File1 = glob.glob(f'{folder_File1}{csv_only}')
new_csv = f'{folder_File1}\\newcsv.csv'

lines = []
for file in files_File1:
    if file == new_csv:
        continue  # skip the output file left over from a previous run
    with open(file) as filein:
        for line in filein:
            line = line.strip()  # or some other preprocessing
            lines.append(line)  # storing everything in memory!

with open(new_csv, 'w') as out_file:
    out_file.writelines(line + '\n' for line in lines)
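
The comment in the loop above points out that every line is held in memory before anything is written. A minimal sketch of a streaming variant follows, which writes each line as soon as it is read. The helper name `combine_csv_lines` and its `out_name` parameter are made up for illustration and are not part of the original answer.

```python
import glob
import os


def combine_csv_lines(folder, out_name="newcsv.csv"):
    """Copy the lines of every .csv file in `folder` into one output file."""
    out_path = os.path.join(folder, out_name)
    # List the inputs first and leave out the output file from an earlier run.
    sources = sorted(
        path for path in glob.glob(os.path.join(folder, "*.csv"))
        if os.path.abspath(path) != os.path.abspath(out_path)
    )
    with open(out_path, "w") as out_file:
        for path in sources:
            with open(path) as filein:
                for line in filein:
                    # Only one line is held in memory at a time.
                    out_file.write(line.strip() + "\n")
    return out_path
```

Like the code above, this stacks the files one after another. It does not place their columns side by side.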

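【Comments】:

    【Solution 2】:

    If for some reason you do need pandas, this will work. Your code references mydata_File1.columns.values, which holds the column names, not the values in the columns. If this does not answer your question, please provide more detail, as @juanpa.arrivillaga's comment requests.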

    #! python3
    import os
    import pandas as pd
    import glob
    
    folder_File1 = r"C:\Users\Public\Documents\Python\CombineCSVFiles"
    csv_only = r"\*.csv"
    files_File1 = glob.glob(f'{folder_File1}{csv_only}')
    new_csv = f'{folder_File1}\\newcsv.csv'
    
    
    frames = []
    
    for file in files_File1:
        if file == new_csv:
            continue  # skip the output file left over from a previous run
        current = pd.read_csv(file, header=None)  # read one csv file from the designated folder
        print(current)
        frames.append(current)
    
    # DataFrame.append was removed in pandas 2.0; pd.concat stacks the rows instead
    mydata_File1 = pd.concat(frames, ignore_index=True)
    print(mydata_File1.values)
    
    mydata_File1.to_csv(new_csv)
    

    【Comments】:

    • Hi, thanks for tidying up the code. However, I don't see the column names that should be produced by 2*indx+4.
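
The column naming that this comment asks about can be sketched as follows. This is an illustration under stated assumptions, not the answerer's method: it assumes every file has the shared first column in position 0 and the data of interest in position 1. The helper name `combine_second_columns` is hypothetical. Two details differ from the code in the question. The first column is renamed with `DataFrame.rename` instead of writing into `columns.values`, which modifies an array behind the index rather than going through a documented renaming API. Only the first two columns of the first file are kept. If the first file parses into a third, empty column (for example because of a trailing delimiter), copying the whole frame would carry it along as column 2, which is one possible explanation for the stray column in the question.

```python
import io

import pandas as pd


def combine_second_columns(sources):
    """Place column 1 of every source side by side, named 4, 6, 8, ..."""
    combined = None
    for indx, source in enumerate(sources):
        current = pd.read_csv(source, header=None)
        if combined is None:
            # Keep only the first two columns, then rename column 1 to 4.
            combined = current.iloc[:, :2].rename(columns={1: 4})
        else:
            combined[2 * indx + 4] = current.iloc[:, 1]
    return combined


# In-memory stand-ins for csv files; the first one has a trailing comma,
# which pandas reads as an extra, empty third column.
samples = ["1,2,\n5,3,\n", "1,5\n5,0\n", "1,4\n5,1\n"]
result = combine_second_columns(io.StringIO(text) for text in samples)
print(result.columns.tolist())  # [0, 4, 6, 8]
```

With real files, the same function can be given the list of paths returned by `glob.glob`, after leaving out the output file.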