Sorting every column in a csv file: answers

【Question title】: Sorting every column in a csv file
【Posted】: 2014-07-26 05:31:15
【Question】:

I have a csv file with the following entries:

    Year,Month,Company A, Company B,Company C, .............Company N
    1990, Jan, 10, 15, 20, , ..........,50
    1990, Feb, 10, 15, 20, , ..........,50

I am trying to sort the csv file data by the Company A column, then by each following column in turn, through Company N.

My code works on the first pass through the loop, but fails on the second pass.

    try:
        reader = csv.DictReader(open(self.filename,'r')) #Try and open the file with csv dictreader
    except IOError:
        print "Error Opening File -- Check if file exists"

    ncols = reader.next()
    print ncols.keys()
    for key in ncols.keys():
        if key != 'Month' and key != 'Year':
            print key
            result = sorted(reader, key=lambda d: float(d[key]))
            result = result[-1]
            #print "Year " ,
            print result['Year'],
            #print "Month ",
            print result ['Month'],
            print key,
            print result[key]

Output:

    Company-E
    2008 Oct Company-E 997
    Company-D

    Traceback (most recent call last):
    File "<pyshell#105>", line 1, in <module>
    read.ParseData()
    File "C:/Users/prince/Desktop/CsvRead.py", line 55, in ParseData
    result = result[-1]
    IndexError: list index out of range

【Comments】:

How are you sorting it?
I want to sort the data across the whole file by one column, then by the next column, and so on.

Tags: python csv for-loop dictionary output

【Solution 1】:

I suggest using pandas:

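    import pandas
    df = pandas.read_csv(filename)
    for col in df.columns:
        if col != 'Month' and col != 'Year':
            # DataFrame.sort() was removed from pandas; sort_values() replaces it.
            # Each pass re-sorts the whole frame by one company column.
            df = df.sort_values(col)
    df.to_csv(out_filename, index=False)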

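【Comments】:

I can't use any module that doesn't ship with Python itself.

【Solution 2】:

The code does work once two lines are added. sorted(reader, ...) consumes every row of the reader, so on the second pass through the loop there is nothing left to sort and result[-1] raises the IndexError. I needed to rewind the file to its initial position and skip the header row: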

    fh.seek(0)
    fh.next()
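To make the exhaustion visible, here is a minimal sketch written for Python 3 (so `next(fh)` stands in for the `fh.next()` used in this answer). It reads an in-memory file twice through one `csv.DictReader`, first without rewinding and then with the rewind. The sample data and both function names are made up for illustration.

```python
import csv
import io

# Hypothetical sample data standing in for the real file.
SAMPLE = "Year,Month,Company-A,Company-B\n1990,Jan,10,15\n1990,Feb,30,5\n"

def count_rows_twice(text):
    """Count rows in two passes over one DictReader, without rewinding."""
    fh = io.StringIO(text)
    reader = csv.DictReader(fh)
    first = len(list(reader))   # consumes every data row
    second = len(list(reader))  # nothing is left to read
    return first, second

def count_rows_with_rewind(text):
    """Same two passes, but rewind the file between them."""
    fh = io.StringIO(text)
    reader = csv.DictReader(fh)
    first = len(list(reader))
    fh.seek(0)  # back to the start of the file
    next(fh)    # skip the header line; the reader already knows the field names
    second = len(list(reader))
    return first, second

print(count_rows_twice(SAMPLE))        # (2, 0)
print(count_rows_with_rewind(SAMPLE))  # (2, 2)
```

The empty second pass is what produced the empty list, and therefore the IndexError, in the question.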

Here is the working part of the code:

        actualResult = {}
        try:
            fh = open(filename,'r')
            reader = csv.DictReader(fh) #Try and open the file with csv dictreader

            #Get the field names in the file:
            fields = set(reader.fieldnames)
            # Both 'Year' and 'Month' are read below, so reject the file if either is missing
            if not fields or 'Year' not in fields or 'Month' not in fields:
                raise BadInputFile(filename)
            companies = fields - {'Year', 'Month'}
            print companies
            for name in companies:
                #sorting the csv file data based on column data with Company Name as Key
                result = sorted(reader, key=lambda d: float(d[name]), reverse=True)
                result = result[0]
                tup = (result[name],result['Year'],result['Month'])
                if name not in actualResult:
                    actualResult[str(name)] = tup
                else:
                    raise BadInputFile(filename)
                fh.seek(0) #rewinding the file to initial position
                fh.next()  #Moving to the 1st row
        except (IOError, BadInputFile) as e:
            print "Error: ", str(e) # Invalid input file
            raise


        return actualResult
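An alternative that avoids rewinding altogether is to read the rows into a list once and scan that list for each company. The sketch below is written for Python 3 and uses only the standard library. It swaps the sort for `max()`, since only the largest entry per column is kept, and it assumes every company cell is numeric and that the file has at least one data row. The function name `best_month_per_company` and the sample data are hypothetical.

```python
import csv
import io

def best_month_per_company(fh):
    """Map each company column to (value, year, month) of its largest entry."""
    # skipinitialspace=True drops the blank that follows each comma in the sample file
    reader = csv.DictReader(fh, skipinitialspace=True)
    rows = list(reader)  # read the file once; a list can be scanned repeatedly
    companies = [name for name in reader.fieldnames
                 if name not in ("Year", "Month")]
    best = {}
    for name in companies:
        # max() finds the top row directly, so no full sort is needed
        top = max(rows, key=lambda row: float(row[name]))
        best[name] = (top[name], top["Year"], top["Month"])
    return best

# Hypothetical sample data standing in for the real file.
sample = io.StringIO(
    "Year,Month,Company-A,Company-B\n"
    "1990, Jan, 10, 15\n"
    "1990, Feb, 30, 5\n"
)
print(best_month_per_company(sample))
# {'Company-A': ('30', '1990', 'Feb'), 'Company-B': ('15', '1990', 'Jan')}
```

Holding all rows in memory is fine for a file of monthly figures; for a very large file the rewind approach above keeps memory use lower.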

【Comments】: