【Question Title】: Sorting every column in a csv file
【Posted】: 2014-07-26 05:31:15
【Question】:

I have a csv file with the following entries:

    Year,Month,Company A, Company B,Company C, .............Company N
    1990, Jan, 10, 15, 20, , ..........,50
    1990, Feb, 10, 15, 20, , ..........,50

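I am trying to sort the csv data by the Company A column, then by the next column, and so on through Company N.

My code works on the first pass through the loop but fails on the second.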

    try:
        reader = csv.DictReader(open(self.filename,'r')) #Try and open the file with csv dictreader
    except IOError:
        print "Error Opening File -- Check if file exists"

    ncols = reader.next()
    print ncols.keys()
    for key in ncols.keys():
        if key != 'Month' and key != 'Year':
            print key
            result = sorted(reader, key=lambda d: float(d[key]))
            result = result[-1]
            #print "Year " ,
            print result['Year'],
            #print "Month ",
            print result ['Month'],
            print key,
            print result[key]

Output:

    Company-E
    2008 Oct Company-E 997
    Company-D

    Traceback (most recent call last):
    File "<pyshell#105>", line 1, in <module>
    read.ParseData()
    File "C:/Users/prince/Desktop/CsvRead.py", line 55, in ParseData
    result = result[-1]
    IndexError: list index out of range

【Comments】:

  • How are you sorting?
  • I want to sort the data in the whole file by one column, then by the next column, and so on.

Tags: python csv for-loop dictionary output


【Solution 1】:

I suggest using pandas:

    import pandas

    df = pandas.read_csv(filename)
    for col in df.columns:
        if col != 'Month' and col != 'Year':
            # DataFrame.sort() was removed from pandas; sort_values() replaces it
            df = df.sort_values(col)
    df.to_csv(out_filename, index=False)

【Comments】:

  • I can't use any module that doesn't ship with Python itself.
【Solution 2】:

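The code does work once two lines are added. sorted(reader, ...) consumes every row of the reader, so on the next pass through the loop it sorts an empty list and indexing the result raises IndexError. The file has to be rewound to its initial position, and the header line skipped, before each subsequent sort:

    fh.seek(0)
    fh.next()

(In Python 3 the second line is written next(fh).)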
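
To make the failure and the fix concrete, here is a minimal sketch in Python 3 syntax. The in-memory data and the column names Company-A and Company-B are made up for illustration; io.StringIO stands in for the real file so the snippet runs on its own.

```python
import csv
import io

# Hypothetical in-memory data standing in for the real csv file
fh = io.StringIO("Year,Month,Company-A,Company-B\n"
                 "1990,Jan,10,15\n"
                 "1990,Feb,30,5\n")
reader = csv.DictReader(fh)

# The first sort reads every row out of the reader
first = sorted(reader, key=lambda d: float(d["Company-A"]))

# The reader is now exhausted, so this sorts nothing and returns []
second = sorted(reader, key=lambda d: float(d["Company-B"]))

fh.seek(0)  # rewind the underlying file object
next(fh)    # skip the header line; the reader keeps the field names it parsed earlier
third = sorted(reader, key=lambda d: float(d["Company-B"]))

print(len(first), len(second), len(third))  # prints: 2 0 2
```

Indexing the empty list held in second (for example second[-1]) is what produces the IndexError shown in the question.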

Here is the working part of the code:

        actualResult = {}
        try:
            fh = open(filename, 'r')
            reader = csv.DictReader(fh)  # each row becomes a dict keyed by the header names

            # Get the field names in the file; both 'Year' and 'Month' must be present
            fields = set(reader.fieldnames or [])
            if 'Year' not in fields or 'Month' not in fields:
                raise BadInputFile(filename)
            companies = fields - {'Year', 'Month'}
            print companies
            for name in companies:
                # Sort the rows by this company's column, largest value first
                result = sorted(reader, key=lambda d: float(d[name]), reverse=True)
                result = result[0]
                tup = (result[name], result['Year'], result['Month'])
                if name not in actualResult:
                    actualResult[str(name)] = tup
                else:
                    raise BadInputFile(filename)
                fh.seek(0)  # sorted() consumed every row, so rewind the file
                fh.next()   # skip the header line so it is not read as data
        except (IOError, BadInputFile) as e:
            print "Error: ", str(e)  # invalid input file
            raise


        return actualResult
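
As an alternative to rewinding the file on every pass, the rows can be read into a list once and scanned repeatedly. The following is only a sketch in Python 3 syntax, not the code from the answer above: the function name max_per_company, the sample data, and the use of skipinitialspace=True (to tolerate the spaces after commas seen in the question's sample) are all assumptions made for illustration. Because only the largest value per column is needed, max() is used in place of a full sort.

```python
import csv
import io

def max_per_company(fh):
    """Return {company: (value, year, month)} for the largest value in each company column."""
    # skipinitialspace=True drops the blank after each comma (an assumption about the file)
    reader = csv.DictReader(fh, skipinitialspace=True)
    rows = list(reader)  # read the file once; a list can be scanned as often as needed
    companies = [f for f in (reader.fieldnames or []) if f not in ("Year", "Month")]
    best = {}
    for name in companies:
        # max() finds the top row without sorting the whole list
        top = max(rows, key=lambda d: float(d[name]))
        best[name] = (top[name], top["Year"], top["Month"])
    return best

# Hypothetical sample data; a real call would pass an open file object
sample = io.StringIO("Year, Month, Company-A, Company-B\n"
                     "1990, Jan, 10, 15\n"
                     "1990, Feb, 30, 5\n")
result = max_per_company(sample)
print(result)
```

This keeps the whole file in memory, and max() raises ValueError when the file has a header but no data rows, so a caller would need to handle that case.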

【Comments】:
