【Question title】: Python sort and sum CSV
【Posted】: 2016-07-01 04:52:42
【Question】:

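I have a CSV file like this:

Datetime, Usage1, Project1
Datetime, Usage2, Project1
Datetime, Usage3, Project2
Datetime, Usage4, Project3

The goal is to sum up the usage for each project and produce a report like this:

Project1: Usage1 Usage2

Project2: Usage3

Project3: Usage4

I started with the following Python code, but it does not work properly: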

#/usr/bin/python

# obtain all Project values into new list project_tags:

project_tags = []
ifile = open("file.csv","r")
reader = csv.reader(ifile)
headerline = ifile.next()
for row in reader:
    project_tags.append(str(row[2]))
ifile.close()

# obtain sorted and unique list and put it into a new list project_tags2
project_tags2 = []
for p in list(set(project_tags)):
    project_tags2.append(p)


# open CSV file again and compare it with new unique list
ifile2 = open("file.csv","r")
reader2 = csv.reader(ifile2)
headerline = ifile2.next()

# Loop through both new list and a CSV file, and if they matches sum it:

sum_per_project = sum_per_project + int(row[29])
for project in project_tags2:
    for row in reader2:
        if row[2] == project:
            sum_per_project = sum_per_project + int(row[1])

Any input is appreciated!

Thanks in advance.

【Comments】:

    Tags: python csv datetime


    【Solution 1】:

    Try the following snippet:

    summary = {}
    
    with open("file.csv", "r") as fp:
        for line in fp:
            row = line.rstrip().split(',')
    
            key = row[2]
            if key in summary:
                summary[key] += (row[1].strip(),)
            else:
                summary[key] = (row[1].strip(),)
    
    for k in summary:
        print('{0}: {1}'.format(k, ' '.join(summary[k])))
    

    Given the sample data in your CSV file, it will print:

     Project1: Usage1 Usage2
     Project2: Usage3
     Project3: Usage4
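
Splitting each line on ',' can misparse a row if a field ever contains a quoted comma. Below is a minimal sketch of the same grouping done with the csv module instead. It assumes the three-column layout from the question (datetime, usage, project) and no header line; `group_usage` and the in-memory sample are hypothetical names used only for illustration.

```python
import csv
import io


def group_usage(fp):
    """Group the usage column (index 1) by the project column (index 2)."""
    summary = {}
    # skipinitialspace drops the blank that follows each comma in the sample.
    for row in csv.reader(fp, skipinitialspace=True):
        if len(row) < 3:
            continue  # ignore blank or incomplete lines
        summary.setdefault(row[2].strip(), []).append(row[1].strip())
    return summary


# In-memory stand-in for file.csv, using the sample rows from the question.
sample = io.StringIO(
    "Datetime, Usage1, Project1\n"
    "Datetime, Usage2, Project1\n"
    "Datetime, Usage3, Project2\n"
    "Datetime, Usage4, Project3\n"
)

for project, usages in group_usage(sample).items():
    print("{0}: {1}".format(project, " ".join(usages)))
```

With a real file, the same function could be called as `group_usage(open("file.csv", newline=""))`, ideally inside a `with` block.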
    

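    【Comments】:

      【Solution 2】:

      Here is an approach using defaultdict.

      Edit: thanks to @Saleem for reminding me of the with clause. Also, all we need to do is output the contents.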

      from collections import defaultdict
      import csv
      
      path = "file.csv"  # the CSV file from the question
      
      summary = defaultdict(list)
      with open(path, "r", newline="") as f:
          rows = csv.reader(f)
          header = next(rows)  # skip the header line (Python 3 spelling of rows.next())
          for (dte, usage, proj) in rows:
              summary[proj.strip()].append(usage.strip())
      
      # I just realized that all you needed to do was output them:
      for proj, usages in sorted(summary.items()):
          print(
              "%s: %s" % (proj, ' '.join(sorted(usages)))
          )
      

      This will print:

      Project1: Usage1 Usage2
      Project2: Usage3
      Project3: Usage4
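
The question's own code calls `int(row[1])`, which suggests the usage column may actually be numeric and the report should show a total per project. Here is a rough sketch of that variant, assuming integer usage values and a header line as in the question's code. The function name `sum_usage`, the header text, and the sample numbers are all made up for illustration.

```python
import csv
import io
from collections import defaultdict


def sum_usage(fp, has_header=True):
    """Return {project: total usage}, assuming usage is an integer column."""
    totals = defaultdict(int)
    rows = csv.reader(fp, skipinitialspace=True)
    if has_header:
        next(rows, None)  # skip the header; None avoids an error on an empty file
    for row in rows:
        if len(row) < 3:
            continue  # ignore blank or incomplete lines
        totals[row[2].strip()] += int(row[1])
    return dict(totals)


# Hypothetical data: the question does not show real usage numbers.
sample = io.StringIO(
    "Datetime, Usage, Project\n"
    "2016-07-01, 10, Project1\n"
    "2016-07-01, 5, Project1\n"
    "2016-07-02, 7, Project2\n"
    "2016-07-03, 3, Project3\n"
)

for project, total in sorted(sum_usage(sample).items()):
    print("%s: %d" % (project, total))
```

If the usage values can be fractional, `int` would need to be swapped for `float` or `decimal.Decimal`.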
      

      【Comments】:
