【Question Title】: Consolidating data based on two conditions
【Posted】: 2017-06-20 15:31:08
【Question】:

I have four columns of data that I am trying to consolidate based on two conditions. The data is formatted as follows:

CountyName  Year    Oil Gas
ANDERSON    2010    1358    0
ANDERSON    2010    621746  4996766
ANDERSON    2011    1587    0
ANDERSON    2011    633120  5020877
ANDERSON    2012    55992   387685
ANDERSON    2012    1342    0
ANDERSON    2013    635572  3036578
ANDERSON    2013    4873    0
ANDERSON    2014    656440  2690333
ANDERSON    2014    12332   0
ANDERSON    2015    608454  2836272
ANDERSON    2015    23339   0
ANDERSON    2016    551728  2682261
ANDERSON    2016    12716   0
ANDERSON    2017    132466  567874
ANDERSON    2017    1709    0
ANDREWS 2010    25701725    1860063
ANDREWS 2010    106351  0
ANDREWS 2011    97772   0
ANDREWS 2011    28818329    1377865
ANDREWS 2012    105062  0
...

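I want to combine the Oil and Gas values of the duplicate entries. For example, I want to add up all the Oil entries for ANDERSON county in 2010 and have that total replace the existing entries in a single row. The code I am using now sums all the values for each county regardless of year, which gives me condensed output like this: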

CountyName  Year    Oil Gas
ANDERSON        3954774 
ANDREWS      206472698  
...

Here is the code I am using:

import csv
with open('Texas.csv', 'r') as Texas: #opening Texas csv file
    TexasReader = csv.reader(Texas)
    counties = {}
    years = {}

    index = 0
    for row in TexasReader:
        if index == 0:  # the first row is the header
            header = row
        else:
            county = row[0]
            year = row[1]
            oil = row[2]
            gas = row[3]

            if county in counties:
                counties[county] += int(oil)
            else:
                counties[county] = int(oil)
        index += 1

    with open('TexasConsolidated.csv', 'w') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=header, delimiter=',', lineterminator='\n')
        writer.writeheader()
        for k, v in counties.items():
            writer.writerow({header[0]: k, header[2]: v})

【Comments】:

    Tags: python csv consolidation


    【Solution 1】:

    These are the lines responsible for the behavior you describe:

    if county in counties: 
        counties[county] += int(oil) 
    

    If you want a dict that stores sums keyed on two values, both values need to be part of the dict key.

    Add the line

    counties_years = {}
    

    Then sum like this, using the tuple (county, year) as the key:

    if (county,year) in counties_years: 
        counties_years[(county,year)] += int(oil) 
    else:
        counties_years[(county,year)] = int(oil)
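
For completeness, here is a minimal sketch of how the tuple-key idea might look applied to the whole script, summing Gas as well as Oil. The helper names `consolidate` and `consolidate_file` are made up for this illustration, and the use of `collections.defaultdict` is a substitution for the explicit `if`/`else` above, not part of the original answer. It assumes every non-empty row has exactly four columns and that Oil and Gas are always integers.

```python
import csv
from collections import defaultdict


def consolidate(rows):
    """Sum Oil and Gas per (county, year) pair.

    `rows` is an iterable of [CountyName, Year, Oil, Gas] lists (no header).
    Returns a dict mapping (county, year) -> [oil_total, gas_total].
    """
    totals = defaultdict(lambda: [0, 0])  # missing keys start at [0, 0]
    for row in rows:
        if not row:  # skip blank lines (assumption: they carry no data)
            continue
        county, year, oil, gas = row
        entry = totals[(county, year)]
        entry[0] += int(oil)
        entry[1] += int(gas)
    return totals


def consolidate_file(src_path, dst_path):
    """Read src_path, write one row per (county, year) to dst_path."""
    with open(src_path, newline='') as src:
        reader = csv.reader(src)
        header = next(reader)  # first row is the header
        totals = consolidate(reader)

    with open(dst_path, 'w', newline='') as dst:
        writer = csv.writer(dst)
        writer.writerow(header)
        # sorted() orders the output by county, then by year
        for (county, year), (oil, gas) in sorted(totals.items()):
            writer.writerow([county, year, oil, gas])
```

With the file names from the question, this would be called as `consolidate_file('Texas.csv', 'TexasConsolidated.csv')`. For the sample data, the two ANDERSON 2010 rows would collapse into a single row with Oil 623104 and Gas 4996766.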
    

    【Discussion】:
