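【Question title】: Counting the repetition of values in a python dictionary
【Posted】: 2013-03-14 06:15:11
【Question】:

I have a list of dictionaries in the following format. Several zone types appear in it, each more than once. I want to generate another list from it in which every dictionary has an additional key, 'count', holding the number of times that entry's zone (e.g. "Full Run", "Half Run" or "Semi Run") occurs in the list.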

[
{'zip_zone': u'Full Run', 'zipcode': u'14042', 'longitude': -78.516154}, 
{'zip_zone': u'Full Run', 'zipcode': u'14101', 'longitude': -78.51734}, 
{'zip_zone': u'Full Run', 'zipcode': u'14706', 'longitude': -78.493761}, 
{'zip_zone': u'Half Run', 'zipcode': u'14709', 'longitude': -78.024817}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14711', 'longitude': -78.119974}, 
{'zip_zone': u'Full Run', 'zipcode': u'14714', 'longitude': -78.256921}, 
{'zip_zone': u'Half Run', 'zipcode': u'14715', 'longitude': -78.157392}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14717', 'longitude': -78.210567}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14719', 'longitude': -78.86951}, 
{'zip_zone': u'Half Run', 'zipcode': u'14727', 'longitude': -78.268103}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14731', 'longitude': -78.658909}, 
{'zip_zone': u'Half Run', 'zipcode': u'14735', 'longitude': -78.087607}, 
{'zip_zone': None, 'zipcode': u'14737', 'longitude': -78.431625}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14739', 'longitude': -78.139046}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14741', 'longitude': -78.5907}, 
{'zip_zone': u'Special Run', 'zipcode': u'14743', 'longitude': -78.4098}, 
{'zip_zone': u'Special Run', 'zipcode': u'14744', 'longitude': -78.167853}, 
{'zip_zone': u'Half Run', 'zipcode': u'14748', 'longitude': -78.639987}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14753', 'longitude': -78.640416}, 
{'zip_zone': u'Special Run', 'zipcode': u'14754', 'longitude': -78.18395}, 
{'zip_zone': u'Special Run', 'zipcode': u'14755', 'longitude': -78.800866}, 
{'zip_zone': u'Half Run', 'zipcode': u'14760', 'longitude': -78.426015}, 
]

The output should look like this:

[
{'zip_zone': u'Full Run', 'zipcode': u'14042', 'longitude': -78.516154, 'count': 4}, 
{'zip_zone': u'Full Run', 'zipcode': u'14101', 'longitude': -78.51734, 'count': 4}, 
{'zip_zone': u'Full Run', 'zipcode': u'14706', 'longitude': -78.493761, 'count': 4}, 
{'zip_zone': u'Half Run', 'zipcode': u'14709', 'longitude': -78.024817, 'count': 6}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14711', 'longitude': -78.119974, 'count': 7}, 
{'zip_zone': u'Full Run', 'zipcode': u'14714', 'longitude': -78.256921, 'count': 4}, 
{'zip_zone': u'Half Run', 'zipcode': u'14715', 'longitude': -78.157392, 'count': 6}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14717', 'longitude': -78.210567, 'count': 7}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14719', 'longitude': -78.86951, 'count': 7}, 
{'zip_zone': u'Half Run', 'zipcode': u'14727', 'longitude': -78.268103, 'count': 6}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14731', 'longitude': -78.658909, 'count': 7}, 
{'zip_zone': u'Half Run', 'zipcode': u'14735', 'longitude': -78.087607, 'count': 6}, 
{'zip_zone': None, 'zipcode': u'14737', 'longitude': -78.431625, 'count': 0}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14739', 'longitude': -78.139046, 'count': 7}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14741', 'longitude': -78.5907, 'count': 7}, 
{'zip_zone': u'Special Run', 'zipcode': u'14743', 'longitude': -78.4098, 'count': 4}, 
{'zip_zone': u'Special Run', 'zipcode': u'14744', 'longitude': -78.167853, 'count': 4}, 
{'zip_zone': u'Half Run', 'zipcode': u'14748', 'longitude': -78.639987, 'count': 6}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14753', 'longitude': -78.640416, 'count': 7}, 
{'zip_zone': u'Special Run', 'zipcode': u'14754', 'longitude': -78.18395, 'count': 4}, 
{'zip_zone': u'Special Run', 'zipcode': u'14755', 'longitude': -78.800866, 'count': 4}, 
{'zip_zone': u'Half Run', 'zipcode': u'14760', 'longitude': -78.426015, 'count': 6}, 
]

【Comments】:

  • If the count field is supposed to be incremented for Full Run, Semi Run or Half Run, are you sure zip_zone should contain "Full Run"?

Tags: python django dictionary


【Solution 1】:

This is a good use case for the Counter class in Python's collections module.

import collections

# u is your input list of dictionaries, entries in u will be modified in place

c = collections.Counter(e["zip_zone"] for e in u)
for e in u:
    e["count"] = c[e["zip_zone"]]
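One detail worth noting: with the snippet above, the entry whose zip_zone is None gets a count of 1, while the expected output in the question shows 0 for it. Below is a minimal sketch of one way to match that, assuming a missing zone should simply be left out of the tally. The function name with_zone_counts and the shortened sample list are made up for illustration; it also returns new dictionaries instead of modifying the input.

```python
import collections


def with_zone_counts(records, key="zip_zone"):
    """Return copies of `records`, each with a 'count' of its zone's occurrences."""
    # Tally only real zones; entries whose zone is None are skipped here.
    counts = collections.Counter(
        r[key] for r in records if r[key] is not None
    )
    # A Counter returns 0 for keys it has never seen, so None maps to 0.
    return [dict(r, count=counts[r[key]]) for r in records]


# Shortened, hypothetical sample in the same shape as the question's data.
sample = [
    {"zip_zone": "Full Run", "zipcode": "14042"},
    {"zip_zone": "Full Run", "zipcode": "14101"},
    {"zip_zone": "Half Run", "zipcode": "14709"},
    {"zip_zone": None, "zipcode": "14737"},
]
result = with_zone_counts(sample)
print([r["count"] for r in result])  # [2, 2, 1, 0]
```

Because dict(r, count=...) builds a copy, the original list stays untouched, which may be closer to the "generate another list" wording in the question.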

【Discussion】:

    【Solution 2】:

    I'm not quite sure I understand your question, but the following code may do what you describe:

    input = [
    {'zip_zone': u'Full Run', 'zipcode': u'14042', 'longitude': -78.516154},
    {'zip_zone': u'Full Run', 'zipcode': u'14101', 'longitude': -78.51734},
    {'zip_zone': u'Full Run', 'zipcode': u'14706', 'longitude': -78.493761},
    {'zip_zone': u'Half Run', 'zipcode': u'14709', 'longitude': -78.024817},
    {'zip_zone': u'Semi Run', 'zipcode': u'14711', 'longitude': -78.119974},
    {'zip_zone': u'Full Run', 'zipcode': u'14714', 'longitude': -78.256921},
    {'zip_zone': u'Half Run', 'zipcode': u'14715', 'longitude': -78.157392},
    {'zip_zone': u'Semi Run', 'zipcode': u'14717', 'longitude': -78.210567},
    {'zip_zone': u'Semi Run', 'zipcode': u'14719', 'longitude': -78.86951},
    {'zip_zone': u'Half Run', 'zipcode': u'14727', 'longitude': -78.268103},
    {'zip_zone': u'Semi Run', 'zipcode': u'14731', 'longitude': -78.658909},
    {'zip_zone': u'Half Run', 'zipcode': u'14735', 'longitude': -78.087607},
    {'zip_zone': None, 'zipcode': u'14737', 'longitude': -78.431625},
    {'zip_zone': u'Semi Run', 'zipcode': u'14739', 'longitude': -78.139046},
    {'zip_zone': u'Semi Run', 'zipcode': u'14741', 'longitude': -78.5907},
    {'zip_zone': u'Special Run', 'zipcode': u'14743', 'longitude': -78.4098},
    {'zip_zone': u'Special Run', 'zipcode': u'14744', 'longitude': -78.167853},
    {'zip_zone': u'Half Run', 'zipcode': u'14748', 'longitude': -78.639987},
    {'zip_zone': u'Semi Run', 'zipcode': u'14753', 'longitude': -78.640416},
    {'zip_zone': u'Special Run', 'zipcode': u'14754', 'longitude': -78.18395},
    {'zip_zone': u'Special Run', 'zipcode': u'14755', 'longitude': -78.800866},
    {'zip_zone': u'Half Run', 'zipcode': u'14760', 'longitude': -78.426015},
    ]
    # Tally how many times each zip_zone value occurs.
    zipZoneCnt = {}
    for item in input:
        zipZoneCnt[item['zip_zone']] = zipZoneCnt.get(item['zip_zone'], 0) + 1
    # The expected output gives entries without a zone a count of 0.
    zipZoneCnt[None] = 0
    # Attach the tally to every entry (the dictionaries are modified in place).
    for item in input:
        item['count'] = zipZoneCnt[item['zip_zone']]
    print(zipZoneCnt)
    for item in input:
        print(item)
    

    【Discussion】:

      【Solution 3】:

      collections.Counter to the rescue.

      from collections import Counter
      a = [
       {'zip_zone': u'Full Run', 'zipcode': u'14042', 'longitude': -78.516154}, 
       {'zip_zone': u'Full Run', 'zipcode': u'14101', 'longitude': -78.51734}, 
       {'zip_zone': u'Full Run', 'zipcode': u'14706', 'longitude': -78.493761}, 
       {'zip_zone': u'Half Run', 'zipcode': u'14709', 'longitude': -78.024817}, 
       {'zip_zone': u'Semi Run', 'zipcode': u'14711', 'longitude': -78.119974},
      ]
      
      # to obtain the counts:
      c = Counter(x['zip_zone'] for x in a)
      # c is now Counter({u'Full Run': 3, u'Semi Run': 1, u'Half Run': 1})
      
      # to update original structure in place:
      for x in a:
           x['count'] = c[x['zip_zone']]
      
      # a now contains:
      
      [{'count': 3,
        'longitude': -78.516154,
        'zip_zone': u'Full Run',
        'zipcode': u'14042'},
       {'count': 3,
        'longitude': -78.51734,
        'zip_zone': u'Full Run',
        'zipcode': u'14101'},
       {'count': 3,
        'longitude': -78.493761,
        'zip_zone': u'Full Run',
        'zipcode': u'14706'},
       {'count': 1,
        'longitude': -78.024817,
        'zip_zone': u'Half Run',
        'zipcode': u'14709'},
       {'count': 1,
        'longitude': -78.119974,
        'zip_zone': u'Semi Run',
        'zipcode': u'14711'}]
      

      【Discussion】:

        【Solution 4】:

        It may not be very pretty, but you can try using defaultdict:

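        from collections import defaultdict
        
        # origData is the input list of dictionaries from the question
        output = defaultdict(list)
        
        # group the entries by zone
        for line in origData:
            output[line['zip_zone']].append(line)
        
        # the size of each group is that zone's count
        for line in origData:
            line['count'] = len(output[line['zip_zone']])
        
        print(origData)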
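If only the totals are needed, the grouped lists do not have to be kept. A minimal sketch of the same idea with defaultdict(int), which stores one integer per zone instead of a list of entries; the function name add_counts and the short sample list are hypothetical. Note that here, as in the code above, a zone of None is counted like any other value.

```python
from collections import defaultdict


def add_counts(records):
    """Add a 'count' key to each record in place and return the tallies."""
    tallies = defaultdict(int)  # missing keys start at 0
    # First pass: count how often each zone occurs.
    for record in records:
        tallies[record["zip_zone"]] += 1
    # Second pass: write the total for its zone into every record.
    for record in records:
        record["count"] = tallies[record["zip_zone"]]
    return dict(tallies)


# Shortened, hypothetical sample in the same shape as the question's data.
sample = [
    {"zip_zone": "Semi Run", "zipcode": "14711"},
    {"zip_zone": "Half Run", "zipcode": "14715"},
    {"zip_zone": "Semi Run", "zipcode": "14717"},
]
tallies = add_counts(sample)
print(tallies)  # {'Semi Run': 2, 'Half Run': 1}
```

Two passes are needed either way, since the final total for a zone is only known once the whole list has been seen.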
        

        【Discussion】:

        • Hey Artsiom, what does appending the data mean here (data)?