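【Question title】: Combine two lists of dicts, adding the values together
【Posted】: 2018-05-02 14:29:17
【Question】:

I want to combine two lists, each containing several dicts, into one new list of dicts. Dicts that appear in only one list should be appended to the final list, and the 'views' values of dicts that appear in both should be added together.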

a = [{'title': 'Learning How to Program', 'views': 1,'url': '/4XvR', 'slug': 'learning-how-to-program'},
     {'title': 'Mastering Programming', 'views': 3,'url': '/7XqR', 'slug': 'mastering-programming'}]

b = [{'title': 'Learning How to Program', 'views': 7,'url': '/4XvR', 'slug': 'learning-how-to-program'},
     {'title': 'Mastering Programming', 'views': 2,'url': '/7XqR', 'slug': 'mastering-programming'},
     {'title': 'Programming Fundamentals', 'views': 1,'url': '/93hB', 'slug': 'programming-fundamentals'}]

The desired output is:

c = [{'title': 'Learning How to Program', 'views': 8,'url': '/4XvR', 'slug': 'learning-how-to-program'},
     {'title': 'Mastering Programming', 'views': 5,'url': '/7XqR', 'slug': 'mastering-programming'},
     {'title': 'Programming Fundamentals', 'views': 1,'url': '/93hB', 'slug': 'programming-fundamentals'}]

I found "Is there any pythonic way to combine two dicts (adding values for keys that appear in both)?", but I do not understand how to get the desired output in my case, where I have two lists of several dicts each.

【Comments】:

    Tags: python list dictionary


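    【Solution 1】:

    You need to convert the input dicts to (title: count) pairs and use them as keys and values in a Counter. After summing, you can convert the result back to your original format: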

    from collections import Counter
    
    summed = sum((Counter({elem['title']: elem['views']}) for elem in a + b), Counter())
    c = [{'title': title, 'views': counts} for title, counts in summed.items()]
    

    Demo:

    >>> from collections import Counter
    >>> a = [{'title': 'Learning How to Program', 'views': 1},
    ...      {'title': 'Mastering Programming', 'views': 3}]
    >>> b = [{'title': 'Learning How to Program', 'views': 7},
    ...      {'title': 'Mastering Programming', 'views': 2},
    ...      {'title': 'Programming Fundamentals', 'views': 1}]
    >>> summed = sum((Counter({elem['title']: elem['views']}) for elem in a + b), Counter())
    >>> summed
    Counter({'Learning How to Program': 8, 'Mastering Programming': 5, 'Programming Fundamentals': 1})
    >>> [{'title': title, 'views': counts} for title, counts in summed.items()]
    [{'views': 8, 'title': 'Learning How to Program'}, {'views': 5, 'title': 'Mastering Programming'}, {'views': 1, 'title': 'Programming Fundamentals'}]
    

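    The goal here is to have a unique identifier for each count. If your dicts are more complex, you either need to convert the whole dict (minus the count) into a unique identifier, or pick one value from the dict to serve as that identifier. Then sum the view counts per identifier.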
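    As a rough sketch of the first option (not code from the answer itself), every key/value pair except the count can be packed into a tuple and used as the identifier. The names merge_by_identity and count_key are made up for illustration, and the sketch assumes that all keys are strings and all other values are hashable:

```python
def merge_by_identity(*lists, count_key="views"):
    """Sum count_key over dicts that agree on every other key/value pair."""
    totals = {}
    for entries in lists:
        for entry in entries:
            # Identifier: every (key, value) pair except the count, sorted so
            # that the key order inside each dict does not matter.
            ident = tuple(sorted(
                (k, v) for k, v in entry.items() if k != count_key
            ))
            totals[ident] = totals.get(ident, 0) + entry[count_key]
    # Turn each identifier back into a dict and attach the summed count.
    return [{**dict(ident), count_key: total} for ident, total in totals.items()]


a = [{'title': 'Learning How to Program', 'views': 1, 'url': '/4XvR'},
     {'title': 'Mastering Programming', 'views': 3, 'url': '/7XqR'}]
b = [{'title': 'Learning How to Program', 'views': 7, 'url': '/4XvR'},
     {'title': 'Mastering Programming', 'views': 2, 'url': '/7XqR'},
     {'title': 'Programming Fundamentals', 'views': 1, 'url': '/93hB'}]

c = merge_by_identity(a, b)
```

    With this approach two dicts are merged only if they match on every remaining field, so a typo in one title would produce a separate entry. Picking a single field as the identifier avoids that.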

    From your updated example, the URL would make a good identifier. That lets you accumulate the view counts in place:

    per_url = {}
    for entry in a + b:
        key = entry['url']
        if key not in per_url:
            per_url[key] = entry.copy()
        else:
            per_url[key]['views'] += entry['views']
    
    c = list(per_url.values())  # on Python 2, per_url.values() is already a list
    

    This simply uses the dicts themselves (or at least a copy of the first one encountered for each URL) to accumulate the view counts:

    >>> from pprint import pprint
    >>> a = [{'title': 'Learning How to Program', 'views': 1,'url': '/4XvR', 'slug': 'learning-how-to-program'},
    ...      {'title': 'Mastering Programming', 'views': 3,'url': '/7XqR', 'slug': 'mastering-programming'}]
    >>> b = [{'title': 'Learning How to Program', 'views': 7,'url': '/4XvR', 'slug': 'learning-how-to-program'},
    ...      {'title': 'Mastering Programming', 'views': 2,'url': '/7XqR', 'slug': 'mastering-programming'},
    ...      {'title': 'Programming Fundamentals', 'views': 1,'url': '/93hB', 'slug': 'programming-fundamentals'}]
    >>> per_url = {}
    >>> for entry in a + b:
    ...     key = entry['url']
    ...     if key not in per_url:
    ...         per_url[key] = entry.copy()
    ...     else:
    ...         per_url[key]['views'] += entry['views']
    ... 
    >>> per_url
    {'/93hB': {'url': '/93hB', 'title': 'Programming Fundamentals', 'slug': 'programming-fundamentals', 'views': 1}, '/4XvR': {'url': '/4XvR', 'title': 'Learning How to Program', 'slug': 'learning-how-to-program', 'views': 8}, '/7XqR': {'url': '/7XqR', 'title': 'Mastering Programming', 'slug': 'mastering-programming', 'views': 5}}
    >>> pprint(per_url.values())
    [{'slug': 'programming-fundamentals',
      'title': 'Programming Fundamentals',
      'url': '/93hB',
      'views': 1},
     {'slug': 'learning-how-to-program',
      'title': 'Learning How to Program',
      'url': '/4XvR',
      'views': 8},
     {'slug': 'mastering-programming',
      'title': 'Mastering Programming',
      'url': '/7XqR',
      'views': 5}]
    

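    【Comments】:

    • My sincere apologies - I did not include all the data I have in the dicts. Your solution sounds very close, but how do I handle several key/value pairs in a single dict, as my question now shows?
    • @bhux Create a tuple key from the remaining values; a namedtuple class could make that clearer. I'll look at the update when I get home.
    • OK, thanks - I'm a bit lost on how to do what you describe. I'd appreciate an example when you're home :)
    • Wow, apart from the variable names, we came up with the same solution. A good day for Zen :-)
    • Thank you very much for your help with this, Martijn!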
    【Solution 2】:

    First, you need to convert each input into a dict, for example

    b = {'Learning How to Program': 7,
         'Mastering Programming': 2,
         'Programming Fundamentals': 1}
    

    After that, apply the solution you found, then convert the result back to a list of dicts.
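    One way those three steps might look, as a sketch rather than the answerer's own code. The helper name to_counts is hypothetical, and the sample data carries only 'title' and 'views', so any other fields would be lost in the round trip:

```python
from collections import Counter

a = [{'title': 'Learning How to Program', 'views': 1},
     {'title': 'Mastering Programming', 'views': 3}]
b = [{'title': 'Learning How to Program', 'views': 7},
     {'title': 'Mastering Programming', 'views': 2},
     {'title': 'Programming Fundamentals', 'views': 1}]


def to_counts(entries):
    """Collapse a list of dicts into a {title: views} Counter."""
    counts = Counter()
    for entry in entries:
        counts[entry['title']] += entry['views']
    return counts


# Counter addition sums the values of keys that appear in both operands.
summed = to_counts(a) + to_counts(b)

# Convert back to a list of dicts.
c = [{'title': title, 'views': views} for title, views in summed.items()]
```

    Note that adding two Counters drops any key whose total is zero or negative, which does not matter for view counts but would for other kinds of data.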

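    【Comments】:

      【Solution 3】:

      Here is a simple one. Iterate over all entries, copy each entry the first time it is encountered, and add its views on every later encounter: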

      summary = {}    
      for entry in a + b:
          key = entry['url']
          if key not in summary:
              summary[key] = entry.copy()
          else:
              summary[key]['views'] += entry['views']
      c = list(summary.values())
      

      【Comments】:

        【Solution 4】:

        This may not be the most pythonic solution:

        def coalesce(d1, d2):
            # Copy each dict so the input lists are not modified.
            combined = [dict(i) for i in d1]
            for d in d2:
                found = False
                for itr in combined:
                    if itr['title'] == d['title']:
                        itr['views'] += d['views']
                        found = True
                        break
                if not found:
                    combined.append(dict(d))
            return combined
        

        【Comments】:

          【Solution 5】:

          Not optimal, but it works:

          >>> from collections import Counter
          >>> from pprint import pprint
          >>> a = [{'title': 'Learning How to Program', 'views': 1,'url': '/4XvR', 'slug': 'learning-how-to-program'},
          ...      {'title': 'Mastering Programming', 'views': 3,'url': '/7XqR', 'slug': 'mastering-programming'}]
          >>> b = [{'title': 'Learning How to Program', 'views': 7,'url': '/4XvR', 'slug': 'learning-how-to-program'},
          ...      {'title': 'Mastering Programming', 'views': 2,'url': '/7XqR', 'slug': 'mastering-programming'},
          ...      {'title': 'Programming Fundamentals', 'views': 1,'url': '/93hB', 'slug': 'programming-fundamentals'}]
          >>> summed = sum((Counter({x['slug']: x['views']}) for x in a+b), Counter())
          >>> c = dict()
          >>> _ = [c.update({x['slug']: x}) for x in a + b]
          >>> _ = [c[x].update({'views': summed[x]}) for x in c.keys()]
          >>> pprint(c.values())
          [{'slug': 'mastering-programming',
            'title': 'Mastering Programming',
            'url': '/7XqR',
            'views': 5},
           {'slug': 'programming-fundamentals',
            'title': 'Programming Fundamentals',
            'url': '/93hB',
            'views': 1},
           {'slug': 'learning-how-to-program',
            'title': 'Learning How to Program',
            'url': '/4XvR',
            'views': 8}]
          

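          This builds on Martijn's Counter idea. It needs a few more passes to write the Counter totals back into dicts that carry the other attributes, and it assumes those attributes do not change between entries.

          Note that some of the loops are "cryptic" comprehensions that are run only for their side effects...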
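          For readers who find the comprehensions hard to follow, the same passes can be spelled out as plain loops. This is a sketch of the idea under the same assumption (attributes other than 'views' are identical for a given slug); unlike the session above it copies each dict, so a and b are left unchanged. The name records is made up for illustration:

```python
from collections import Counter

a = [{'title': 'Learning How to Program', 'views': 1, 'url': '/4XvR', 'slug': 'learning-how-to-program'},
     {'title': 'Mastering Programming', 'views': 3, 'url': '/7XqR', 'slug': 'mastering-programming'}]
b = [{'title': 'Learning How to Program', 'views': 7, 'url': '/4XvR', 'slug': 'learning-how-to-program'},
     {'title': 'Mastering Programming', 'views': 2, 'url': '/7XqR', 'slug': 'mastering-programming'},
     {'title': 'Programming Fundamentals', 'views': 1, 'url': '/93hB', 'slug': 'programming-fundamentals'}]

# Pass 1: total views per slug.
summed = Counter()
for entry in a + b:
    summed[entry['slug']] += entry['views']

# Pass 2: keep one copy of the record for each slug (the last one seen wins).
records = {}
for entry in a + b:
    records[entry['slug']] = dict(entry)

# Pass 3: overwrite each record's views with the summed total.
for slug, record in records.items():
    record['views'] = summed[slug]

c = list(records.values())
```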

          【Comments】:

            【Solution 6】:

            A simple function that does what is needed for any number of lists:

            import itertools
            from collections import Counter, OrderedDict
            
            def sum_views(*lists):
                views = Counter()
                docs = OrderedDict()  # to preserve input order
                for doc in itertools.chain(*lists):
                    slug = doc['slug']
                    views[slug] += doc['views']
                    docs[slug] = dict(doc)   # shallow copy of original dict
                    docs[slug]['views'] = views[slug]
                return list(docs.values())  # a real list on both Python 2 and 3
            

            【Comments】:

              【Solution 7】:

              Suppose you do not want to hard-code the key names 'title' and 'views'. A more general way to write it:

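              def combing(x):
                  result = {}
                  for i in x:
                      # Relies on each dict listing its name first and its count second
                      # (dicts preserve insertion order on Python 3.7+).
                      h = list(i.values())
                      result[h[0]] = result.get(h[0], 0) + h[1]
                  return result

              combing([{'item': 'item1', 'amount': 400}, {'item': 'item2', 'amount': 300},
                       {'item': 'item1', 'amount': 750}])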
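
              Depending on the position of the values inside each dict is fragile. A variant that stays generic but takes the two key names as arguments might look like the sketch below; the function name combine and the parameters key_field and value_field are illustrative, not part of the answer above:

```python
def combine(records, key_field, value_field):
    """Sum value_field per distinct key_field across a list of dicts."""
    totals = {}
    for record in records:
        key = record[key_field]
        totals[key] = totals.get(key, 0) + record[value_field]
    return totals


items = [{'item': 'item1', 'amount': 400},
         {'item': 'item2', 'amount': 300},
         {'item': 'item1', 'amount': 750}]

totals = combine(items, 'item', 'amount')
# totals == {'item1': 1150, 'item2': 300}
```

              The same call works for the question's data with combine(a + b, 'title', 'views').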
              

              【Comments】:
