【Question title】: Merge dictionaries if the object key values are the same in Python
【Posted】: 2020-05-03 14:46:18
【Question description】:

I need help merging some dictionaries based on a value inside a list of dictionaries. Is this possible?

My data:

mongo_data = [{
 'url': 'https://goodreads.com/',
 'variables': [{'key': 'Harry Potter', 'value': '10.0'},
               {'key': 'Discovery of Witches', 'value': '8.5'},],
 'vendor': 'Fantasy' 
 },{
 'url': 'https://goodreads.com/',
 'variables': [{'key': 'Hunger Games', 'value': '10.0'},
               {'key': 'Maze Runner', 'value': '5.5'},],
 'vendor': 'Dystopia' 
 },{
 'url': 'https://kindle.com/',
 'variables': [{'key': 'Twilight', 'value': '5.9'},
               {'key': 'Lord of the Rings', 'value': '9.0'},],
 'vendor': 'Fantasy' 
 },{
 'url': 'https://kindle.com/',
 'variables': [{'key': 'The Handmaids Tale', 'value': '10.0'},
               {'key': 'Divergent', 'value': '9.0'},],
 'vendor': 'Fantasy' 
 }]

My code:

I use groupby to group together the items that share the same URL.

from itertools import groupby, chain
import json

searches = []
for key, group in groupby(mongo_data, key=lambda chunk: chunk['url']):
    search = {}
    search["url"] = key
    search["results"] = [{"genre": result["vendor"], "data": result["variables"]} for result in group]
    searches.append(search)

print(json.dumps(searches))

My output:

[
  {
    "url": "https://goodreads.com/",
    "results": [
      {
        "genre": "Fantasy",
        "data": [
          {
            "key": "Harry Potter",
            "value": "10.0"
          },
          {
            "key": "Discovery of Witches",
            "value": "8.5"
          }
        ]
      },
      {
        "genre": "Dystopia",
        "data": [
          {
            "key": "Hunger Games",
            "value": "10.0"
          },
          {
            "key": "Maze Runner",
            "value": "5.5"
          }
        ]
      }
    ]
  },
  {
    "url": "https://kindle.com/",
    "results": [
      {
        "genre": "Fantasy",
        "data": [
          {
            "key": "Twilight",
            "value": "5.9"
          },
          {
            "key": "Lord of the Rings",
            "value": "9.0"
          }
        ]
      },
      {
        "genre": "Fantasy",
        "data": [
          {
            "key": "The Handmaids Tale",
            "value": "10.0"
          },
          {
            "key": "Divergent",
            "value": "9.0"
          }
        ]
      }
    ]
  }
]

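As you can see under https://kindle.com/, "genre": "Fantasy" appears twice. Instead of printing it twice, can I merge the two entries so that the genre is not duplicated?

So my expected result is: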

{
    "url": "https://kindle.com/",
    "results": [
      {
        "genre": "Fantasy",
        "data": [
          {
            "key": "Twilight",
            "value": "5.9"
          },
          {
            "key": "Lord of the Rings",
            "value": "9.0"
          },
          {
            "key": "The Handmaids Tale",
            "value": "10.0"
          },
          {
            "key": "Divergent",
            "value": "9.0"
          }
        ]
      }
    ]
  }
]

Is this possible?

【Question discussion】:

    Tags: python python-3.x dictionary object arraylist


    【Solution 1】:

    You need a second groupby to group the results by vendor within each URL.

    For example:

    from itertools import groupby

    searches = []
    # groupby only merges adjacent items, so mongo_data must already be ordered by url and vendor
    for key, group in groupby(mongo_data, key=lambda chunk: chunk['url']):
        search = {"url": key, "results": []}
        for vendor, group2 in groupby(group, key=lambda chunk2: chunk2['vendor']):
            result = {
                "genre": vendor,
                # flatten the "variables" lists of every record in this vendor group
                "data": [variable
                         for result2 in group2
                         for variable in result2["variables"]],
            }
            search["results"].append(result)
        searches.append(search)

    The list comprehension flattens result2["variables"] so that "data" is a single list rather than a list of lists.

    The result is:

    [
     {
      "url": "https://goodreads.com/",
      "results": [
       {
        "genre": "Fantasy",
        "data": [
         {
          "key": "Harry Potter",
          "value": "10.0"
         },
         {
          "key": "Discovery of Witches",
          "value": "8.5"
         }
        ]
       },
       {
        "genre": "Dystopia",
        "data": [
         {
          "key": "Hunger Games",
          "value": "10.0"
         },
         {
          "key": "Maze Runner",
          "value": "5.5"
         }
        ]
       }
      ]
     },
     {
      "url": "https://kindle.com/",
      "results": [
       {
        "genre": "Fantasy",
        "data": [
         {
          "key": "Twilight",
          "value": "5.9"
         },
         {
          "key": "Lord of the Rings",
          "value": "9.0"
         },
         {
          "key": "The Handmaids Tale",
          "value": "10.0"
         },
         {
          "key": "Divergent",
          "value": "9.0"
         }
        ]
       }
      ]
     }
    ]
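One caveat with the groupby approach: itertools.groupby only merges items that sit next to each other, so two Fantasy records for the same URL that are separated by another genre would still produce two entries. A minimal sketch of a sort-first variant follows; the function name group_by_url_and_genre and the sample records are made up for illustration, and sorting means the output comes back ordered by URL and genre rather than in the original order.

```python
from itertools import groupby
from operator import itemgetter


def group_by_url_and_genre(records):
    """Group records by 'url', then by 'vendor', merging their 'variables' lists."""
    # groupby only merges adjacent items, so sort by both keys first.
    ordered = sorted(records, key=itemgetter("url", "vendor"))
    searches = []
    for url, url_group in groupby(ordered, key=itemgetter("url")):
        results = []
        for vendor, vendor_group in groupby(url_group, key=itemgetter("vendor")):
            # Flatten every record's 'variables' list into one list.
            data = [variable
                    for record in vendor_group
                    for variable in record["variables"]]
            results.append({"genre": vendor, "data": data})
        searches.append({"url": url, "results": results})
    return searches


# Hypothetical sample: the two Fantasy records are NOT adjacent.
sample = [
    {"url": "https://kindle.com/", "vendor": "Fantasy",
     "variables": [{"key": "Twilight", "value": "5.9"}]},
    {"url": "https://kindle.com/", "vendor": "Dystopia",
     "variables": [{"key": "Divergent", "value": "9.0"}]},
    {"url": "https://kindle.com/", "vendor": "Fantasy",
     "variables": [{"key": "Lord of the Rings", "value": "9.0"}]},
]
merged = group_by_url_and_genre(sample)
```

Because sorted is stable, entries within one genre keep their original relative order even though the genres themselves are reordered.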
    

    【Discussion】:

    • Thanks. Is there any way to take out the duplicates?
    【Solution 2】:

    If you want a "one-liner" (?), try this:

    {"url": "https://kindle.com/", "results": [{"genre": k,"data": [v]} for k, v in {g:[y for x in [x['variables'] for x in mongo_data if x['vendor'] == g] for y in x] for g in set(x['vendor'] for x in mongo_data)}.items()]}
    

    It produces

    {
        'url': 'https://kindle.com/',
        'results': [
            {
                'genre': 'Fantasy',
                'data': [
                    [
                        {'key': 'Harry Potter', 'value': '10.0'},
                        {'key': 'Discovery of Witches', 'value': '8.5'},
                        {'key': 'Twilight', 'value': '5.9'},
                        {'key': 'Lord of the Rings', 'value': '9.0'},
                        {'key': 'The Handmaids Tale', 'value': '10.0'},
                        {'key': 'Divergent', 'value': '9.0'}
                    ]
                ]
            },
    
            {
                'genre': 'Dystopia',
                'data': [
                    [
                        {'key': 'Hunger Games', 'value': '10.0'},
                        {'key': 'Maze Runner', 'value': '5.5'}
                    ]
                ]
            }
        ]
    }
    

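    【Discussion】:

    • I still need to group them by url first and then by genre, not by genre alone. So under kindle I want the Fantasy genre merged.
    • Sorry, will fix :)
    • I have multiple collections like this. I don't want to hard-code values like kindle.com.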
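The request in the comments above (group by URL first, then by genre, with no hard-coded URL) could be sketched with plain dictionaries instead of a one-liner. This is only one possible approach, not the answerer's fix; merge_by_url_and_genre and the sample records are hypothetical names and data in the same shape as mongo_data.

```python
def merge_by_url_and_genre(records):
    """Merge records in one pass; dicts keep insertion order (Python 3.7+)."""
    grouped = {}  # url -> {genre -> merged list of variables}
    for record in records:
        genres = grouped.setdefault(record["url"], {})
        # extend a fresh list so the input records are left untouched
        genres.setdefault(record["vendor"], []).extend(record["variables"])
    return [
        {
            "url": url,
            "results": [{"genre": genre, "data": data}
                        for genre, data in genres.items()],
        }
        for url, genres in grouped.items()
    ]


# Hypothetical sample records, deliberately unsorted.
sample = [
    {"url": "https://kindle.com/", "vendor": "Fantasy",
     "variables": [{"key": "Twilight", "value": "5.9"}]},
    {"url": "https://goodreads.com/", "vendor": "Dystopia",
     "variables": [{"key": "Hunger Games", "value": "10.0"}]},
    {"url": "https://kindle.com/", "vendor": "Fantasy",
     "variables": [{"key": "Divergent", "value": "9.0"}]},
]
searches = merge_by_url_and_genre(sample)
```

Unlike groupby, this does not need the input to be sorted, and URLs and genres appear in the order they are first seen.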
    【Solution 3】:

    You can run this code after your for loop to do what you described:

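    from collections import defaultdict

    for item in searches:
        results = item['results']
        _res = defaultdict(list)
        for r in results:
            # extend (not append) so the merged data stays one flat list
            _res[r['genre']].extend(r['data'])

        # replace the original results with one merged entry per genre
        item['results'] = [{
            'genre': k,
            'data': _res[k]
        } for k in _res.keys()]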
    
    
