PYTHON - group a list of dict

【Question title】: PYTHON - group a list of dict
【Posted】: 2021-05-02 14:13:33
【Question】:

Is there a simple way in Python 3 to group a list of dicts by a key? I have a complex input list that I need to reformat.

My input object looks like this:

my_input = [
  {
    'name': 'nameA',
    'departments': [
      {
        'name': 'dep1',
        'details': [
          {
            'name': 'name_detA',
            'tech_name': 'techNameA',
            'others': None,
            'sub_details': []
          },
          {
            'name': 'name_detB',
            'tech_name': 'techNameB',
            'others': 22,
            'sub_details': [
             {
                'id': 'idB',
                'column2': 'ZZ',
                'column3': 'CCC',
                'column4': {
                  'id': 'id2',
                  'subColumn1': 'HHH',
                  'subColumn1': 'PPPP',
                  'subColumn1': 'FFFFFF'
                }
              }
            ]
          },
          
          
          {
            'name': 'name_detB',
            'tech_name': 'techNameB',
            'others': 22,
            'sub_details': [
              {
                'id': 'idA',
                'column2': 'AA',
                'column3': 'BBB',
                'column4': {
                  'id': 'id1',
                  'subColumn1': 'XXXX',
                  'subColumn1': 'YYYYY',
                  'subColumn1': 'DDDDDD'
                }
              }
            ]
          }
          
        ]
      }
    ]
  }
]

My goal is to combine the entries in details that share the same 'tech_name' into a single entry and merge their sub_details lists.

Expected output:

my_output = [
  {
    "name": "nameA",
    "departments": [
      {
        "name": "dep1",
        "details": [
          {
            "name": "name_detA",
            "tech_name": "techNameA",
            "others": None,
            "sub_details": []
          },
          {
            "name": "name_detB",
            "tech_name": "techNameB",
            "others": 22,
            "sub_details": [
             {
                "id": "idB",
                "column2": "ZZ",
                "column3": "CCC",
                "column4": {
                  "id": "id2",
                  "subColumn1": "HHH",
                  "subColumn1": "PPPP",
                  "subColumn1": "FFFFFF"
                }
              },
              {
                "id": "idA",
                "column2": "AA",
                "column3": "BBB",
                "column4": {
                  "id": "id1",
                  "subColumn1": "XXXX",
                  "subColumn1": "YYYYY",
                  "subColumn1": "DDDDDD"
                }
              }
            ]
          }
        ]
      }
    ]
  }
]

I tried:

import itertools
import operator

result_list = []
sub = []
for elem in my_input:
    for data in elem["departments"]:
        for sub_detail, dicts_for_that_sub in itertools.groupby(data["details"], key=operator.itemgetter("sub_details")):
            sub.append({"sub_details": sub_detail})
        print(sub)

But I am struggling to build the new output.

【Comments】:

In your sample data, under sub_details and column4, the key is always subColumn1. Is that what you intended?
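To illustrate why the repeated key matters, here is a small sketch using the column4 values from the question. Python accepts a dict literal that repeats a key, but each later value overwrites the earlier one, so only the last subColumn1 survives.

```python
# A dict literal may repeat a key; Python keeps only the last value for it.
column4 = {
    'id': 'id2',
    'subColumn1': 'HHH',
    'subColumn1': 'PPPP',
    'subColumn1': 'FFFFFF',
}

print(column4)  # {'id': 'id2', 'subColumn1': 'FFFFFF'}
```

That is presumably why the answer uses subColumn1, subColumn2 and subColumn3 in its version of the input.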

Tags: python python-3.x list group-by

【Solution 1】:

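Assuming the input I use here is what you actually want, you are on the right track. I reimplemented the innermost for loop as a call to a function, but that is not strictly necessary.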

I would probably write merge_details() a little differently, using setdefault() rather than if/else, but the if/else version is easier to follow if you have not used setdefault() before.
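For readers curious about the setdefault() variant mentioned here, a minimal sketch might look like the following. The name merge_details_setdefault and the shortened sample data are made up for illustration. This sketch also assumes it is acceptable to store a shallow copy of each first-seen detail, so the input dicts are left unmodified.

```python
def merge_details_setdefault(details):
    # Hypothetical variant of merge_details(); the name is illustrative only.
    keyed_details = {}
    for detail in details:
        # On first sight of a tech_name, store a shallow copy with an empty
        # sub_details list; on later sights, setdefault() returns the stored entry.
        entry = keyed_details.setdefault(
            detail["tech_name"], {**detail, "sub_details": []}
        )
        entry["sub_details"].extend(detail["sub_details"])
    return list(keyed_details.values())


# Shortened sample data (assumption: only the fields needed for the demo).
sample_details = [
    {"name": "name_detB", "tech_name": "techNameB", "others": 22,
     "sub_details": [{"id": "idB"}]},
    {"name": "name_detA", "tech_name": "techNameA", "others": None,
     "sub_details": []},
    {"name": "name_detB", "tech_name": "techNameB", "others": 22,
     "sub_details": [{"id": "idA"}]},
]

merged_details = merge_details_setdefault(sample_details)
print([(d["tech_name"], [s["id"] for s in d["sub_details"]]) for d in merged_details])
# [('techNameB', ['idB', 'idA']), ('techNameA', [])]
```

Because dicts preserve insertion order in Python 3.7 and later, the merged entries come back in order of first appearance.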

The import json in the code below is only there to make the printed output look "nice"; it is not part of the solution.

import json

my_input = [
  {
    "name": "nameA",
    "departments": [
      {
        "name": "dep1",
        "details": [
          {
            "name": "name_detB",
            "tech_name": "techNameB",
            "others": 22,
            "sub_details": [
             {
                "id": "idB",
                "column2": "ZZ",
                "column3": "CCC",
                "column4": {
                  "id": "id2",
                  "subColumn1": "HHH",
                  "subColumn2": "PPPP",
                  "subColumn3": "FFFFFF"
                }
              }
            ]
          },
          {
            "name": "name_detA",
            "tech_name": "techNameA",
            "others": None,
            "sub_details": []
          },
          {
            "name": "name_detB",
            "tech_name": "techNameB",
            "others": 22,
            "sub_details": [
              {
                "id": "idA",
                "column2": "AA",
                "column3": "BBB",
                "column4": {
                  "id": "id1",
                  "subColumn1": "XXXX",
                  "subColumn2": "YYYYY",
                  "subColumn3": "DDDDDD"
                }
              }
            ]
          }
        ]
      }
    ]
  }
]

def merge_details(details):
    # Details keyed by tech_name, in order of first appearance.
    keyed_details = {}

    # For each detail: if its tech_name has already been seen, merge its
    # sub_details into the stored entry; otherwise store it under that key.
    for detail in details:
        key = detail["tech_name"]
        if key in keyed_details:
            keyed_details[key]["sub_details"].extend(detail["sub_details"])
        else:
            keyed_details[key] = detail

    return list(keyed_details.values())

for elem in my_input:
    for department in elem["departments"]:
        department["details"] = merge_details(department["details"])

print(json.dumps(my_input, indent=4, sort_keys=True))
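The attempt in the question used itertools.groupby(). As a rough sketch, that route can also work, provided the details are sorted by tech_name first, because groupby() only groups adjacent items. The function name merge_details_groupby and the shortened sample data are assumptions for illustration. Unlike the version above, the result is ordered by tech_name rather than by first appearance.

```python
from itertools import groupby
from operator import itemgetter


def merge_details_groupby(details):
    # Hypothetical alternative; the name is illustrative only.
    by_tech_name = itemgetter("tech_name")
    merged = []
    # groupby() only groups consecutive items, so sort by the same key first.
    # sorted() is stable, so items with equal tech_name keep their input order.
    for _, group in groupby(sorted(details, key=by_tech_name), key=by_tech_name):
        group = list(group)
        sub_details = [sub for detail in group for sub in detail["sub_details"]]
        # Take the other fields from the first detail in each group.
        merged.append({**group[0], "sub_details": sub_details})
    return merged


# Shortened sample data (assumption: only the fields needed for the demo).
demo_details = [
    {"name": "name_detB", "tech_name": "techNameB", "others": 22,
     "sub_details": [{"id": "idB"}]},
    {"name": "name_detA", "tech_name": "techNameA", "others": None,
     "sub_details": []},
    {"name": "name_detB", "tech_name": "techNameB", "others": 22,
     "sub_details": [{"id": "idA"}]},
]

grouped = merge_details_groupby(demo_details)
print([(d["tech_name"], [s["id"] for s in d["sub_details"]]) for d in grouped])
# [('techNameA', []), ('techNameB', ['idB', 'idA'])]
```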

【Comments】:

This is exactly what I was trying to achieve! Thanks a lot.