【Question title】: Recursively generate subset of list in python
【Posted】: 2019-04-22 11:33:19
【Question】:

I have a JSON file similar to the following:

[
  {
     "category1":"0120391123123"
  },
  [
     {
        "subcategory":"0120391123123"
     },
     [
        {
           "subsubcategory":"019301948109"
        },
        [
           {
              "subsubsubcategory":"013904123908"
           },
           [
              {
                 "subsubsubsubcategory":"019341823908"
              }
           ]
        ]
     ]
  ],
  [
     {
        "subcategory2":"0934810923801"
     },
     [
        {
           "subsubcategory2":"09341829308123"
        }
     ]
  ],
  [
     {
        "category2":"1309183912309"
     },
     [
        {
           "subcategory":"10293182094"
        }
     ]
  ]
]
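
Loading such a file into the `data` list used further down might look like the minimal sketch below. The helper name `load_categories` and the file name `categories.json` are assumptions for illustration, not part of the original post, and the demo writes a small stand-in file so the snippet runs on its own.

```python
import json
import os
import tempfile


def load_categories(path):
    """Parse the JSON file at `path` into nested Python lists and dicts."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Stand-in contents (hypothetical); the real file would hold the full structure.
sample = '[{"category1": "0120391123123"}, [{"subcategory": "0120391123123"}]]'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "categories.json")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)
    data = load_categories(path)

print(data)
```

Because the IDs are JSON strings, they stay strings in Python, so leading zeros are preserved.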

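I also have a list of categories that I want to find in the original list. If a category exists in categoriesToFind, I also want to find all of its subcategories and return them.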

categoriesToFind = ['019301948109', '1309183912309']

finalCategories = []

def findCategories(currentList, isFirstIteration):
    for x in currentList:
        if type(x) is dict and (next(iter(x.values())) in categoriesToFind or not isFirstIteration):
            finalCategories.append(next(iter(x.values())))
            if len(currentList) < currentList.index(x) + 1:
                findCategories(currentList[currentList.index(x) + 1], False)

findCategories(data, True)

I would like finalCategories to contain the following:

['019301948109', '013904123908', '019341823908', '1309183912309', '10293182094']

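【Comments】:

  • The original JSON is written inconsistently: category1 is a dict in the root list, while category2 is a dict inside a nested list. Is this a typo, or is it meant to be that way?
  • Yes, this is intentional.

Tags: python recursion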


【Solution 1】:

You can use recursion with a generator:

categoriesToFind = ['019301948109', '1309183912309']
d = [{'category1': '0120391123123'}, [{'subcategory': '0120391123123'}, [{'subsubcategory': '019301948109'}, [{'subsubsubcategory': '013904123908'}, [{'subsubsubsubcategory': '019341823908'}]]]], [{'subcategory2': '0934810923801'}, [{'subsubcategory2': '09341829308123'}]], [{'category2': '1309183912309'}, [{'subcategory': '10293182094'}]]]
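def get_subcategories(_d, _flag):
    # _flag is truthy when an enclosing category has already matched
    flag = None  # set to True once a dict at this level is yielded
    for i in _d:
        if isinstance(i, dict):
            # each dict holds a single key, so take its only value
            _val = list(i.values())[0]
            if _val in categoriesToFind or _flag:
                yield _val
                flag = True
        else:
            # nested list: descend, passing the match state along
            yield from get_subcategories(i, _flag or flag)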

print(list(get_subcategories(d, False)))

Output:

['019301948109', '013904123908', '019341823908', '1309183912309', '10293182094']
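
The generator above reads `categoriesToFind` from the enclosing scope. As a hedged variation on the same idea, the sketch below passes the targets in as a parameter and stores them in a set for faster membership tests. The names `find_with_subcategories`, `targets` and `inherited` are made up for this illustration, and the code assumes every dict holds exactly one key, as in the question's data.

```python
def find_with_subcategories(nodes, targets, inherited=False):
    """Yield each ID found in `targets` plus every ID nested beneath a match.

    `inherited` is True when an enclosing category has already matched.
    """
    matched_here = False
    for node in nodes:
        if isinstance(node, dict):
            # Assumption: one key per dict, so take its single value.
            value = next(iter(node.values()))
            if inherited or value in targets:
                yield value
                matched_here = True
        else:
            # Nested list: its contents are subcategories of the dict before it.
            yield from find_with_subcategories(
                node, targets, inherited or matched_here
            )


data = [
    {"category1": "0120391123123"},
    [
        {"subcategory": "0120391123123"},
        [
            {"subsubcategory": "019301948109"},
            [
                {"subsubsubcategory": "013904123908"},
                [{"subsubsubsubcategory": "019341823908"}],
            ],
        ],
    ],
    [{"subcategory2": "0934810923801"}, [{"subsubcategory2": "09341829308123"}]],
    [{"category2": "1309183912309"}, [{"subcategory": "10293182094"}]],
]

result = list(find_with_subcategories(data, {"019301948109", "1309183912309"}))
print(result)
```

On the question's data this yields the same five IDs, in the same order, as the output shown above.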

【Comments】:
