【Question Title】: Python: Transform a list of lists into a hierarchical dictionary
【Posted】: 2020-10-07 00:21:58
【Question Description】:

I have some gene sequencing data that looks like this:

data = [{'sequence': 'gene1__gene2__gene3', 'occurrence': 10},
        {'sequence': 'gene2__gene3', 'occurrence': 5},
        {'sequence': 'gene2', 'occurrence': 2},
        {'sequence': 'gene4', 'occurrence': 4}
       ]

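I want to transform it into the following (tree-like) dictionary data structure, in which any sub-path gives me the co-occurrence count for that group of genes: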

tree_dict = {
        'gene1': {'occurrence': 10, 'self': 0, 'children': {'gene2': {'occurrence': 10, 'self': 0, 'children': {'gene3': {'occurrence': 10, 'self': 10, 'children': {}}}},
                                                            'gene3': {'occurrence': 10, 'self': 0, 'children': {'gene2': {'occurrence': 10, 'self': 10, 'children': {}}}},
                                                           }
                 },
        'gene2': {'occurrence': 17, 'self': 2, 'children': {'gene1': {'occurrence': 10, 'self': 0, 'children': {'gene3': {'occurrence': 10, 'self': 10, 'children': {}}}},
                                                            'gene3': {'occurrence': 15, 'self': 5, 'children': {'gene1': {'occurrence': 10, 'self': 10, 'children': {}}}},
                                                           }
                 },
        'gene3': {'occurrence': 15, 'self': 0, 'children': {'gene1': {'occurrence': 10, 'self': 0, 'children': {'gene2': {'occurrence': 10, 'self': 10, 'children': {}}}},
                                                            'gene2': {'occurrence': 15, 'self': 5, 'children': {'gene1': {'occurrence': 10, 'self': 10, 'children': {}}}},
                                                           }
                 },
        'gene4': {'occurrence': 4, 'self': 4, 'children': {}}
       }

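In the tree_dict above:

  • self is the number of times the genes on the (sub)path occur with no other genes. For example, gene3 never occurs on its own, so its self is 0, whereas gene2 occurs on its own 2 times, so its self is 2.
  • occurrence is the number of times the genes on the (sub)path occur together, either as part of a longer sequence or as the whole sequence.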
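To make the two definitions concrete, here is a minimal sketch that computes both counts straight from data, without building the tree. The helper name count_for is hypothetical, and the sketch assumes that gene order within a sequence does not matter, which is what the expected tree_dict above implies.

```python
data = [
    {'sequence': 'gene1__gene2__gene3', 'occurrence': 10},
    {'sequence': 'gene2__gene3', 'occurrence': 5},
    {'sequence': 'gene2', 'occurrence': 2},
    {'sequence': 'gene4', 'occurrence': 4},
]

def count_for(genes, records):
    """Return (occurrence, self) for a group of genes (hypothetical helper)."""
    wanted = set(genes)
    occurrence = 0
    self_count = 0
    for record in records:
        present = set(record['sequence'].split('__'))
        # occurrence: the group appears in the record, possibly with other genes.
        if wanted <= present:
            occurrence += record['occurrence']
        # self: the record consists of exactly this group and nothing else.
        if wanted == present:
            self_count += record['occurrence']
    return occurrence, self_count

print(count_for(['gene2'], data))           # (17, 2)
print(count_for(['gene3'], data))           # (15, 0)
print(count_for(['gene2', 'gene3'], data))  # (15, 5)
```

These values agree with the corresponding nodes in tree_dict, for example the gene2 node at the top level and the gene3 node under gene2.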


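What have I tried?
I know the solution has to be a recursive function, but so far I have only been attempting an iterative approach, without success. It is something similar to this question: How to transform a list into a hierarchy dict. However, I have not been able to make any progress in that direction.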

【Question Comments】:

    Tags: python python-3.x recursion data-structures tree


    【Solution 1】:

    Try this:

    data = [{'sequence': 'gene1__gene2__gene3', 'occurrence': 10},
            {'sequence': 'gene2__gene3', 'occurrence': 5},
            {'sequence': 'gene2', 'occurrence': 2},
            {'sequence': 'gene4', 'occurrence': 4}]
    
    tree_dict = {}
    
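    def generate_tree(sequence, occurrence, curr_dict):
        # Add `occurrence` along every ordering of the genes in `sequence`.
        gene_list = sequence.split('__')
        for gene in gene_list:
            # Count this gene at the current level, creating its node if needed.
            if gene in curr_dict:
                curr_dict[gene]['occurrence'] += occurrence
            else:
                curr_dict[gene] = {'occurrence': occurrence, 'self': 0, 'children': {}}
            # Recurse on the remaining genes as children of this gene.
            updated_list = gene_list.copy()
            updated_list.remove(gene)
            updated_sequence = '__'.join(updated_list)
            if updated_sequence != '':
                generate_tree(updated_sequence, occurrence, curr_dict[gene]['children'])
            else:
                # No genes left: this path covers the whole sequence exactly.
                curr_dict[gene]['self'] += occurrence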
    
    for item in data:
        generate_tree(item['sequence'], item['occurrence'], tree_dict)
    
    print(tree_dict)
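As a rough sketch of how the resulting tree might be queried, the block below rebuilds the same structure and then reads the counts for a given sub-path. Both build_tree and lookup are hypothetical names that do not appear in the answer above. build_tree follows the same recursive idea but passes a list of genes instead of re-joining and re-splitting the string.

```python
def build_tree(genes, occurrence, node):
    """Recursive builder in the spirit of generate_tree (hypothetical variant)."""
    for i, gene in enumerate(genes):
        child = node.setdefault(gene, {'occurrence': 0, 'self': 0, 'children': {}})
        child['occurrence'] += occurrence
        # Every gene except the one at position i becomes a candidate child.
        rest = genes[:i] + genes[i + 1:]
        if rest:
            build_tree(rest, occurrence, child['children'])
        else:
            # The path used every gene in the sequence.
            child['self'] += occurrence

def lookup(tree, path):
    """Return (occurrence, self) for a list of gene names, or None if absent."""
    children = tree
    node = None
    for gene in path:
        node = children.get(gene)
        if node is None:
            return None
        children = node['children']
    if node is None:  # empty path
        return None
    return node['occurrence'], node['self']

data = [
    {'sequence': 'gene1__gene2__gene3', 'occurrence': 10},
    {'sequence': 'gene2__gene3', 'occurrence': 5},
    {'sequence': 'gene2', 'occurrence': 2},
    {'sequence': 'gene4', 'occurrence': 4},
]

tree = {}
for item in data:
    build_tree(item['sequence'].split('__'), item['occurrence'], tree)

print(lookup(tree, ['gene2', 'gene3']))  # (15, 5)
print(lookup(tree, ['gene3', 'gene2']))  # (15, 5)
print(lookup(tree, ['gene4', 'gene1']))  # None
```

Because every ordering of a sequence is stored, a sequence of n genes produces n! leaf paths, so this layout is probably only practical when sequences are short.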
    

    【Comments】:
