Iterating through Nested Dictionaries to Create DataFrames: Answers

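【Question title】: Iterating through Nested Dictionaries to create dataframes
【Posted】: 2021-12-31 05:44:00
【Question】:

I am trying to output 4 data tables and get the Expected Output below. I want to iterate over the Outcomes dictionary and format it so that all the list values go into option 1 and option 2, and all the row values go into row 1, row 2, and so on. How can I modify the for loop in my code to get the expected output?

Code: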

import pandas as pd

def Pandas(infos, title):
  display(pd.DataFrame(infos).style.set_caption(title).set_table_styles([{
    'selector': 'caption',
    'props': [
        ('color', 'red'),
        ('font-size', '16px'),
        ('text-align', 'center')
        ]
    }]))  


for id, info in Outcomes.items():
    for k in info:
        infos = {{f'{x}:': info[k][x]} if isinstance(info[k][x],list) else info[k][x] for x in info[k]}
        Pandas(infos, k)

Dictionary:

Outcomes = {
    'Values':{
        'First': {
            'option 1': [12,345,5412],
            'option 2': [2315,32,1],
            'Additional Option': {'row 1': [232,3,1,3],
                         'row 2': [3,4,5,11],
                         'row 3': [15,6,12,34]}
        },
        'Second': {
            'option 1': [1,4,5,6,2],
            'option 2': [5,6,3,2,1],
            'Additional Option': {'row 1': [-5,3,1,2],
                         'row 2': [4,4,12,11],
                         'row 3': [67,6,5,34]}
        }
    },
    'Values 2':{
        'First': {
            'option 1': [12,345345,512412],
            'option 2': [2315,4,3],
            'Mega':{'row 1': [-45,12,33,1.3],
                    'row 2': [3.5,4.8,5,11]}
        }
    }
}

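【Comments】:

Do you only need First for each top-level key? Should Outcomes['Values']['Second'] also be exported as a DataFrame?
Yes, First comes first and Second follows right after it; I made a mistake in the expected output. The titles for the option 1 and option 2 tables could be Values First, Values Second, Values 2 First.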

Tags: python pandas dataframe dictionary indexing

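【Solution 1】:

You can use recursion to walk through your dictionary and build a flat dictionary of DataFrames, keyed by the path of names that leads to each one: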

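import pandas as pd

def get_nested_df(dic, concat_key="", df_dic=None):
    # Default to None: a dict() default would be shared between separate calls
    if df_dic is None:
        df_dic = {}
    # Non-dict values at this level become the columns of one DataFrame
    rows = {k: v for k, v in dic.items() if not isinstance(v, dict)}
    if rows:
        df_dic[concat_key] = pd.DataFrame.from_dict(rows)
    # Recurse into nested dicts, extending the key with the current name
    for k, v in dic.items():
        if isinstance(v, dict):
            get_nested_df(v, f"{concat_key} {k}", df_dic)
    return df_dic

df_dic = get_nested_df(Outcomes)

for k, v in df_dic.items():
    print(f"{k}\n{v}\n")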

Output:

 Values First
   option 1  option 2
0        12      2315
1       345        32
2      5412         1

 Values First Additional Option
   row 1  row 2  row 3
0    232      3     15
1      3      4      6
2      1      5     12
3      3     11     34

 Values Second
   option 1  option 2
0         1         5
1         4         6
2         5         3
3         6         2
4         2         1

 Values Second Additional Option
   row 1  row 2  row 3
0     -5      4     67
1      3      4      6
2      1     12      5
3      2     11     34

 Values 2 First
   option 1  option 2
0        12      2315
1    345345         4
2    512412         3

 Values 2 First Mega
   row 1  row 2
0  -45.0    3.5
1   12.0    4.8
2   33.0    5.0
3    1.3   11.0
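
Every title in the output above starts with a space, because each key is built by joining an initially empty prefix and the next name with a space. If you want to look frames up by their plain title, one option is to strip the keys afterwards. This is only a sketch: the small df_dic below is a hand-built stand-in for the real result, and clean is a name introduced here for illustration.

```python
import pandas as pd

# Hand-built stand-in for the df_dic returned by get_nested_df;
# the keys keep the leading space seen in the printed output.
df_dic = {
    " Values First": pd.DataFrame(
        {"option 1": [12, 345, 5412], "option 2": [2315, 32, 1]}
    ),
    " Values 2 First": pd.DataFrame(
        {"option 1": [12, 345345, 512412], "option 2": [2315, 4, 3]}
    ),
}

# Strip surrounding whitespace from every key so lookups use the plain title
clean = {key.strip(): df for key, df in df_dic.items()}

print(clean["Values First"])
```

The stripped keys stay unique here because the space only ever appears at the front of the full path.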

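【Discussion】:

How can I add a piece of code that drops either Values or Values 2?
The point is that using the full name as the key in the new dictionary of DataFrames lets you identify each DataFrame uniquely. Otherwise the key Additional Option would occur twice, which is not possible in a dictionary. You can of course split the key afterwards, just for printing. In that case I would join the keys with a different separator: get_nested_df(v, f"{concat_key}/{k}", df_dic), then split each key on / and take the parts you need for printing.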
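
The separator idea from the comment above could look roughly like this. It is a sketch under assumptions rather than the answerer's exact code: the sep parameter, the short_title helper, and the reduced sample dictionary are all introduced here for illustration, and the first level is joined without a leading separator so that splitting the key gives clean parts.

```python
import pandas as pd

def get_nested_df(dic, concat_key="", df_dic=None, sep="/"):
    """Flatten a nested dict into {path: DataFrame}, joining levels with sep."""
    if df_dic is None:
        df_dic = {}
    # Non-dict values at this level become the columns of one DataFrame
    rows = {k: v for k, v in dic.items() if not isinstance(v, dict)}
    if rows:
        df_dic[concat_key] = pd.DataFrame.from_dict(rows)
    for k, v in dic.items():
        if isinstance(v, dict):
            # Avoid a leading separator on the first level
            path = f"{concat_key}{sep}{k}" if concat_key else k
            get_nested_df(v, path, df_dic, sep)
    return df_dic

def short_title(key, sep="/", drop=1):
    """Drop the first `drop` levels of a key, for display only (hypothetical helper)."""
    return " ".join(key.split(sep)[drop:])

# Reduced sample in the same shape as Outcomes
sample = {
    "Values": {
        "First": {
            "option 1": [12, 345, 5412],
            "option 2": [2315, 32, 1],
            "Additional Option": {"row 1": [232, 3], "row 2": [3, 4]},
        }
    },
    "Values 2": {"First": {"option 1": [1, 2], "option 2": [3, 4]}},
}

df_dic = get_nested_df(sample)

# Full keys stay unique in the dictionary; only the printed title is shortened
for key, df in df_dic.items():
    print(f"{short_title(key)}\n{df}\n")
```

Note that the shortened titles are no longer unique (both Values and Values 2 contain a First table), which is why the full path is kept as the dictionary key and only the printed title is trimmed.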