Python - how to load nested dictionary into Pandas dataframe? Answers

【Question title】: Python - how to load nested dictionary into Pandas dataframe?
【Posted】: 2019-10-03 20:49:18
【Question description】:

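I have a long nested dictionary structured as shown below. How can I load it into a Pandas dataframe? The sub-keys Feed, Spindle Speed and Tool always stay the same, but the two levels above them (Heading, N1, etc. and 4001, 4002, etc.) are unique, or at least appear in a unique order throughout the dictionary.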

I know about something like this:

pd.DataFrame.from_dict({(i,j): dictionary[i][j] 
                           for i in dictionary.keys() 
                           for j in dictionary[i].keys()},
                       orient='index')

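But the result looks like a pivot table. I would prefer a dataframe in which the redundant information (like 4001) is repeated down the entire column.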

{
    "4001": {
        "Heading": {
            "Feed": [],
            "Spindle Speed": [],
            "Tool": []
        },
        "N1": {
            "Feed": [],
            "Spindle Speed": [],
            "Tool": [
                "0800"
            ]
        },
        "N10 ": {
            "Feed": [
                0.01,
                0.0006,
                0.0001,
                0.0006,
                0.0001,
                0.0006,
                0.0002,
                0.02,
                0.0004
            ],
            "Spindle Speed": [
                "M3S2630"
            ],
            "Tool": [
                "1616"
            ]
        }
    },
    "4002": {
        "Heading": {
            "Feed": [],
            "Spindle Speed": [],
            "Tool": []
        },
        "N1": {
            "Feed": [],
            "Spindle Speed": [],
            "Tool": [
                "9900"
            ]
        },
        "N10": {
            "Feed": [
                0.01,
                0.001,
                0.0004,
                0.001,
                0.005
            ],
            "Spindle Speed": [],
            "Tool": [
                "3838"
            ]
        }
    },     
    "4003": {...
             ...
             ...

Ideally, the dataframe would look like this:

Program Operation Number    Feed        Tool      Spindle Speed
4001    Heading             []          []        []
4001    N1                  []          ['0800']  []
4001    N10                 [0.01, ...] ['1616']  ['M3S2630']

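【Comments】:

What do you want the dataframe to look like (column headers and contents, row contents)?
Also... what about df.reset_index()?
I have updated my question with the desired "look" of the df. Thanks for your reply. I'm not sure what you mean by reset_index()?

Tags: python pandas dictionary nested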

【Solution 1】:

You are almost there. You only need to reset the MultiIndex and supply the right column names:

pd.DataFrame.from_dict({(i,j): dictionary[i][j] 
                           for i in dictionary.keys() 
                           for j in dictionary[i].keys()},
                       orient='index').reset_index().rename(
    {'level_0': 'Program', 'level_1': 'Operation Number'}, axis=1)
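
As a minimal, self-contained sketch of the call above: the `dictionary` below is a trimmed-down stand-in for the question's data (shortened values, and only two programs), so its exact contents are an assumption made for illustration. Because the MultiIndex built from the `(i, j)` tuples has no level names, `reset_index()` produces the default column names `level_0` and `level_1`, which the rename then replaces.

```python
import pandas as pd

# Hypothetical, shortened sample in the same shape as the question's dictionary.
dictionary = {
    "4001": {
        "Heading": {"Feed": [], "Spindle Speed": [], "Tool": []},
        "N1": {"Feed": [], "Spindle Speed": [], "Tool": ["0800"]},
        "N10": {"Feed": [0.01, 0.0006], "Spindle Speed": ["M3S2630"], "Tool": ["1616"]},
    },
    "4002": {
        "Heading": {"Feed": [], "Spindle Speed": [], "Tool": []},
        "N1": {"Feed": [], "Spindle Speed": [], "Tool": ["9900"]},
    },
}

# Flatten the two outer levels into (program, operation) tuple keys,
# which pandas turns into a two-level MultiIndex.
flat = {(i, j): dictionary[i][j] for i in dictionary for j in dictionary[i]}

df = (
    pd.DataFrame.from_dict(flat, orient="index")
    .reset_index()  # move both index levels into ordinary columns
    .rename({"level_0": "Program", "level_1": "Operation Number"}, axis=1)
)

print(df)
```

After the reset, the program number is stored in every row of the `Program` column rather than only being shown once per group.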

【Discussion】:

【Solution 2】:

Run this line after importing pandas to configure dataframes so that they are not displayed "sparsely":

pd.set_option('display.multi_sparse', False)

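From the pandas docs:

Option: display.multi_sparse
Default: True
Function: "Sparsify" MultiIndex display (don't display repeated elements in outer levels within groups)

Output using the first two groups you provided: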
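
The output itself is not reproduced here. As a rough sketch of how it could be regenerated, the snippet below rebuilds the MultiIndex dataframe from a shortened, hypothetical version of the question's first two groups (the sample values are an assumption) and renders it with the option switched off. `pd.option_context` is used only to capture the default sparse rendering for comparison.

```python
import pandas as pd

# Hypothetical, shortened sample in the same shape as the question's dictionary.
dictionary = {
    "4001": {
        "Heading": {"Feed": [], "Spindle Speed": [], "Tool": []},
        "N1": {"Feed": [], "Spindle Speed": [], "Tool": ["0800"]},
        "N10": {"Feed": [0.01, 0.0006], "Spindle Speed": ["M3S2630"], "Tool": ["1616"]},
    },
    "4002": {
        "Heading": {"Feed": [], "Spindle Speed": [], "Tool": []},
        "N1": {"Feed": [], "Spindle Speed": [], "Tool": ["9900"]},
    },
}

# Keep the MultiIndex this time; only the display option changes.
df = pd.DataFrame.from_dict(
    {(i, j): dictionary[i][j] for i in dictionary for j in dictionary[i]},
    orient="index",
)

# Default rendering for comparison: repeated outer labels are blanked out.
with pd.option_context("display.multi_sparse", True):
    sparse_text = df.to_string()

# The setting from this answer: repeat the outer label on every row.
pd.set_option("display.multi_sparse", False)
dense_text = df.to_string()

print(dense_text)
```

Note that this option only affects how the MultiIndex is displayed. The program number remains an index level rather than a regular column.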

【Discussion】: