【Question title】: Creating a multiindexed `DataFrame` with a nested dictionary
【Posted】: 2017-03-08 19:20:58
【Question description】:

This question is related to this one. This time, I want to go a step further. Given a dictionary like this:

import numpy

dd = {0: {"russell": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
          "cantor": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
          "godel": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)}},

      1: {"russell": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
          "cantor": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
          "godel": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)}}}

or a similar list:

ll = [{"russell": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
       "cantor": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
       "godel": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)}},

      {"russell": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
       "cantor": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
       "godel": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)}}]

I want to construct a DataFrame like:

                          russell                            godel                        cantor
                    score    ping                    score    ping                 score    ping
0     0.17473916938994682      40       0.3443303845926545      47   0.43576522521017247      42
1      0.7341005512329682      22      0.14682222267827938      81    0.5662517436162526      59

As we can see, the column index is a MultiIndex. Is there a way to do this? If I try pandas.DataFrame.from_dict(dd, orient="index") or pandas.DataFrame(ll), I get:

                                      russell                                       godel                                      cantor
0  {'score': 0.17473916938994682, 'ping': 40}   {'score': 0.3443303845926545, 'ping': 47}  {'score': 0.43576522521017247, 'ping': 42}
1   {'score': 0.7341005512329682, 'ping': 22}  {'score': 0.14682222267827938, 'ping': 81}   {'score': 0.5662517436162526, 'ping': 59}

This is not what I want.

【Comments】:

    Tags: python pandas dictionary nested series


    【Solution 1】:

    This works too. Note that your nested dictionary is not really nested in a way that makes it easy to convert.

    import pandas as pd

    pd.concat({key: pd.DataFrame(value) for key, value in dd.items()}).unstack()
    Out[104]: 
      cantor           godel           russell          
        ping     score  ping     score    ping     score
    0   73.0  0.463084  94.0  0.954662    76.0  0.732291
    1   28.0  0.778905  81.0  0.984285    36.0  0.094173
    

    In short, creating a multiindexed df with concat is quite simple. All you need is a dictionary of DataFrames.
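
To make the steps explicit, here is a minimal self-contained sketch of the same `concat` plus `unstack` idea. The seeded generator `rng`, the helper `make_record`, and the names `frames` and `df_from_list` are assumptions added for illustration and are not part of the original post. It also covers the list `ll` from the question by using `enumerate` to supply the row keys.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (an assumption, not in the original post).
rng = np.random.default_rng(0)
names = ["russell", "cantor", "godel"]


def make_record():
    """Hypothetical helper: one row of nested data, shaped like the question's input."""
    return {name: {"score": rng.random(), "ping": int(rng.integers(10, 100))}
            for name in names}


dd = {0: make_record(), 1: make_record()}
ll = [make_record(), make_record()]

# Each inner dict becomes a small frame: columns are the names, rows are score/ping.
frames = {key: pd.DataFrame(value) for key, value in dd.items()}

# concat on a dict uses the dict keys as the outer row level;
# unstack then moves the inner row level (score/ping) under the columns.
df = pd.concat(frames).unstack()

# The list variant only needs explicit keys, supplied here by enumerate.
df_from_list = pd.concat({i: pd.DataFrame(rec) for i, rec in enumerate(ll)}).unstack()

print(df)
```

Because each small frame mixes a float score and an integer ping in one column, the ping values come out as floats after `unstack`, which matches the output shown above.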

    【Discussion】:

      【Solution 2】:

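      It is more complicated now, but Panel, transpose, to_frame and unstack can help. Note that Panel was deprecated in pandas 0.20 and removed in 0.25, so this only runs on older versions:

      df = pd.Panel(dd).transpose(2, 0, 1).to_frame().unstack()
      print(df)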
            cantor           godel           russell          
      minor   ping     score  ping     score    ping     score
      major                                                   
      0       69.0  0.050641  51.0  0.765994    20.0  0.935196
      1       91.0  0.398624  33.0  0.408681    75.0  0.464876
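
For pandas versions without `Panel`, the same layout can probably be built by flattening each row into a dict keyed by `(name, field)` tuples. This is only a sketch of an alternative technique, not the answer's original method. The seeded generator `rng` and the names `flat` and `df` are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (an assumption, not in the original post).
rng = np.random.default_rng(0)

dd = {
    row: {name: {"score": rng.random(), "ping": int(rng.integers(10, 100))}
          for name in ["russell", "cantor", "godel"]}
    for row in (0, 1)
}

# Flatten the two inner levels into tuple keys: {row: {(name, field): value}}.
flat = {
    row: {(name, field): value
          for name, fields in record.items()
          for field, value in fields.items()}
    for row, record in dd.items()
}

# Outer keys become the row index, tuple keys become the columns.
df = pd.DataFrame.from_dict(flat, orient="index")

# Rebuild the columns explicitly as a MultiIndex from the (name, field) tuples,
# so the result does not depend on how pandas stored them.
df.columns = pd.MultiIndex.from_tuples(list(df.columns))

print(df)
```

A single name can then be selected with `df["godel"]`, which returns the `score` and `ping` columns for that name.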
      

      【Discussion】:

      • You have helped me a lot with Pandas. Thank you very much.