【Question title】: Pandas unstack should not sort remaining indexes
【Posted】: 2017-08-01 09:53:58
【Question】:

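I am wondering whether it is possible to unstack one level of a multi-indexed DataFrame so that the remaining index levels of the returned DataFrame are not sorted. Code example: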

import pandas as pd

arrays = [["room1", "room1", "room1", "room1", "room1", "room1",
           "room2", "room2", "room2", "room2", "room2", "room2"],
          ["bed1", "bed1", "bed1", "bed2", "bed2", "bed2",
           "bed1", "bed1", "bed1", "bed2", "bed2", "bed2"],
          ["blankets", "pillows", "all", "blankets", "pillows", "all",
           "blankets", "pillows", "all", "blankets", "pillows", "all"]]

tuples = list(zip(*arrays))

index = pd.MultiIndex.from_tuples(tuples, names=['first index', 
                                                 'second index', 'third index'])

series = pd.Series([1, 2, 3, 1, 1, 2, 2, 2, 4, 2, 1, 3 ], index=index)

series

first index  second index  third index
room1        bed1          blankets       1
                           pillows        2
                           all            3
             bed2          blankets       1
                           pillows        1
                           all            2
room2        bed1          blankets       2
                           pillows        2
                           all            4
             bed2          blankets       2
                           pillows        1
                           all            3

Unstacking the second index level:

series.unstack(1)

second index             bed1  bed2
first index third index            
room1       all             3     2
            blankets        1     1
            pillows         2     1
room2       all             4     3
            blankets        2     2
            pillows         2     1

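The problem is that the order of the third index level has changed, because that level is automatically sorted alphabetically. The 'all' row, which is the sum of the 'blankets' and 'pillows' rows, is now the first row instead of the last. How can this be fixed? There does not seem to be an option that stops unstack from sorting automatically. It also does not seem possible to sort a DataFrame's index by a key, with something like myDataFrame.sort_index(..., key=['some_key']).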

【Comments】:

    Tags: python pandas sorting


    【Solution 1】:

    One possible solution is reindex or reindex_axis with the parameter level=1:

    s = series.unstack(1).reindex(['blankets','pillows','all'], level=1)
    print (s)
    second index             bed1  bed2
    first index third index            
    room1       blankets        1     1
                pillows         2     1
                all             3     2
    room2       blankets        2     2
                pillows         2     1
                all             4     3
    

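    # Note: reindex_axis was deprecated in pandas 0.21 and removed in 1.0; on current versions use reindex as above
    s = series.unstack(1).reindex_axis(['blankets','pillows','all'], level=1)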
    print (s)
    second index             bed1  bed2
    first index third index            
    room1       blankets        1     1
                pillows         2     1
                all             3     2
    room2       blankets        2     2
                pillows         2     1
                all             4     3
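
For readers on a current pandas release, the reindex variant can be checked end to end with the sketch below. It rebuilds the series from the question with MultiIndex.from_arrays (a shorthand for the zip/from_tuples construction, not the asker's exact code) and asserts that 'all' ends up last within each room. The names `unstacked` and `wanted` are only illustrative.

```python
import pandas as pd

# Same data as in the question, written more compactly
arrays = [
    ["room1"] * 6 + ["room2"] * 6,
    (["bed1"] * 3 + ["bed2"] * 3) * 2,
    ["blankets", "pillows", "all"] * 4,
]
index = pd.MultiIndex.from_arrays(
    arrays, names=["first index", "second index", "third index"]
)
series = pd.Series([1, 2, 3, 1, 1, 2, 2, 2, 4, 2, 1, 3], index=index)

# By default unstack sorts the remaining levels, so 'all' comes first here
unstacked = series.unstack("second index")

# Reorder only the 'third index' level; rows stay grouped by 'first index'
wanted = ["blankets", "pillows", "all"]
s = unstacked.reindex(wanted, level="third index")

assert s.index.get_level_values("third index").tolist() == wanted * 2
print(s)
```

Passing the level by name instead of by position is a stylistic choice here; level=1 as in the answer refers to the same level.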
    

    A more dynamic solution:

    a = series.index.get_level_values('third index').unique()
    print (a)
    Index(['blankets', 'pillows', 'all'], dtype='object', name='third index')
    
    # reindex is used here because reindex_axis is no longer available in current pandas
    s = series.unstack(1).reindex(a, level=1)
    print (s)
    second index             bed1  bed2
    first index third index            
    room1       blankets        1     1
                pillows         2     1
                all             3     2
    room2       blankets        2     2
                pillows         2     1
                all             4     3
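
If every remaining level should keep the order in which it appears in the original series, not only the third one, a sketch along the following lines may help. It is a variation on the dynamic approach rather than part of the original answer: it collects the index entries that remain after dropping 'second index', in order of first appearance, and reindexes the unstacked frame against them. The name `remaining` is made up for illustration.

```python
import pandas as pd

# Same data as in the question, written more compactly
arrays = [
    ["room1"] * 6 + ["room2"] * 6,
    (["bed1"] * 3 + ["bed2"] * 3) * 2,
    ["blankets", "pillows", "all"] * 4,
]
index = pd.MultiIndex.from_arrays(
    arrays, names=["first index", "second index", "third index"]
)
series = pd.Series([1, 2, 3, 1, 1, 2, 2, 2, 4, 2, 1, 3], index=index)

# (first index, third index) pairs in order of first appearance;
# unique() keeps the original order instead of sorting
remaining = series.index.droplevel("second index").unique()

# Reindex the sorted unstack result back to that original row order
s = series.unstack("second index").reindex(remaining)

assert s.index.tolist() == remaining.tolist()
print(s)
```

Because `remaining` only contains pairs that actually occur in the series, this should not introduce extra all-NaN rows.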
    

    【Comments】:
