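【Posted】: 2017-08-01 09:53:58
【Question】:
I am wondering whether it is possible to unstack one level of a multi-indexed DataFrame so that the remaining index of the returned DataFrame is not sorted. Code example: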
import pandas as pd

arrays = [["room1", "room1", "room1", "room1", "room1", "room1",
           "room2", "room2", "room2", "room2", "room2", "room2"],
          ["bed1", "bed1", "bed1", "bed2", "bed2", "bed2",
           "bed1", "bed1", "bed1", "bed2", "bed2", "bed2"],
          ["blankets", "pillows", "all", "blankets", "pillows", "all",
           "blankets", "pillows", "all", "blankets", "pillows", "all"]]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first index',
                                                 'second index', 'third index'])
series = pd.Series([1, 2, 3, 1, 1, 2, 2, 2, 4, 2, 1, 3], index=index)
series
first index  second index  third index
room1        bed1          blankets       1
                           pillows        2
                           all            3
             bed2          blankets       1
                           pillows        1
                           all            2
room2        bed1          blankets       2
                           pillows        2
                           all            4
             bed2          blankets       2
                           pillows        1
                           all            3
Unstacking the second index:
series.unstack(1)
second index             bed1  bed2
first index third index
room1       all             3     2
            blankets        1     1
            pillows         2     1
room2       all             4     3
            blankets        2     2
            pillows         2     1
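The problem is that the order of the third index has changed, because that index is automatically sorted alphabetically. The 'all' row, which is the sum of the 'blankets' and 'pillows' rows, is now the first row instead of the last. How can I fix this? There does not seem to be an option that stops unstack from sorting automatically. It also does not seem possible to sort a DataFrame's index by a key, with something like myDataFrame.sort_index(..., key=['some_key']).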
【Comments】:
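One possible workaround, offered only as a minimal sketch and not as the definitive answer: record the order in which the remaining index combinations first appear in the original Series, then reindex the unstacked result to that order. The names original_order and restored are made up for this illustration, and the sketch assumes the (first index, third index) pairs form a unique index after unstacking, as they do in the example above.

```python
import pandas as pd

# Rebuild the Series from the question (same values, written more compactly).
arrays = [
    ["room1"] * 6 + ["room2"] * 6,
    (["bed1"] * 3 + ["bed2"] * 3) * 2,
    ["blankets", "pillows", "all"] * 4,
]
index = pd.MultiIndex.from_arrays(
    arrays, names=["first index", "second index", "third index"]
)
series = pd.Series([1, 2, 3, 1, 1, 2, 2, 2, 4, 2, 1, 3], index=index)

# Unstack the second level; the remaining row index may come back sorted.
unstacked = series.unstack("second index")

# Order of first appearance of the remaining (first index, third index) pairs.
# original_order is a hypothetical helper name, not part of pandas.
original_order = series.index.droplevel("second index").drop_duplicates()

# Put the rows back into the order they had in the original Series.
restored = unstacked.reindex(original_order)
print(restored)
```

Because the reindex step works from the order of appearance, it does not depend on whether unstack sorted the rows in the first place, so 'all' ends up after 'blankets' and 'pillows' within each room.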