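【Posted】: 2021-05-12 20:26:28
【Question】:
I am struggling to sort a pivot table by one level of its MultiIndex. My goal is to sort the values in that level according to a predefined list of values, and that part basically works. However, I also want to keep the original order of the other levels.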
import pandas as pd
import numpy as np
import random

group_size = 3
n = 10
df = pd.DataFrame({
    'i_a': list(np.arange(0, group_size))*n,
    'i_b': random.choices(list("ARBMC"), k=n*group_size),
    'value': np.random.randint(0, 100, size=n*group_size),
})
pt = pd.pivot_table(
    df,
    index=['i_a', 'i_b'],
    values=['value'],
    aggfunc='sum'
)
# The pivot table looks like this
         value
i_a i_b
0   A       48
    B       55
    C      161
    M       41
    R      126
1   A       60
    B      236
    C       99
    M       30
    R      202
2   A       22
    B      144
    C       30
    M      146
    R      168
# defined order for i_b
ORDER = {
    "A": 0,
    "R": 1,
    "B": 2,
    "M": 3,
    "C": 4,
}

def order_by_list(value, ascending=True):
    try:
        idx = ORDER[value]
    except KeyError:
        # place items which are not available at the last place
        idx = len(ORDER)
    if not ascending:
        # reverse the order
        idx = -idx
    return idx
def sort_by_ib(df):
    # operate on the frame passed in, not on the global pt
    return df.sort_index(
        level=["i_b"],
        key=lambda index: index.map(order_by_list),
        sort_remaining=False,
    )
pt_sorted = pt.pipe(sort_by_ib)
# The i_a level of pt_sorted is rearranged, which I don't want
         value
i_a i_b
0   A       48
1   A       60
2   A       22
0   R      126
1   R      202
2   R      168
0   B       55
1   B      236
2   B      144
0   M       41
1   M       30
2   M      146
0   C      161
1   C       99
2   C       30
# Instead, the sorted pivot table should look like this
         value
i_a i_b
0   A       48
    R      126
    B       55
    M       41
    C      161
1   A       60
    R      202
    B      236
    M       30
    C       99
2   A       22
    R      168
    B      144
    M      146
    C       30
What is the preferred/recommended way to do this?
【Comments】:
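One possible approach, as a minimal sketch under two assumptions: pandas 1.1 or later (where `sort_index` accepts `key`), and an `i_a` level that is already in ascending order, as in the pivot table shown in the question. Sorting on both levels while remapping only `i_b` keeps the `i_a` groups together. The helper names `ib_rank` and `level_key` are hypothetical, and the fixed table below reuses the values from the question instead of random data.

```python
import pandas as pd

# Desired order for the i_b level (taken from the question).
ORDER = {"A": 0, "R": 1, "B": 2, "M": 3, "C": 4}


def ib_rank(label):
    # Hypothetical helper: labels missing from ORDER sort after all known ones.
    return ORDER.get(label, len(ORDER))


def level_key(index):
    # Hypothetical helper: for a MultiIndex, sort_index applies the key once
    # per sorted level, so only the level named "i_b" is remapped.
    if index.name == "i_b":
        return index.map(ib_rank)
    return index


# Fixed stand-in for the pivot table from the question.
index = pd.MultiIndex.from_product(
    [[0, 1, 2], list("ABCMR")], names=["i_a", "i_b"]
)
pt = pd.DataFrame(
    {"value": [48, 55, 161, 41, 126, 60, 236, 99, 30, 202, 22, 144, 30, 146, 168]},
    index=index,
)

# Sort by i_a first (unchanged), then by the custom rank of i_b.
pt_sorted = pt.sort_index(level=["i_a", "i_b"], key=level_key)
print(pt_sorted)
```

Checking `index.name` inside the key lets the `i_a` level pass through unchanged. Note that this sorts `i_a` in ascending order, so it only "preserves" that level when it was ascending to begin with.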
Tags: python pandas sorting multi-index