【Question Title】: Ranking a stacked bar plot
【Posted】: 2018-09-30 04:41:39
【Question】:

All,

I have a dataframe that I have grouped and sorted; it looks like this:

brkrcy = data[data['upload_date'] == Date].groupby(['CPTY', 'currency'], as_index=False).agg({"Gross Loan Amount": "sum"})
brkrcy = brkrcy.sort_values(by=['CPTY', 'Gross Loan Amount'], ascending=[True, False])
brkrcy = brkrcy.set_index('CPTY')

Double sorted:

       currency  Gross Loan Amount
CPTY                              
BARC        RUB       2.178780e+07
BARC        ZAR       7.779714e+07
BARC        JPY       1.227676e+09
BARC        EUR       3.301354e+09
BARC        GBP       5.002534e+09
BARC        USD       6.667446e+09
BMON        CAD       2.018614e+08
BMON        GBP       4.096820e+08
BMON        USD       6.510318e+08
BNP         CAD       2.349053e+08
BNP         JPY       1.523716e+09
BNP         GBP       3.234833e+09
BNP         USD       4.576760e+09
BNP         EUR       4.935927e+09
CALIP       EUR       1.832390e+07
CALIP       USD       1.448161e+09
CALIP       GBP       3.492144e+09
CANTR       USD       3.987880e+08
CIBC        CAD       6.851792e+08
CIBC        GBP       8.861776e+08
CITI        CZK       7.549203e+06

brkrcy.set_index('currency',append=True)['Gross Loan Amount'].unstack().plot(kind="bar",stacked=True,figsize=(10,8))
plt.ylabel('Gross Loan Amount in Billions')
plt.show()

As you can see, although the data is double sorted, the stacked bar plot is not in descending order. How can I change that?

【Comments】:

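  • Your output does not reflect the sort in the code above: Gross Loan Amount appears to be sorted in ascending order, not descending as the code attempts.
  • I don't know why, and I don't really mind. What I'm looking for is a stacked bar plot in descending order. Any idea how to do that? Thanks.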
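
For reference, here is a minimal sketch of what the posted sort_values call is expected to produce: CPTY in ascending order, with Gross Loan Amount descending within each CPTY. The toy frame and its numbers are made up for illustration and are not the asker's data.

```python
import pandas as pd

# Hypothetical toy data (not the asker's figures).
toy = pd.DataFrame({
    "CPTY": ["BARC", "BMON", "BARC", "BMON", "BARC"],
    "currency": ["RUB", "CAD", "USD", "USD", "GBP"],
    "Gross Loan Amount": [2.0, 1.0, 6.0, 3.0, 5.0],
})

# One ascending flag per sort key: CPTY A-Z, amounts largest first within each CPTY.
ranked = toy.sort_values(by=["CPTY", "Gross Loan Amount"], ascending=[True, False])
print(ranked)
```

If a printed frame shows the amounts ascending instead, it was probably produced with different ascending flags than the ones in the posted code.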

Tags: python python-3.x pandas dataframe cptbarplot


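【Solution 1】:

Assuming you mean descending bars on the stacked plot, consider adding a helper Total column that sums all currency fields of the plotting dataframe for each CPTY. Sort the data by this new column in descending order, then drop the helper column before plotting: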

plot_df = brkrcy.set_index('currency',append=True)['Gross Loan Amount'].unstack()
plot_df['Total'] = plot_df.sum(axis=1)                   # HELPER COLUMN

plot_df.sort_values('Total', ascending=False)\
       .drop(columns=['Total'])\
       .plot(kind="bar", stacked=True, figsize=(10,8))

plt.ylabel('Gross Loan Amount in Billions')
plt.show()
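
As a variation on the same idea, the row order can be computed from the totals and applied directly, without adding and dropping a helper column. This is only a sketch: the toy frame below is hypothetical and merely mimics the wide shape of plot_df (one row per CPTY, one column per currency).

```python
import pandas as pd

# Hypothetical wide-format data shaped like plot_df; None marks a missing currency.
toy = pd.DataFrame(
    {"EUR": [3.0, None, 5.0], "GBP": [5.0, 4.0, 3.0], "USD": [7.0, 6.0, 4.0]},
    index=pd.Index(["BARC", "BMON", "BNP"], name="CPTY"),
)

# Total per CPTY; missing values are skipped by default.
totals = toy.sum(axis=1)

# Index labels ordered from the largest total to the smallest.
order = totals.sort_values(ascending=False).index

# Reordered frame; call .plot(kind="bar", stacked=True) on this to draw it.
sorted_toy = toy.loc[order]
print(sorted_toy)
```

Both approaches sort the bars by their total height; neither changes the order of the currency segments within a bar, which follows the column order.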

To demonstrate, here is random data that hopefully resembles your actual data (seeded for reproducibility):

Data

import numpy as np
import pandas as pd

np.random.seed(42018)

CPTY = ["BARC", "BMON", "CALIP", "BNP", "CIBC", "CANTR", "CITI"]
currency = ["RUB", "ZAR", "JPY", "EUR", "GBP", "USD", "CAD"]

data = pd.DataFrame({'CPTY': ["".join(np.random.choice(CPTY,1)) for _ in range(50)],
                     'currency': ["".join(np.random.choice(currency,1)) for _ in range(50)],
                     'Gross Loan Amount': abs(np.random.randn(50))*10000000
                    }, columns = ['CPTY','currency','Gross Loan Amount'])

brkrcy = data.groupby(['CPTY','currency'], as_index=False).agg({"Gross Loan Amount": "sum"})\
             .sort_values(by=['CPTY', 'Gross Loan Amount'], ascending=[True, False])\
             .set_index('CPTY')
print(brkrcy.head(10))
#       currency  Gross Loan Amount
# CPTY                             
# BARC       JPY       3.854475e+07
# BARC       RUB       9.201352e+06
# BARC       USD       7.744341e+06
# BMON       EUR       2.780286e+07
# BMON       JPY       2.365747e+07
# BMON       CAD       8.523440e+06
# BNP        RUB       1.268484e+07
# BNP        GBP       8.149266e+06
# BNP        EUR       7.575220e+06
# CALIP      USD       3.387214e+07

Plot

import matplotlib.pyplot as plt

plot_df = brkrcy.set_index('currency',append=True)['Gross Loan Amount'].unstack()
plot_df['Total'] = plot_df.sum(axis=1)

plot_df.sort_values('Total', ascending=False)\
       .drop(columns=['Total'])\
       .plot(kind="bar", stacked=True, figsize=(10,8))

plt.ylabel('Gross Loan Amount in Billions')
plt.show()

【Comments】:

  • Thank you very much. This is exactly what I wanted. Thanks!
  • No problem. Glad to help!