Custom sorting with ggplot2 in Python: answers

[Question title]: Custom sorting with ggplot2 in Python
[Posted]: 2022-12-04 05:37:24
[Question description]:

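I am working on a ggplot visualization that plots total expenditure per country from highest to lowest. Because there are many small values, I rolled several small categories up into an "Other" category. I cannot find a way to move the "Other" category to the end while keeping the remaining categories in descending order.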

ggplot(df_sorted, aes(x = 'reorder(customer_country, Total_Expenditure, fun=sum)', y = 'Total_Expenditure', fill='Total_Expenditure'))\
    + geom_bar(stat="identity")\
    + scale_x_discrete()\
    + coord_flip()\
    + scale_fill_cmap(cmap_name="RdYlGn")


In the resulting chart, the "Other" category appears at the bottom of the bars.

[Comments]:

Tags: python ggplot2 plotnine

[Solution 1]:

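In general, you can apply the custom ordering to the dataframe outside of ggplot (a bit of pandas is enough), so there is no need to reorder inside the plot aesthetics.

The code below demonstrates this with the diamonds dataset that ships with plotnine: one factor level ("Premium") is moved to the bottom, while all other factor levels stay sorted.

Side note: in your next question, please include your actual dataframe (or at least a subset of it) so the example is fully reproducible, or use a dataset provided by one of the libraries to demonstrate the problem.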

Custom dataframe sorting

There may be a more elegant way to do this, but the important part is the following:

from plotnine.data import diamonds
import pandas as pd

# this does the job of reorder(.., fun=sum): aggregate per level, then sort the levels by their sum
df = diamonds.groupby('cut', as_index=False).aggregate({'carat': 'sum'})
sorted_levels = df.sort_values('carat')['cut']

# custom reordering of the factor level of interest:
# 'Premium' is moved to one end while the rest stays in sorted order
sorted_custom = ['Premium'] + [level for level in sorted_levels if level != 'Premium']

# make 'cut' a categorical with these levels, then sort the dataframe by it
df['cut'] = pd.Categorical(df['cut'], categories=sorted_custom)
df = df.sort_values('cut')

Plot (no further sorting needed)


from plotnine import ggplot, aes, geom_bar, scale_x_discrete, coord_flip, scale_fill_cmap
(
    ggplot(df, aes(x = 'cut', y = 'carat', fill='carat'))
    + geom_bar(stat='identity')
    + scale_x_discrete()
    + coord_flip()
    + scale_fill_cmap(cmap_name="RdYlGn")
)
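
Applied to the data described in the question, the same idea could look roughly like the sketch below. The sample rows and the helper name order_with_pinned are made up for illustration; only the column names customer_country and Total_Expenditure come from the question. It pins "Other" as the first level, which coord_flip() draws at the bottom, and sorts the remaining levels ascending so the largest bar ends up at the top.

```python
import pandas as pd

# Hypothetical rows standing in for the asker's dataframe (not real data)
df_raw = pd.DataFrame({
    "customer_country": ["Germany", "France", "Other", "Spain", "Germany", "Other"],
    "Total_Expenditure": [50, 30, 40, 10, 25, 20],
})

def order_with_pinned(frame, cat_col, value_col, pinned="Other"):
    """Sum value_col per category and order the categories by that sum,
    with `pinned` forced to be the first level (hypothetical helper)."""
    totals = frame.groupby(cat_col, as_index=False)[value_col].sum()
    ascending = totals.sort_values(value_col)[cat_col].tolist()
    levels = [lvl for lvl in ascending if lvl != pinned]
    if pinned in ascending:
        levels = [pinned] + levels
    # The categorical carries the level order into the plot
    totals[cat_col] = pd.Categorical(totals[cat_col], categories=levels, ordered=True)
    return totals.sort_values(cat_col).reset_index(drop=True)

totals = order_with_pinned(df_raw, "customer_country", "Total_Expenditure")
print(totals)
```

The resulting totals dataframe can then be passed to ggplot with x = 'customer_country', as in the plot code above, without any reorder(...) call in the aesthetics.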

[Discussion]: