【Question title】: Sort pandas dataframe rows according to a string value column
【Posted】: 2020-05-19 02:15:41
【Question】:

I have the following dataframe:

        month       price
0       April  102.478015
1      August   94.868053
2    December   97.278205
3    February  100.114510
4     January   99.419109
5        July   93.402928
6        June   96.114224
7       March  101.297762
8         May  102.905340
9    November   97.952169
10    October   95.606478
11  September   94.226803

I want the rows in calendar order (January in the first row, through December in the twelfth). How can I do this?

If needed, you can copy this dataframe and then run

pd.read_clipboard(sep=r'\s\s+')

to load the dataframe in your Jupyter notebook.
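
If the clipboard route is not available (for example on a headless machine), the same dataframe can be built directly. This is only a convenience sketch: the values are copied from the table above, and the variable name `df` is an assumption chosen to match the answer below.

```python
import pandas as pd

# Months in the alphabetical order shown in the question, with their prices.
df = pd.DataFrame({
    'month': ['April', 'August', 'December', 'February', 'January', 'July',
              'June', 'March', 'May', 'November', 'October', 'September'],
    'price': [102.478015, 94.868053, 97.278205, 100.114510, 99.419109,
              93.402928, 96.114224, 101.297762, 102.905340, 97.952169,
              95.606478, 94.226803],
})
```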

【Question comments】:

    Tags: python string pandas sorting rows


    【Solution 1】:

    Convert the values to ordered categoricals, so that DataFrame.sort_values sorts by the category order instead of alphabetically:

    cats = ['January','February','March','April','May','June',
            'July','August','September','October','November','December']
    df['month'] = pd.CategoricalIndex(df['month'], ordered=True, categories=cats)
    # alternative:
    # df['month'] = pd.Categorical(df['month'], ordered=True, categories=cats)
    df = df.sort_values('month')
    print(df)
            month       price
    4     January   99.419109
    3    February  100.114510
    7       March  101.297762
    0       April  102.478015
    8         May  102.905340
    6        June   96.114224
    5        July   93.402928
    1      August   94.868053
    11  September   94.226803
    10    October   95.606478
    9    November   97.952169
    2    December   97.278205
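
To see why this works, it may help to compare a plain string sort with a sort on an ordered categorical. The sketch below uses a hypothetical three-value Series `s` (not part of the original answer): each value is stored as an integer code giving its position in `cats`, and sorting follows those codes.

```python
import pandas as pd

cats = ['January', 'February', 'March', 'April', 'May', 'June',
        'July', 'August', 'September', 'October', 'November', 'December']

# Hypothetical sample values, used only for illustration.
s = pd.Series(['April', 'August', 'January'])

# Plain strings sort alphabetically.
alphabetical = s.sort_values().tolist()

# An ordered categorical sorts by each value's position in `cats`.
as_cat = pd.Series(pd.Categorical(s, categories=cats, ordered=True))
calendar_order = as_cat.sort_values().tolist()

# The integer codes are the positions in `cats` (January is 0).
codes = as_cat.cat.codes.tolist()
```

Note that any value not listed in `categories` becomes NaN after the conversion, so the month names need to match `cats` exactly (including capitalization).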
    

    【Comments】:

    • The cats variable can be generated with calendar.month_name
    • Why pd.CategoricalIndex rather than pd.Categorical, @jez?
    • @yatu - Because pd.Categorical fails in some older pandas versions; only pd.CategoricalIndex works there
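
The first comment's suggestion could look roughly like the sketch below. It assumes an English (or default C) locale, since `calendar.month_name` returns locale-dependent names, and it uses a hypothetical three-row dataframe with prices taken from the question. Index 0 of `calendar.month_name` is an empty string, so it is skipped.

```python
import calendar
import pandas as pd

# calendar.month_name[0] is '', so keep entries 1..12 only.
cats = list(calendar.month_name)[1:]

# Hypothetical subset of the question's data, for illustration.
df = pd.DataFrame({
    'month': ['March', 'January', 'February'],
    'price': [101.297762, 99.419109, 100.114510],
})

# Ordered categorical, as in the answer's commented-out alternative.
df['month'] = pd.Categorical(df['month'], categories=cats, ordered=True)
df_sorted = df.sort_values('month')
```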