【Question Title】: pandas sorting pivot_table or grouping dataframe?
【Posted】: 2012-12-14 15:55:45
【Question】:

I have a problem. I did this:

In [405]: pippo=ass_t1.pivot_table(['Rotazioni a volume','Distribuzione Ponderata'],rows=['SEGM1','DESC']).sort()

In [406]: pippo
Out[406]: 
                      Distribuzione Ponderata  Rotazioni a volume
SEGM1 DESC                                                       
AD    ACCADINAROLO                     74.040       140249.693409
      ZYMIL AMALAT Z                   90.085       321529.053570
FUN   SPECIALMALAT S                   88.650       120711.182177
NORM  STD INNAROLO                     49.790       162259.216710
      STD P.NAROLO                     52.125      1252174.695695
      STD PLNAROLO                     54.230       213257.829615
      BONTA' MALAT B                   79.280       520454.366419
      DA STD RILGARD                   35.290       554927.497875
      OVANE VT.MANTO                   15.040       466232.639628
      WEIGHT MALAT W                   79.170       118628.572692

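My goal is to have the rows within each "SEGM1" group sorted by "Distribuzione Ponderata". For example, in the "NORM" subset the first row should be "BONTA' MALAT B", which has the highest "Distribuzione Ponderata" value. I was able to get a partial result with the groupby method, but I could not set multiple columns. Can anyone help me?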

【Comments】:

    Tags: python sorting pandas pivot


    【Solution 1】:
    import io
    import pandas as pd
    import numpy as np
    
    text = '''\
    SEGM1\tDESC\tDistribuzione Ponderata\tRotazioni a volume
    AD\tACCADINAROLO\t74.040\t140249.693409
    AD\tZYMIL AMALAT Z\t90.085\t321529.053570
    FUN\tSPECIALMALAT S\t88.650\t120711.182177
    NORM\tSTD INNAROLO\t49.790\t162259.216710
    NORM\tSTD P.NAROLO\t52.125\t1252174.695695
    NORM\tSTD PLNAROLO\t54.230\t213257.829615
    NORM\tBONTA' MALAT B\t79.280\t520454.366419
    NORM\tDA STD RILGARD\t35.290\t554927.497875
    NORM\tOVANE VT.MANTO\t15.040\t466232.639628
    NORM\tWEIGHT MALAT W\t79.170\t118628.572692
    '''
    
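    # io.StringIO is needed on Python 3, where `text` is a str rather than bytes.
    df = pd.read_csv(io.StringIO(text), sep='\t', index_col=[0, 1])
    
    # Primary key: integer codes of the SEGM1 index level
    # (`codes` replaced the older `labels` attribute in pandas 0.24).
    key1 = df.index.codes[0]
    # Secondary key: descending rank, so the largest value sorts first.
    key2 = df['Distribuzione Ponderata'].rank(ascending=False).to_numpy()
    # np.lexsort uses the last key in the tuple as the primary sort key.
    sorter = np.lexsort((key2, key1))
    
    sorted_df = df.take(sorter)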
    print(sorted_df)
    

    yields

                          Distribuzione Ponderata  Rotazioni a volume
    SEGM1 DESC                                                       
    AD    ZYMIL AMALAT Z                   90.085       321529.053570
          ACCADINAROLO                     74.040       140249.693409
    FUN   SPECIALMALAT S                   88.650       120711.182177
    NORM  BONTA' MALAT B                   79.280       520454.366419
          WEIGHT MALAT W                   79.170       118628.572692
          STD PLNAROLO                     54.230       213257.829615
          STD P.NAROLO                     52.125      1252174.695695
          STD INNAROLO                     49.790       162259.216710
          DA STD RILGARD                   35.290       554927.497875
          OVANE VT.MANTO                   15.040       466232.639628
    

    I learned this trick here. The key idea is to use numpy.lexsort.
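
    As a rough illustration of how the lexsort keys combine, here is a minimal sketch on a small made-up frame (the `group`, `item`, and `score` names and values are assumptions for illustration, not the data from the question). It also cross-checks the result against `DataFrame.sort_values`, which in newer pandas versions (0.23 and later, as far as I know) accepts index level names in `by` and may be a simpler route to the same ordering.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (made-up values, not the data from the question).
df = pd.DataFrame(
    {
        "group": ["b", "a", "b", "a", "b"],
        "item": ["p", "q", "r", "s", "t"],
        "score": [1.0, 5.0, 3.0, 7.0, 2.0],
    }
).set_index(["group", "item"])

# np.lexsort treats the LAST key in the tuple as the primary sort key,
# so the group codes go last and the within-group key goes first.
primary = df.index.codes[0]  # integer code per row for the "group" level
secondary = df["score"].rank(ascending=False).to_numpy()  # rank 1 = largest score
order = np.lexsort((secondary, primary))
via_lexsort = df.take(order)

# Same ordering via sort_values, mixing an index level name and a column name.
via_sort_values = df.sort_values(["group", "score"], ascending=[True, False])

print(via_lexsort)
```

    Both approaches should order the rows by group first and by descending score within each group. The lexsort version works positionally on plain arrays, while sort_values keeps everything at the DataFrame level.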

    【Comments】:

    • Thank you very much! Really useful. M