【Question title】: Aggregate and group data, then sort by a column
【Posted】: 2021-04-06 05:19:08
【Question】:

Given the following dataset:

import pandas as pd

data = pd.DataFrame({
    'AuthorName': ["Wendelaar Bonga", " Sjoerd E.", "Grätzel", " Michael", "Willett", "Walter C.",
                   "Kessler", "Ronald C.", "Witten, Edward", "Wang, Zhong Lin"],
    'seniorityLevel': [10, 45, 13, 89, 3, 8, 19, 22, 10, 59],
    'SubjectField': ["Biomedical Engineering", "Inorganic & Nuclear Chemistry",
                     "Organic Chemistry", "Biomedical Engineering", "Developmental Biology",
                     "Mechanical Engineering & Transports", "Biomedical Engineering", "Microbiology",
                     "Cardiovascular System & Hematology", "Biomedical Engineering"],
    'NumberOfPapers': [109, 284, 34, 109, 78, 90, 109, 54, 32, 109],
})

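I need to compute the minimum, mean, median, and maximum of the seniority level and of the number of papers for each subject field. Then, with the result sorted by mean seniority level, I want to show the top 10 and bottom 10 rows. I tried this code: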

d=data.groupby(["SubjectField"]).agg({'seniorityLevel':['min', 'mean', 'median', 'max'],'NumberOfPapers':['min', 'mean', 'median', 'max']})

But I can't sort the resulting table by seniority level.

【Comments】:

    Tags: python pandas pandas-groupby aggregate-functions data-analysis


    【Solution 1】:

    The aggregated columns form a MultiIndex, so refer to the sort column with a tuple of (column, statistic):

    d_sort = d.sort_values(('seniorityLevel', 'mean'))
    
    pd.concat([d_sort.head(2), d_sort.tail(2)])
    

    Output (only the top 2 and bottom 2 rows are shown here):

                                        seniorityLevel                   NumberOfPapers                 
                                                   min   mean median max            min mean median  max
    SubjectField                                                                                        
    Developmental Biology                            3   3.00      3   3             78   78     78   78
    Mechanical Engineering & Transports              8   8.00      8   8             90   90     90   90
    Biomedical Engineering                          10  44.25     39  89            109  109    109  109
    Inorganic & Nuclear Chemistry                   45  45.00     45  45            284  284    284  284
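
For reference, the steps above can be put together into one self-contained sketch. The helper name `top_and_bottom` and its `n` parameter are hypothetical, introduced here only for illustration; the data is the question's, reduced to the three columns the aggregation uses. One point worth noting: the sample has only 7 subject fields, so `head(10)` and `tail(10)` would each return every row and the concatenation would list each field twice. The sketch guards against that overlap.

```python
import pandas as pd

# Data from the question (AuthorName omitted because it is not aggregated).
data = pd.DataFrame({
    "seniorityLevel": [10, 45, 13, 89, 3, 8, 19, 22, 10, 59],
    "SubjectField": ["Biomedical Engineering", "Inorganic & Nuclear Chemistry",
                     "Organic Chemistry", "Biomedical Engineering", "Developmental Biology",
                     "Mechanical Engineering & Transports", "Biomedical Engineering", "Microbiology",
                     "Cardiovascular System & Hematology", "Biomedical Engineering"],
    "NumberOfPapers": [109, 284, 34, 109, 78, 90, 109, 54, 32, 109],
})

stats = ["min", "mean", "median", "max"]
# Passing a dict of lists to agg() produces two-level (MultiIndex) columns.
d = data.groupby("SubjectField").agg({"seniorityLevel": stats, "NumberOfPapers": stats})


def top_and_bottom(frame, by, n=10):
    """Hypothetical helper: sort ascending by `by`, keep the first and last n rows."""
    ordered = frame.sort_values(by)
    if len(ordered) <= 2 * n:
        # head(n) and tail(n) would overlap, so return every row exactly once.
        return ordered
    return pd.concat([ordered.head(n), ordered.tail(n)])


# A column of a MultiIndex header is addressed by a tuple: (top level, statistic).
result = top_and_bottom(d, ("seniorityLevel", "mean"), n=2)
print(result)
```

Calling `top_and_bottom(d, ("seniorityLevel", "mean"), n=10)` on this sample simply returns all 7 fields in sorted order; on a dataset with more than 20 subject fields it would return 20 rows.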
    

    【Comments】:
