【Question Title】: How to work out percentage of total with groupby for specific columns in a pandas dataframe?
【Posted】: 2020-12-11 08:59:25
【Question】:

I have the following dataframe:

import pandas as pd

df = pd.DataFrame(columns=['Name', 'Status', 'Profit', 'Promotion', 'Product', 'Visits'])
df['Name'] = ['Andy', 'Andy', 'Brad', 'Brad', 'Cynthia', 'Cynthia']
df['Status'] = ['Old', 'New', 'Old', 'New', 'Old', 'New']
df['Profit'] = [140, 60, 110, 90, 20, 100]
df['Promotion'] = [25, 30, 40, 10, 22, 36]
df['Product'] = [8, 6, 18, 10, 7, 12]
df['Visits'] = [11, 4, 7, 3, 12, 5]
df['Month'] = 'Jan'

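For the 'Profit', 'Promotion', and 'Product' columns, I want to compute each row's percentage of the total within its 'Name' group, to get the following dataframe: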

df['Profit'] = [70,30,55,45,17,83]
df['Promotion'] = [45,55,80,20,38,62]
df['Product'] = [57,43,64,36,37,63]
df

I tried grouping by 'Name', 'Status', and 'Month' and doing something similar to the solution given in "Pandas percentage of total with groupby", but I can't seem to get the output I want.

【Comments】:

    Tags: python pandas percentage


    【Solution 1】:

    Use GroupBy.transform to get the sum per Name, divide the original columns by those sums, multiply by 100, and finally round:

    cols = ['Profit','Promotion','Product']
    
    print (df.groupby('Name')[cols].transform('sum'))
       Profit  Promotion  Product
    0     200         55       14
    1     200         55       14
    2     200         50       28
    3     200         50       28
    4     120         58       19
    5     120         58       19
    
    df[cols] = df[cols].div(df.groupby('Name')[cols].transform('sum')).mul(100).round()
    print (df)
          Name Status  Profit  Promotion  Product  Visits Month
    0     Andy    Old    70.0       45.0     57.0      11   Jan
    1     Andy    New    30.0       55.0     43.0       4   Jan
    2     Brad    Old    55.0       80.0     64.0       7   Jan
    3     Brad    New    45.0       20.0     36.0       3   Jan
    4  Cynthia    Old    17.0       38.0     37.0      12   Jan
    5  Cynthia    New    83.0       62.0     63.0       5   Jan
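
The result above holds floats (70.0), while the expected output in the question shows whole numbers. A minimal self-contained sketch of the same approach with a final integer cast follows. The helper name `percent_of_group_total` and the `result` copy are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    'Name': ['Andy', 'Andy', 'Brad', 'Brad', 'Cynthia', 'Cynthia'],
    'Status': ['Old', 'New', 'Old', 'New', 'Old', 'New'],
    'Profit': [140, 60, 110, 90, 20, 100],
    'Promotion': [25, 30, 40, 10, 22, 36],
    'Product': [8, 6, 18, 10, 7, 12],
    'Visits': [11, 4, 7, 3, 12, 5],
})
df['Month'] = 'Jan'

cols = ['Profit', 'Promotion', 'Product']


def percent_of_group_total(frame, group_col, value_cols):
    """Hypothetical helper: value_cols as whole-number percentages of each group's total."""
    # transform('sum') returns the group total aligned to every original row.
    totals = frame.groupby(group_col)[value_cols].transform('sum')
    # Divide row values by their group total, scale to percent, round, cast to int.
    return frame[value_cols].div(totals).mul(100).round().astype(int)


# Work on a copy so the original values stay available.
result = df.copy()
result[cols] = percent_of_group_total(df, 'Name', cols)
print(result)
```

Because of rounding, the percentages within a group are not guaranteed to add up to exactly 100. The integer cast also assumes that no group total is zero; a zero total would produce NaN or infinite values that cannot be cast to int.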
    

    【Comments】:
