【Question Title】: Pandas Dataframe groupby column
【Posted】: 2022-01-12 09:37:49
【Question】:

I have a DataFrame df that I need to group by the Department Name column.

Input

Employee Name  Department Name  Subjects  Billable  Hours  Date
Anu            CS               Java      Yes       8      01-03-2021
Anu            CS               Python    Yes       9      02-03-2021
Anu            CS               SQL       No        6      03-03-2021
Anu            CS               React     Yes       5      03-03-2021
Anu            CS               .Net      No        8      04-03-2021
Bala           CS               SQL       No        5      01-03-2021
Bala           CS               Python    Yes       4      01-03-2021
Bala           CS               Java      Yes       2      02-03-2021
Bala           CS               .Net      No        8      03-03-2021
Bala           CS               React     Yes       7      04-03-2021

Code

import numpy as np
import pandas as pd

df = pd.pivot_table(df, index=['Department Name', 'Employee Name', 'Billable'], columns=['Subjects'], values='Hours', aggfunc={'Hours': np.sum})

# Resetting index
df = df.reset_index()
list_column = df.columns

# Adding new columns and calculation
total = df.sum(axis=1)
df.insert(len(df.columns), column='Total', value=total)

available_col = len(df.columns)
Utilization_col = len(df.columns)
utilization_row = len(df.columns)

# Adding Available column
available = 168
df.insert(len(df.columns), column='Available', value=available)

# Adding Utilization column
utilization = (total / available)
df.insert(len(df.columns), column='Utilization', value=utilization)

# Filter dataframe using groupby
df1 = df.groupby(['Department Name','Employee Name'], sort=False ).sum(min_count=1)
df1['Available'] = available

# Adding Billable Utilization column and Non-billable Utilization column
df['Billable'] = np.where(df['Billable'] == 'Billable', 'Billable Utilization','Non Billable Utilization')

df2 = (df.groupby(['Employee Name', 'Billable Status'])[list_column].sum().sum(axis=1).unstack().div(available).mul(100)).round(2)

df = df1.join(df2).reset_index()
df.index = df.index

# Round the column value
df['Total'] = df['Total'].round(2)

df = df.groupby(['Department Name','Employee Name'], as_index=False).sum(min_count=1)

My output

Expected output

Note

I tried using reset_index, but the groupby function does not work.
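
A condensed sketch of the steps the code above appears to aim for, run on a shortened copy of the sample data. It assumes the `Billable` column holds `Yes`/`No` as in the input table, reuses `available = 168` from the question, and introduces a hypothetical helper column `Kind`. It is one possible arrangement, not necessarily the layout the asker expects.

```python
import pandas as pd

AVAILABLE_HOURS = 168  # same value as `available` in the question

# Shortened copy of the sample input (assumption: Billable is "Yes"/"No").
df = pd.DataFrame({
    "Employee Name": ["Anu", "Anu", "Anu", "Bala", "Bala"],
    "Department Name": ["CS"] * 5,
    "Subjects": ["Java", "Python", "SQL", "SQL", "Python"],
    "Billable": ["Yes", "Yes", "No", "No", "Yes"],
    "Hours": [8, 9, 6, 5, 4],
})

keys = ["Department Name", "Employee Name"]

# Hours per subject, one row per (department, employee).
per_subject = df.pivot_table(index=keys, columns="Subjects",
                             values="Hours", aggfunc="sum")

# Total, available hours and overall utilization per employee.
summary = per_subject.copy()
summary["Total"] = per_subject.sum(axis=1)
summary["Available"] = AVAILABLE_HOURS
summary["Utilization"] = summary["Total"] / AVAILABLE_HOURS

# Billable and non-billable utilization in percent.
# "Kind" is a hypothetical helper column used only for this sketch.
kind = df["Billable"].map({"Yes": "Billable Utilization",
                           "No": "Non Billable Utilization"})
util = (
    df.assign(Kind=kind)
      .groupby(keys + ["Kind"])["Hours"].sum()
      .unstack()
      .div(AVAILABLE_HOURS)
      .mul(100)
      .round(2)
)

# Both frames share the (department, employee) index, so a plain join lines them up.
result = summary.join(util).reset_index()
print(result)
```

Grouping the raw rows directly, rather than the pivoted frame, keeps the `Hours` values numeric and avoids summing the text columns.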

【Comments on the question】:

Tags: python pandas dataframe numpy


【Solution 1】:

I tried the following function and was able to get the output you want.

def func(x):
    # Keep the department name on the first row of the group and blank out the rest.
    x = x.copy()
    x.iloc[1:, x.columns.get_loc('Department Name')] = ''
    return x

df['Department Name'] = df['Department Name'].apply(str)
df = df.groupby('Department Name').apply(func).set_index('Department Name')
df.head()
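
The same effect can probably be reached without a per-group function. The sketch below is an alternative, not the answerer's method: it blanks repeated department names with `Series.duplicated` and `Series.mask`. The `IT` department and the employee `Chitra` are made up for illustration and do not appear in the question.

```python
import pandas as pd

# Illustrative data; "IT" and "Chitra" are hypothetical additions.
df = pd.DataFrame({
    "Department Name": ["CS", "CS", "CS", "IT", "IT"],
    "Employee Name": ["Anu", "Anu", "Bala", "Chitra", "Chitra"],
    "Hours": [8, 9, 5, 7, 3],
})

# Sort so that rows of the same department sit next to each other.
out = df.sort_values("Department Name", kind="stable").reset_index(drop=True)

# duplicated() is False for the first row of each department and True afterwards.
repeats = out["Department Name"].duplicated()

# Replace the repeated names with an empty string, keeping the first one.
out["Department Name"] = out["Department Name"].mask(repeats, "")

print(out.to_string(index=False))
```

Note that this only changes how the column looks. The blanked rows no longer carry their department, so any further grouping should happen before this step.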

Proof

【Comments】:

  • Thanks @kartik_Bhatnagar, I need to merge the Department Name column.
  • Please help me: how do I merge (groupby) the Department column?
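
If "merge" here means showing each department name once, as in a merged spreadsheet cell, one possible approach is to move the department and employee columns into a MultiIndex, which pandas displays sparsely. This is a sketch under that assumption; the totals are the per-employee sums of `Hours` from the sample input.

```python
import pandas as pd

# Per-employee totals computed from the sample input in the question.
df = pd.DataFrame({
    "Department Name": ["CS", "CS"],
    "Employee Name": ["Anu", "Bala"],
    "Total": [36, 26],
})

# With both columns in the index, repeated outer labels are printed only once.
merged = df.set_index(["Department Name", "Employee Name"]).sort_index()

text = merged.to_string(sparsify=True)
print(text)
```

When such a frame is written with `DataFrame.to_excel`, the default `merge_cells=True` writes the repeated outer index labels as merged cells, provided an Excel writer engine such as openpyxl is installed.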