You can first create a pivot_table, then merge it back into df, and replace the Age values wherever a NaN is observed.
Example of df:
PClass  Gender  Age
3       1       22
1       0       38
2       1       27
3       0       NaN
Pivot table:
        Age
Gender    0   1
PClass
1        40  35
2        30  28
3        25  21
import pandas as pd
import numpy as np

# Sample data: the last row has a missing Age
df = pd.DataFrame({'PClass': [3, 1, 2, 3],
                   'Gender': [1, 0, 1, 0],
                   'Age': [22, 38, 27, np.nan]})

# Mean Age per (PClass, Gender). I have used `mean` here, but other aggfuncs are available.
# fill_value=0 writes 0 into cells with no data; omit it to keep those cells as NaN.
df_pivot = pd.pivot_table(df, index=['PClass'], columns=['Gender'], values=['Age'], aggfunc='mean', fill_value=0)

# Flatten the pivot into one row per (PClass, Gender) pair
df_pivot = df_pivot.unstack().reset_index().rename(columns={0: 'Avg_Age_Pivot'})

# Attach each row's group average to df; how='left' keeps every original row
df = pd.merge(df, df_pivot[['PClass', 'Gender', 'Avg_Age_Pivot']], on=['PClass', 'Gender'], how='left')

def replace_na(inp):
    # inp is one row holding [Age, Avg_Age_Pivot]
    inp = inp.values
    if pd.isnull(inp[0]):
        return inp[1]
    return inp[0]

df['Age'] = df[['Age', 'Avg_Age_Pivot']].apply(replace_na, axis=1)
df_pivot output on the original df (without fill_value):
>>> pd.pivot_table(df,index=['PClass'],columns=['Gender'],values=['Age'],aggfunc='mean')
         Age
Gender     0     1
PClass
1       38.0   NaN
2        NaN  27.0
3        NaN  22.0
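The row-wise replace_na step could probably also be written with Series.fillna, which accepts another Series and fills by index alignment. This is only a sketch: the frame name `merged` is hypothetical, and the Avg_Age_Pivot values are taken from the illustrative pivot table near the top, not computed from the four sample rows.

```python
import numpy as np
import pandas as pd

# Hypothetical merged frame: Age plus the per-group average from the pivot.
# The averages come from the illustrative pivot table (assumed values).
merged = pd.DataFrame({
    'PClass': [3, 1, 2, 3],
    'Gender': [1, 0, 1, 0],
    'Age': [22, 38, 27, np.nan],
    'Avg_Age_Pivot': [21.0, 40.0, 28.0, 25.0],
})

# Fill missing Age values from the aligned Avg_Age_Pivot column;
# rows that already have an Age are left unchanged.
merged['Age'] = merged['Age'].fillna(merged['Avg_Age_Pivot'])
print(merged)
```

Only the last row changes here, because it is the only one with a missing Age.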
You can then decide whether to keep or drop the Avg_Age_Pivot column.
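If you decide to drop it, something like the following should do. The frame `filled` is a made-up stand-in for df after the fill step.

```python
import pandas as pd

# Hypothetical frame after the fill step, still carrying the helper column
filled = pd.DataFrame({
    'PClass': [3, 1],
    'Gender': [0, 0],
    'Age': [25.0, 38.0],
    'Avg_Age_Pivot': [25.0, 40.0],
})

# Drop the helper column; errors='ignore' makes the call safe to repeat
filled = filled.drop(columns=['Avg_Age_Pivot'], errors='ignore')
print(filled.columns.tolist())
```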
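I also noticed that, with the small amount of data you provided, the pivot_table itself contains NaN values, so you will not see the expected result with the current df values. For example, the (PClass=3, Gender=0) group has no known Age to average.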
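To see which rows are affected, one option is to compute the group mean per row with groupby(...).transform and check where both Age and the group mean are missing. The sketch below uses the four sample rows. The names `df_small`, `group_mean` and `unfillable` are hypothetical, and the fallback to the overall mean is my own assumption rather than part of the approach above.

```python
import numpy as np
import pandas as pd

# The four sample rows from the example df
df_small = pd.DataFrame({
    'PClass': [3, 1, 2, 3],
    'Gender': [1, 0, 1, 0],
    'Age': [22, 38, 27, np.nan],
})

# Per-row group mean of Age; NaN where the whole group has no known Age
group_mean = df_small.groupby(['PClass', 'Gender'])['Age'].transform('mean')

# Rows whose Age is missing and whose group offers no average either
unfillable = df_small['Age'].isna() & group_mean.isna()
print(df_small[unfillable])

# Assumed fallback (not part of the original answer): use the overall mean Age
overall_mean = df_small['Age'].mean()
df_small['Age'] = df_small['Age'].fillna(group_mean).fillna(overall_mean)
print(df_small)
```

With more data per group, the first fillna would handle most rows and the fallback would rarely be needed.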