【Posted】:2019-10-01 13:55:31
【Question】:
My dataset looks like this:
Month DayOfWeek Class A1 A2 ... A999
July Monday Bata 7 9 ... 5
July Tuesay Bata 3 1 ... 2
July Sunday Bata 4 5 ... 6
July Monday Adid 9 8 ... 5
July Sunday Adid 4 0 ... 4
Sept Monday Nike 7 5 ... 7
Sept Sunday Nike 8 3 ... 7
Sept Satday Adid 2 7 ... 7
Sept Monday Bata 8 9 ... 4
Oct Monday Nike 4 2 ... 5
Oct Sunday Bata 8 6 ... 3
July Monday Nike NaN NaN NaN
Sept Sunday Nike NaN NaN NaN
Oct Satday Nike NaN NaN NaN
Sept Monday Bata NaN NaN NaN
I want to fill the NaNs with the mean of the earlier records.
I know I can use
df['A1'] = df['A1'].fillna((df['A1'].mean()))
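But that is a poor approach, because I have more than 1000 columns, and the number may grow later.
On top of that, I want the mean to be computed per Month and DayOfWeek.
For example, for this record: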
July Monday Nike NaN NaN NaN
the mean should be taken only over the records with Month = July and DayOfWeek = Monday.
How can I do this?
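One possible way to express the requirement above is a grouped transform. The following is only a minimal sketch on a tiny made-up frame (two value columns instead of A1 ... A999; the sample values and the name `value_cols` are assumptions for illustration). It computes the per-(Month, DayOfWeek) mean for every value column at once and uses it to fill the NaNs, so no column has to be named by hand.

```python
import numpy as np
import pandas as pd

# Tiny illustrative frame; the real data has A1 ... A999.
df = pd.DataFrame({
    "Month": ["July", "July", "July", "Sept", "July", "Sept"],
    "DayOfWeek": ["Monday", "Monday", "Sunday", "Monday", "Monday", "Monday"],
    "Class": ["Bata", "Adid", "Bata", "Nike", "Nike", "Bata"],
    "A1": [7.0, 9.0, 4.0, 7.0, np.nan, np.nan],
    "A2": [9.0, 8.0, 5.0, 5.0, np.nan, np.nan],
})

key_cols = ["Month", "DayOfWeek"]
# Every column that is not a key or the label is treated as a value column,
# so the code keeps working when more A-columns are added later.
value_cols = [c for c in df.columns if c not in key_cols + ["Class"]]

# transform("mean") returns a frame shaped like df[value_cols], where each
# row holds the mean of its own (Month, DayOfWeek) group; NaNs are skipped.
group_means = df.groupby(key_cols)[value_cols].transform("mean")

# fillna with a DataFrame fills each NaN from the matching row and column.
df[value_cols] = df[value_cols].fillna(group_means)

print(df)
```

If a group has no non-NaN record at all (in the sample data, Oct / Satday appears only in the NaN row), its mean is itself NaN and those cells stay empty; a second `fillna` with the overall column mean would be one possible fallback.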
【Comments】:
-
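Do you know about multi-level indexes (pandas.pydata.org/pandas-docs/stable/user_guide/advanced.html)? I used them on a similar problem, and they helped me a lot with drilling down to compute KPIs similar to what you are looking for...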
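As a rough illustration of the multi-level index idea in this comment (not necessarily what the commenter had in mind), the sketch below moves Month and DayOfWeek into a MultiIndex, builds a table of per-group means that can be drilled into with `.loc`, and then aligns that table back to the rows to fill the NaNs. The sample values and the names `indexed`, `means` and `filled` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Tiny illustrative frame; the real data has A1 ... A999.
df = pd.DataFrame({
    "Month": ["July", "July", "July", "Sept", "July", "Sept"],
    "DayOfWeek": ["Monday", "Monday", "Sunday", "Monday", "Monday", "Monday"],
    "Class": ["Bata", "Adid", "Bata", "Nike", "Nike", "Bata"],
    "A1": [7.0, 9.0, 4.0, 7.0, np.nan, np.nan],
    "A2": [9.0, 8.0, 5.0, 5.0, np.nan, np.nan],
})

# Month and DayOfWeek become the two levels of a MultiIndex.
indexed = df.set_index(["Month", "DayOfWeek"])
value_cols = [c for c in indexed.columns if c != "Class"]

# One row of means per (Month, DayOfWeek) combination.
means = indexed.groupby(level=["Month", "DayOfWeek"])[value_cols].mean()

# Drill down to a single combination.
july_monday = means.loc[("July", "Monday")]
print(july_monday)

# Repeat each group's mean for every row of that group, then keep the
# original values where present and take the group mean where they are NaN.
aligned = means.reindex(indexed.index)
filled = indexed.copy()
filled[value_cols] = indexed[value_cols].where(
    indexed[value_cols].notna(), aligned.to_numpy()
)
print(filled)
```

The `means` table is the reusable part here: the same lookup works for any other (Month, DayOfWeek) pair without recomputing anything.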