【Posted】:2017-11-08 16:12:46
【Question】:
I have a DataFrame:
import pandas as pd
import numpy as np
# DataFrame.from_items was removed in pandas 1.0; a dict keeps column order on Python 3.7+
df = pd.DataFrame({'STAND_ID': [1, 1, 2, 3, 3, 3],
                   'Species': ['Conifer', 'Broadleaves', 'Conifer', 'Broadleaves', 'Conifer', 'Conifer'],
                   'Height': [20, 19, 13, 24, 25, 18],
                   'Stems': [1500, 2000, 1000, 1200, 1700, 1000],
                   'Volume': [200, 100, 300, 50, 100, 10]})
   STAND_ID      Species  Height  Stems  Volume
0         1      Conifer      20   1500     200
1         1  Broadleaves      19   2000     100
2         2      Conifer      13   1000     300
3         3  Broadleaves      24   1200      50
4         3      Conifer      25   1700     100
5         3      Conifer      18   1000      10
I want to group by STAND_ID and Species, compute a weighted average of Height and Stems using Volume as the weights, and then unstack the result.
So I tried:
newdf=df.groupby(['STAND_ID','Species']).agg({'Height':lambda x: np.average(x['Height'],weights=x['Volume']),
'Stems':lambda x: np.average(x['Stems'],weights=x['Volume'])}).unstack()
This gives me the error:
builtins.KeyError: 'Height'
How can I fix this?
【Discussion】:
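A likely cause: with a dict passed to agg, each lambda receives only its own column as a Series, so x['Height'] is treated as an index label lookup and raises the KeyError, and the Volume column is never visible to it. One way around this, offered as a sketch rather than the definitive fix, is to use groupby(...).apply so the function sees the whole group. The helper name weighted_means below is made up for illustration, and the DataFrame is rebuilt from the question's data so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

# Same data as in the question, rebuilt here so the snippet is self-contained.
df = pd.DataFrame({
    'STAND_ID': [1, 1, 2, 3, 3, 3],
    'Species': ['Conifer', 'Broadleaves', 'Conifer',
                'Broadleaves', 'Conifer', 'Conifer'],
    'Height': [20, 19, 13, 24, 25, 18],
    'Stems': [1500, 2000, 1000, 1200, 1700, 1000],
    'Volume': [200, 100, 300, 50, 100, 10],
})


def weighted_means(group):
    """Hypothetical helper: Volume-weighted means of Height and Stems for one group."""
    return pd.Series({
        'Height': np.average(group['Height'], weights=group['Volume']),
        'Stems': np.average(group['Stems'], weights=group['Volume']),
    })


# Select the value columns explicitly so apply() only sees what the helper needs,
# then move Species from the row index into the columns with unstack().
newdf = (
    df.groupby(['STAND_ID', 'Species'])[['Height', 'Stems', 'Volume']]
      .apply(weighted_means)
      .unstack()
)

print(newdf)
```

The result is indexed by STAND_ID with two-level columns (Height or Stems, then Species). Stand and species combinations that do not occur in the data, such as Broadleaves in stand 2, come out as NaN after unstacking.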