【Posted】: 2021-08-20 17:33:21
【Question】:
I want to count the unique customers within a rolling 3-day window, grouped by city.
Input:
import pandas as pd

df = pd.DataFrame([['1A', 'Cairo', '2020-12-01'],
                   ['2A', 'Cairo', '2020-12-01'],
                   ['1A', 'Cairo', '2020-12-02'],
                   ['1A', 'Cairo', '2020-12-03'],
                   ['3A', 'Alex', '2020-12-01'],
                   ['3A', 'Alex', '2020-12-02'],
                   ['3A', 'Alex', '2020-12-03'],
                   ['4A', 'Giza', '2020-12-02'],
                   ['4A', 'Giza', '2020-12-02'],
                   ['5A', 'Giza', '2020-12-03'],
                   ['6A', 'Giza', '2020-12-01']],
                  columns=['customer_id', 'city', 'day'])
Expected output:
output = pd.DataFrame([['Alex', '2020-12-01', 1],
                       ['Alex', '2020-12-02', 1],
                       ['Alex', '2020-12-03', 1],
                       ['Cairo', '2020-12-01', 2],
                       ['Cairo', '2020-12-02', 2],
                       ['Cairo', '2020-12-03', 2],
                       ['Giza', '2020-12-01', 1],
                       ['Giza', '2020-12-02', 2],
                       ['Giza', '2020-12-03', 3]],
                      columns=['city', 'day', 'unique_customers_last3Days'])
Here is what I tried:
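# Parse the dates and use them as a sorted index so a time-based window can be applied
df['day'] = pd.to_datetime(df['day'])
df.set_index('day', inplace=True)
df.sort_index(inplace=True)
# This is the call that raises the error shown below
df.groupby('city').rolling("3D").agg({'customer_id': 'nunique'})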
But it raises this error:
AttributeError: 'nunique' is not a valid function for 'RollingGroupby' object
【Comments】:
Tags: python-3.x pandas distinct-values
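One possible workaround, offered as a minimal sketch rather than a confirmed answer: since the rolling object rejects `nunique` here, the window can be computed by hand. The code below assumes the window should match what `rolling("3D")` uses by default, that is, the current day plus the two days before it. The names `window`, `rows`, `in_window`, and `result` are illustrative only and do not come from the question.

```python
import pandas as pd

# Same sample data as in the question
df = pd.DataFrame([['1A', 'Cairo', '2020-12-01'],
                   ['2A', 'Cairo', '2020-12-01'],
                   ['1A', 'Cairo', '2020-12-02'],
                   ['1A', 'Cairo', '2020-12-03'],
                   ['3A', 'Alex', '2020-12-01'],
                   ['3A', 'Alex', '2020-12-02'],
                   ['3A', 'Alex', '2020-12-03'],
                   ['4A', 'Giza', '2020-12-02'],
                   ['4A', 'Giza', '2020-12-02'],
                   ['5A', 'Giza', '2020-12-03'],
                   ['6A', 'Giza', '2020-12-01']],
                  columns=['customer_id', 'city', 'day'])
df['day'] = pd.to_datetime(df['day'])

# Assumed window: the half-open interval (day - 3 days, day]
window = pd.Timedelta(days=3)

rows = []
# groupby sorts the city keys, so the cities come out in alphabetical order
for city, group in df.groupby('city'):
    # One output row per distinct day that appears for this city
    for day in group['day'].drop_duplicates().sort_values():
        in_window = (group['day'] > day - window) & (group['day'] <= day)
        rows.append((city, day, group.loc[in_window, 'customer_id'].nunique()))

result = pd.DataFrame(rows, columns=['city', 'day', 'unique_customers_last3Days'])
print(result)
```

On the sample data this gives the same counts as the expected output, with `day` as a datetime column instead of a string. It is a plain loop, so it rescans each city's rows once per distinct day; that may be too slow for large inputs.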