【Posted】: 2017-01-30 17:51:40
【Question】:
I have a dataframe:
ID,used_at,active_seconds,subdomain,visiting,category
123,2016-02-05 19:39:21,2,yandex.ru,2,Computers
123,2016-02-05 19:43:01,1,mail.yandex.ru,2,Computers
123,2016-02-05 19:43:13,6,mail.yandex.ru,2,Computers
234,2016-02-05 19:46:09,16,avito.ru,2,Automobiles
234,2016-02-05 19:48:36,21,avito.ru,2,Automobiles
345,2016-02-05 19:48:59,58,avito.ru,2,Automobiles
345,2016-02-05 19:51:21,4,avito.ru,2,Automobiles
345,2016-02-05 19:58:55,4,disk.yandex.ru,2,Computers
345,2016-02-05 19:59:21,2,mail.ru,2,Computers
456,2016-02-05 19:59:27,2,mail.ru,2,Computers
456,2016-02-05 20:02:15,18,avito.ru,2,Automobiles
456,2016-02-05 20:04:55,8,avito.ru,2,Automobiles
456,2016-02-05 20:07:21,24,avito.ru,2,Automobiles
567,2016-02-05 20:09:03,58,avito.ru,2,Automobiles
567,2016-02-05 20:10:01,26,avito.ru,2,Automobiles
567,2016-02-05 20:11:51,30,disk.yandex.ru,2,Computers
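What I need to compute is
group = df.groupby(['category']).agg({'active_seconds': sum}).rename(columns={'active_seconds': 'count_sec_target'}).reset_index()
but I want to add a condition based on
df.groupby(['category'])['ID'].count()
If the count for a category is less than 5, I want to drop that category.
I don't know how to write this condition.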
【Discussion】:
- In your sample data no category would be dropped, but are you after something like
df.groupby('category').filter(lambda x: len(x) >= 5)
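
The filter suggested above can be chained with the aggregation from the question. The following is only a minimal sketch under some assumptions: the sample rows are loaded from an inline string, and the function name `sum_seconds_for_frequent_categories` and its `min_rows` parameter are hypothetical names introduced for illustration. Note that `len(g)` counts rows per category, whereas `['ID'].count()` counts non-null `ID` values; the two agree on this sample because no `ID` is missing.

```python
import io

import pandas as pd

# Sample rows copied from the question.
csv_text = """ID,used_at,active_seconds,subdomain,visiting,category
123,2016-02-05 19:39:21,2,yandex.ru,2,Computers
123,2016-02-05 19:43:01,1,mail.yandex.ru,2,Computers
123,2016-02-05 19:43:13,6,mail.yandex.ru,2,Computers
234,2016-02-05 19:46:09,16,avito.ru,2,Automobiles
234,2016-02-05 19:48:36,21,avito.ru,2,Automobiles
345,2016-02-05 19:48:59,58,avito.ru,2,Automobiles
345,2016-02-05 19:51:21,4,avito.ru,2,Automobiles
345,2016-02-05 19:58:55,4,disk.yandex.ru,2,Computers
345,2016-02-05 19:59:21,2,mail.ru,2,Computers
456,2016-02-05 19:59:27,2,mail.ru,2,Computers
456,2016-02-05 20:02:15,18,avito.ru,2,Automobiles
456,2016-02-05 20:04:55,8,avito.ru,2,Automobiles
456,2016-02-05 20:07:21,24,avito.ru,2,Automobiles
567,2016-02-05 20:09:03,58,avito.ru,2,Automobiles
567,2016-02-05 20:10:01,26,avito.ru,2,Automobiles
567,2016-02-05 20:11:51,30,disk.yandex.ru,2,Computers
"""
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["used_at"])


def sum_seconds_for_frequent_categories(df, min_rows=5):
    """Hypothetical helper: sum active_seconds per category,
    skipping categories that have fewer than min_rows rows."""
    # Drop every row that belongs to a category with too few rows.
    kept = df.groupby("category").filter(lambda g: len(g) >= min_rows)
    # Aggregate what is left, using the column name from the question.
    return (
        kept.groupby("category", as_index=False)["active_seconds"]
        .sum()
        .rename(columns={"active_seconds": "count_sec_target"})
    )


group = sum_seconds_for_frequent_categories(df)
print(group)
```

With the default threshold of 5 both categories survive, since the sample has 9 Automobiles rows and 7 Computers rows. If counting non-null `ID` values matters, one alternative is a boolean mask such as `df[df.groupby('category')['ID'].transform('count') >= 5]` before aggregating.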
Tags: python pandas filter group-by conditional-statements