【Posted】: 2018-02-07 18:15:12
【Question】:
I have a pandas DataFrame in the following format:
   datetime             name                        mtd  code
0  2017-09-07 00:00:08  profile/log                 GET  300
1  2017-09-07 00:00:17  profile/log                 PUT  300
3  2017-09-07 00:00:19  unknown                     PUT  200
4  2017-09-07 00:00:21  extras/dashboard            GET  300
5  2017-09-07 00:00:23  extras/stats                GET  300
6  2017-09-07 00:00:26  extras/dashboard            GET  300
7  2017-09-07 00:00:29  extras/authz-profile/check  GET  200
8  2017-09-07 00:00:34  about                       PUT  300
9  2017-09-07 00:00:36  extras/fav                  GET  304
2  2017-09-07 00:00:44  extras/store                GET  200
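What I want to do is:
- count the occurrences of each name-mtd pair whose response code starts with 3, for every 5-second interval from 2017-09-07 00:00:10 to 2017-09-07 00:00:40
The desired output is: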
datetime_start       pair                    3??_count
2017-09-07 00:00:10  profile/log - GET       2
2017-09-07 00:00:15  -                       0
2017-09-07 00:00:20  extras/dashboard - GET  1
2017-09-07 00:00:20  extras/stats - GET      1
2017-09-07 00:00:25  extras/dashboard - GET  1
2017-09-07 00:00:30  about - PUT             1
2017-09-07 00:00:35  extras/fav - GET        1
2017-09-07 00:00:40  -                       0
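How can I do this with pandas?
I have already written a piece of code that creates the time periods shown in the desired output table, but I don't know how to count the 3?? name-mtd pairs for each 5-second period. I would greatly appreciate any help!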
data['datetime_start'] = pd.date_range(start="2017-09-07 00:00:10", end="2017-09-07 00:00:40", freq="5S")
【Comments】:
- Tricky one! Sounds like an exam question or something. Could you share the groupby code you have already tried, so that someone can build on it?
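One possible way to approach the question, as a minimal sketch rather than a definitive answer: keep only the rows whose code starts with 3 and whose timestamp falls in the requested window, floor each timestamp to its 5-second bin, count rows per bin and name-mtd pair, then left-join onto the full list of bins so that empty bins show up with a count of 0. The sample rows are retyped from the question; the variable names (`hits`, `bins`, `counts`, `result`) and the choice to treat the last bin as covering 00:00:40 up to but not including 00:00:45 are assumptions made for illustration.

```python
import pandas as pd

# Sample rows retyped from the question.
data = pd.DataFrame(
    {
        "datetime": pd.to_datetime(
            [
                "2017-09-07 00:00:08", "2017-09-07 00:00:17",
                "2017-09-07 00:00:19", "2017-09-07 00:00:21",
                "2017-09-07 00:00:23", "2017-09-07 00:00:26",
                "2017-09-07 00:00:29", "2017-09-07 00:00:34",
                "2017-09-07 00:00:36", "2017-09-07 00:00:44",
            ]
        ),
        "name": [
            "profile/log", "profile/log", "unknown", "extras/dashboard",
            "extras/stats", "extras/dashboard", "extras/authz-profile/check",
            "about", "extras/fav", "extras/store",
        ],
        "mtd": ["GET", "PUT", "PUT", "GET", "GET", "GET", "GET", "PUT", "GET", "GET"],
        "code": [300, 300, 200, 300, 300, 300, 200, 300, 304, 200],
    }
)

start = pd.Timestamp("2017-09-07 00:00:10")
end = pd.Timestamp("2017-09-07 00:00:40")  # start of the last bin
step = pd.Timedelta(seconds=5)

# Keep 3xx responses inside [start, end + step); the upper bound is an assumption.
is_3xx = data["code"].astype(str).str.startswith("3")
in_window = (data["datetime"] >= start) & (data["datetime"] < end + step)
hits = data.loc[is_3xx & in_window].copy()

# Floor each timestamp to the start of its 5-second bin
# (this lines up with `start` because 00:00:10 is a multiple of 5 seconds).
hits["datetime_start"] = hits["datetime"].dt.floor("5s")
hits["pair"] = hits["name"] + " - " + hits["mtd"]

# Count rows per (bin, pair).
counts = (
    hits.groupby(["datetime_start", "pair"])
    .size()
    .reset_index(name="3??_count")
)

# Left-join onto every bin so that bins without matches are kept.
bins = pd.DataFrame({"datetime_start": pd.date_range(start, end, freq="5s")})
result = bins.merge(counts, on="datetime_start", how="left")
result["pair"] = result["pair"].fillna("-")
result["3??_count"] = result["3??_count"].fillna(0).astype(int)

print(result)
```

Because the counts come from the sample rows as typed, they will not necessarily match the hand-written table in the question: the 00:00:08 row falls before the window, and the 00:00:17 row is a PUT that lands in the 00:00:15 bin, so the 00:00:10 bin comes out with a count of 0.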
Tags: python pandas pandas-groupby