【Question title】: I'm trying binning with the hours column but it's not working [duplicate]
【Posted】: 2020-12-01 06:59:42
【Question】:

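I have a df with a series of points, each recorded at a time of day, and I want to group them into buckets, one for each hour of the day (from 00:00:00 to 24:00:00).

This is part of the df, which I call dfH: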

     Hora de início Rodada
00:00:00     636
00:00:07    1184
00:00:09     680
00:00:23     651
00:00:30     539
00:01:16    1076
00:01:44     925
00:02:00     229
00:02:48     452
00:03:06    1143
00:03:55     401
00:04:10    1148
00:04:20     677
00:04:26     552
00:05:10    1182
00:05:44     677
00:06:03     657
00:06:23    1172
00:06:34     428
00:06:59     662
00:07:05    1131
00:07:30     675
00:07:53    1175
00:08:06    1121
00:08:33     564
00:08:43     673
00:08:45     670
00:09:06    1014
00:09:17     449
00:09:19    1156
Name: (TOTAL ESTRELAS, TOTAL), dtype: int64

I am trying:

bins = np.arange(0, 24, 1)

groups = dfH.groupby(pd.cut(dfH,bins)).sum()

Then I get:

(TOTAL ESTRELAS, TOTAL)
(0, 1]      0
(1, 2]      0
(2, 3]      0
(3, 4]      0
(4, 5]      0
(5, 6]      0
(6, 7]      0
(7, 8]      0
(8, 9]      0
(9, 10]     0
(10, 11]    0
(11, 12]    0
(12, 13]    0
(13, 14]    0
(14, 15]    0
(15, 16]    0
(16, 17]    0
(17, 18]    0
(18, 19]    0
(19, 20]    0
(20, 21]    0
(21, 22]    0
(22, 23]    0
Name: (TOTAL ESTRELAS, TOTAL), dtype: int64

Maybe the index is not in an hour format, so I tried:

dfH.index = pd.to_datetime(dfH.index, format = '%H:%M:%S').dtype.hour

Then I got the error:

ValueError: time data 'TOTAL' does not match format '%H:%M:%S' (match)

【Discussion】:

Tags: python pandas bucket


【Solution 1】:

Try:

dfH.resample("1h").sum()

This works if your index is a datetime (that is, a DatetimeIndex rather than plain strings).
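For reference, here is a minimal sketch of how the index could be made a datetime before resampling. The sample values and the stray "TOTAL" label are assumptions, inferred from the error message in the question; the real dfH may be laid out differently. Note that `pd.cut(dfH, bins)` bins the values (636, 1184, ...) rather than the times. They all fall outside 0 to 23, which is why every bucket summed to 0.

```python
import pandas as pd

# Hypothetical stand-in for dfH: a Series indexed by "HH:MM:SS" strings,
# plus a non-time "TOTAL" label like the one named in the ValueError.
dfH = pd.Series(
    [636, 1184, 925, 229, 500, 3474],
    index=["00:00:00", "00:00:07", "00:01:44", "01:02:00", "23:59:59", "TOTAL"],
)

# Parse the labels; anything that is not a time becomes NaT instead of raising.
parsed = pd.to_datetime(dfH.index, format="%H:%M:%S", errors="coerce")

# Keep only the rows whose label parsed as a time, and use the parsed index.
valid = ~parsed.isna()
timed = dfH[valid]
timed.index = parsed[valid]

# Sum per one-hour bucket (empty hours in between show up as 0).
hourly = timed.resample("1h").sum()

# Alternative: group by hour of day, which keeps only hours that have data.
by_hour = timed.groupby(timed.index.hour).sum()

print(hourly)
print(by_hour)
```

Because the strings carry no date, `pd.to_datetime` places them all on a default date (1900-01-01), which is harmless here since only the hour matters.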
