Adding timestamps to a pandas Series: answers

【Question title】: Adding timestamps to pandas series
【Posted】: 2021-10-16 10:29:22
【Question description】:

I'm trying to write code that counts the number of records per hour in a .csv file. For example:

    import pandas as pd

    data = pd.read_csv('2021-07-30.csv', parse_dates=['time'], infer_datetime_format=True)
    # .copy() avoids a SettingWithCopyWarning on the assignment below
    datafiltr = data[data.lane == "Lane 4 Op2"].copy()
    datafiltr['time'] = pd.to_datetime(datafiltr['time'])
    df = datafiltr['time'].groupby(datafiltr.time.dt.to_period("H")).agg('count')

This prints:

2021-08-13 13:00    18
2021-08-13 14:00    10
2021-08-13 15:00     2
2021-08-13 16:00     1
2021-08-13 17:00     2
2021-08-13 18:00     4

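It works fine, but I need the data stored over a 12-hour span, including the hours that have no records. Something like this would be ideal: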

2021-08-13 13:00    18
2021-08-13 14:00    10
2021-08-13 15:00     2
2021-08-13 16:00     1
2021-08-13 17:00     2
2021-08-13 18:00     4
2021-08-13 19:00     0
2021-08-13 20:00     0
...

But I don't know how to solve this. Please send help.

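【Question discussion】:

Tags: python pandas csv matplotlib

【Solution 1】:

Have you tried resampling? Note that there must be at least one sample at the latest time you want the range to extend to.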

df.set_index('time').resample('H').agg('count')

Output:

                     1  2
0
2021-08-13 13:00:00  1  1
2021-08-13 14:00:00  1  1
2021-08-13 15:00:00  1  1
2021-08-13 16:00:00  1  1
2021-08-13 17:00:00  1  1
2021-08-13 18:00:00  1  1
2021-08-13 19:00:00  0  0
2021-08-13 20:00:00  0  0
2021-08-13 21:00:00  0  0
2021-08-13 22:00:00  0  0
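
As the answer notes, resample only covers the span between the first and last timestamps actually present. One possible way to get a fixed 12-hour window is to reindex the hourly counts onto an explicit range. The sketch below is only an illustration: the sample rows and the names `counts`, `window` and `counts_12h` are made up, and it uses the lowercase `"h"` alias, which should behave like the `'H'` used in this thread (newer pandas releases deprecate the uppercase spelling).

```python
import pandas as pd

# Hypothetical rows standing in for the filtered CSV data.
df = pd.DataFrame({
    "time": pd.to_datetime([
        "2021-08-13 13:05", "2021-08-13 13:40",
        "2021-08-13 14:10", "2021-08-13 16:30",
    ]),
    "lane": ["Lane 4 Op2"] * 4,
})

# Hourly counts; resample spans only the first..last timestamp,
# so empty hours in between (15:00 here) get a count of 0.
counts = df.set_index("time")["lane"].resample("h").count()

# Reindex onto an explicit 12-hour window so that trailing
# empty hours also show up with a count of 0.
window = pd.date_range(counts.index[0], periods=12, freq="h")
counts_12h = counts.reindex(window, fill_value=0)

print(counts_12h)
```

With this approach the window start is taken from the first counted hour; a fixed start time could be passed to `pd.date_range` instead if the 12-hour span should begin elsewhere.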

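【Discussion】:

I keep getting this error: AttributeError: 'Series' object has no attribute 'set_index'. When I convert it to a pandas DataFrame, I get: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Int64Index'
If you have a Series, I guess you don't need set_index. Could you show a sample of your Series rather than the output?
Do you mean this?: time 2021-08-13 13:00 18 Freq: H, Name: time, dtype: int64
I only need the series values and the series index, for at least 2 samples, so it is easy to apply. @Jukel, have you tried series.resample('H').agg('count'), assuming your index is a timestamp? :-|
I don't know if I understood correctly, but applying series.resample('H').agg('count') prints: [1 1 1 1 1 1] PeriodIndex(['2021-08-13 13:00', '2021-08-13 14:00', '2021-08-13 15:00', '2021-08-13 16:00', '2021-08-13 17:00', '2021-08-13 18:00'], dtype='period[H]', name='time')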
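
Since the Series in the question already has an hourly PeriodIndex of counts, calling resample and count on it again just counts one row per hour, which would explain the run of 1s above. A possible alternative, sketched here with made-up timestamps and hypothetical names (`times`, `full`, `counts_12h`), is to keep the original groupby and reindex the result onto a 12-hour `pd.period_range`, filling the missing hours with 0. It needs neither set_index nor resample.

```python
import pandas as pd

# Hypothetical timestamps standing in for datafiltr['time'].
times = pd.Series(pd.to_datetime([
    "2021-08-13 13:05", "2021-08-13 13:40",
    "2021-08-13 14:10", "2021-08-13 18:20",
]), name="time")

# Same grouping as in the question: hourly periods, then count.
# ("h" is the hourly alias; the thread spells it 'H'.)
counts = times.groupby(times.dt.to_period("h")).count()

# Build the full 12-hour PeriodIndex starting at the first counted hour
# and fill the hours that have no records with 0.
full = pd.period_range(start=counts.index[0], periods=12, freq="h")
counts_12h = counts.reindex(full, fill_value=0)

print(counts_12h)
```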