【Posted】: 2026-01-22 12:35:01
【Question】:
I have a DataFrame that looks like this:
import numpy as np
import pandas as pd
df = pd.DataFrame({'col1': range(9), 'col2': list(range(6)) + [np.nan] * 3},
                  index=pd.date_range('1/1/2000', periods=9, freq='min'))
df
Out[63]:
                     col1  col2
2000-01-01 00:00:00     0   0.0
2000-01-01 00:01:00     1   1.0
2000-01-01 00:02:00     2   2.0
2000-01-01 00:03:00     3   3.0
2000-01-01 00:04:00     4   4.0
2000-01-01 00:05:00     5   5.0
2000-01-01 00:06:00     6   NaN
2000-01-01 00:07:00     7   NaN
2000-01-01 00:08:00     8   NaN
When I resample it with the last() method,
df.resample('3min', label='right', closed='right').last()
Out[60]:
                     col1  col2
2000-01-01 00:00:00     0   0.0
2000-01-01 00:03:00     3   3.0
2000-01-01 00:06:00     6   5.0
2000-01-01 00:09:00     8   NaN
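As shown above, the row at minute 6 (00:06) has data in col1, so after resampling col1 is taken from that row, but col2 is taken from the row at minute 5 (00:05). Is there a way to make sure both values in the resampled row come from the minute-6 row? In other words, if col1 has data in the last row of a bin, the resample should not fill the NaN in col2 with the last valid value but should leave it as NaN. The desired result: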
Out[60]:
                     col1  col2
2000-01-01 00:00:00     0   0.0
2000-01-01 00:03:00     3   3.0
2000-01-01 00:06:00     6   NaN   <--- if at least one column has data, the whole row is used in the resample
2000-01-01 00:09:00     8   NaN
【Discussion】:
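The behaviour in the question comes from last() returning, for each column separately, the last non-null value in the bin, so col1 and col2 can end up coming from different rows. One possible workaround, offered as a sketch rather than the definitive answer, is to resample row positions instead of values and then pull whole rows by position. The helper name last_row_per_bin is hypothetical (it is not a pandas function), and the sketch assumes that "has data" means "at least one column is non-NaN", as described in the question.

```python
import numpy as np
import pandas as pd


def last_row_per_bin(frame, freq):
    """Return the last row of each right-closed bin without skipping NaN.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Keep only rows where at least one column has data, in time order.
    rows = frame.dropna(how="all").sort_index()
    # Number the surviving rows 0..n-1 so each bin can report a row position.
    pos = pd.Series(np.arange(len(rows)), index=rows.index)
    # Largest position in a bin = last row in that bin (NaN for empty bins).
    last_pos = pos.resample(freq, label="right", closed="right").max()
    filled = last_pos.dropna().astype(int)
    # Pull those whole rows and relabel them with the bin labels.
    picked = rows.iloc[filled.to_numpy()].set_axis(filled.index, axis=0)
    # Restore empty bins as all-NaN rows, as resample itself would.
    return picked.reindex(last_pos.index)


df = pd.DataFrame(
    {"col1": range(9), "col2": list(range(6)) + [np.nan] * 3},
    index=pd.date_range("2000-01-01", periods=9, freq="min"),
)
result = last_row_per_bin(df, "3min")
print(result)
```

For the example frame this gives col1 = 6 and col2 = NaN at 00:06, which matches the desired output in the question. Because the bin labels come from resample itself, the label and closed settings behave the same way as in the original call. Note that if a bin is empty, the reindex step introduces NaN and integer columns are upcast to float.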