【Posted】: 2016-10-17 07:28:35
【Question】:
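I have the following data, recorded once per second: http://pastebin.com/wBSJWYn2
I want to capture various summary statistics (mean, variance, etc.) at 1-minute intervals, so I run these functions on sensor_data.rolling(window=1,freq="1MIN"). For the most part this works well, but for certain kinds of functions I can't get past two types of irregularity. Specifically, either:
- No output for incomplete minutes -- for minutes that don't contain all 60 seconds, it gives no output. This is the case for mean(), quantile(), sum().
- No output at all -- for some functions, such as var(), std(), kurt(), skew(), I don't get any values at all. I really don't understand why that would be, given that it is able to compute the mean...
Other functions seem to have no problems: max(), median(), min()
I mainly care about the second problem, but a workaround for the first would be a bonus too...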
sensor_data.head()
x_acceleration y_acceleration z_acceleration heart_rate electrodermal_activity temperature
index
2016-05-16 06:58:44 -33.25000 -43.03125 33.09375 NaN 0.297099 33.33
2016-05-16 06:58:45 -28.15625 -52.90625 24.12500 NaN 0.219612 33.33
2016-05-16 06:58:46 -25.87500 -55.96875 21.18750 NaN 0.222648 33.33
2016-05-16 06:58:47 -24.00000 -57.46875 19.40625 NaN 0.217335 33.33
2016-05-16 06:58:48 -22.84375 -56.25000 23.40625 NaN 0.214300 33.33
Example output for the first case -- no output for incomplete minutes:
sensor_data.rolling(window=1,freq="1MIN").mean().head()
x_acceleration y_acceleration z_acceleration heart_rate electrodermal_activity temperature
index
2016-05-16 06:58:00 NaN NaN NaN NaN NaN NaN
2016-05-16 06:59:00 -24.84375 -59.46875 9.03125 68.57 0.208988 33.75
2016-05-16 07:00:00 6.31250 -62.78125 6.46875 79.40 0.224924 33.84
2016-05-16 07:01:00 -21.18750 -57.00000 22.50000 92.00 0.224165 34.13
2016-05-16 07:02:00 -17.46875 -58.87500 21.84375 81.10 0.224165 34.25
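For the incomplete-minute case, one possible workaround is to group the rows by calendar minute with resample() instead of rolling(). The sketch below is only an illustration under assumptions: `demo` is a hypothetical stand-in for sensor_data with made-up values (only the column names come from the post), and whether this matches the intended statistics is not confirmed by the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for sensor_data: one row per second, starting at
# 06:58:44 so that the first and last minutes are incomplete.
start = pd.Timestamp("2016-05-16 06:58:44")
index = start + pd.to_timedelta(np.arange(200), unit="s")
demo = pd.DataFrame(
    {
        "x_acceleration": np.arange(200, dtype=float),   # made-up values
        "temperature": np.linspace(33.0, 34.0, 200),     # made-up values
    },
    index=index,
)

# Group rows into calendar minutes; each aggregate uses whatever rows
# fall inside that minute, whether or not all 60 seconds are present.
per_minute = demo.resample("1min")
minute_count = per_minute.count()  # how many seconds each minute contains
minute_mean = per_minute.mean()
minute_var = per_minute.var()      # sample variance (ddof=1) per minute

print(minute_count["x_acceleration"].tolist())
print(minute_mean.head())
print(minute_var.head())
```

The count column makes it easy to see which minutes are incomplete (here the first one holds only 16 rows), so they can be kept or filtered out explicitly.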
Example output for the second case -- no output at all:
sensor_data.rolling(window=1,freq="1MIN").var().head()
x_acceleration y_acceleration z_acceleration heart_rate electrodermal_activity temperature
index
2016-05-16 06:58:00 NaN NaN NaN NaN NaN NaN
2016-05-16 06:59:00 NaN NaN NaN NaN NaN NaN
2016-05-16 07:00:00 NaN NaN NaN NaN NaN NaN
2016-05-16 07:01:00 NaN NaN NaN NaN NaN NaN
2016-05-16 07:02:00 NaN NaN NaN NaN NaN NaN
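One possible explanation for the all-NaN result, offered as an assumption rather than a confirmed diagnosis: if freq="1MIN" first conforms the data to one row per minute, then window=1 means every window holds a single observation. The sample variance (ddof=1) of a single observation is undefined, while mean, max and min of a single observation are just that observation. The sketch below reproduces this on a plain Series; the four values are the x_acceleration entries from the mean() output above.

```python
import numpy as np
import pandas as pd

# x_acceleration values copied from the mean() output in the post,
# treated here as "one observation per minute" (an assumption).
values = pd.Series([-24.84375, 6.31250, -21.18750, -17.46875])

rolled = values.rolling(window=1)
window_mean = rolled.mean()  # one value per window: same as the input
window_max = rolled.max()    # likewise the input itself
window_var = rolled.var()    # ddof=1 with one observation: NaN everywhere

print(pd.DataFrame({"mean": window_mean, "max": window_max, "var": window_var}))
```

If that reading is right, var(), std(), skew() and kurt() would all need more than one observation per window to return a value.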
【Comments】:
Tags: python pandas dataframe time-series nan