How to query time-series data in PostgreSQL to find spikes

【Question title】: how to query time-series data in postgresql to find spikes
【Posted】: 2021-08-29 17:23:06
【Question】:

I have a table named cpu_usages, and I am trying to find spikes in CPU usage. The table stores 4 columns:

id serial
at timestamp
cpu_usage float
cpu_core int

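The at column stores a timestamp for every minute of every day. For each timestamp, I want to look at all rows in the following 3 minutes and, if any of them has a cpu_value at least 3% higher than the value at the starting timestamp, return the starting row.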

For example, if I have these rows:

id | at | cpu_values | cpu_core
1 | 2019-01-01-00:00 | 1 | 0
2 | 2019-01-01-00:01 | 1 | 0
3 | 2019-01-01-00:02 | 4 | 0
4 | 2019-01-01-00:03 | 1 | 0
5 | 2019-01-01-00:04 | 1 | 0
6 | 2019-01-01-00:05 | 1 | 0
7 | 2019-01-01-00:06 | 1 | 0
8 | 2019-01-01-00:07 | 1 | 0
9 | 2019-01-01-00:08 | 6 | 0
10 | 2019-01-01-00:00 | 1 | 1
11 | 2019-01-01-00:01 | 1 | 1
12 | 2019-01-01-00:02 | 4 | 1
13 | 2019-01-01-00:03 | 1 | 1
14 | 2019-01-01-00:04 | 1 | 1
15 | 2019-01-01-00:05 | 1 | 1
16 | 2019-01-01-00:06 | 1 | 1
17 | 2019-01-01-00:07 | 1 | 1
18 | 2019-01-01-00:08 | 6 | 1

It would return rows 1, 2, 6, 7, 8.
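To experiment with the example, the sample rows can be loaded with something like the following. This is only a sketch assuming PostgreSQL: the column types follow the list in the question, the value column uses the schema name cpu_usage (the example header calls it cpu_values), and at is double-quoted as a precaution because AT is also an SQL keyword.

```sql
-- Sketch: schema as listed in the question (no constraints were given there).
CREATE TABLE cpu_usages (
    id        serial,
    "at"      timestamp,
    cpu_usage float,
    cpu_core  int
);

-- Rebuild the 18 example rows: 9 one-minute samples for each of cores 0 and 1.
-- ids are written explicitly so they match the example (1-9 for core 0, 10-18 for core 1);
-- this bypasses the serial sequence, which is fine for a throwaway test table.
INSERT INTO cpu_usages (id, "at", cpu_usage, cpu_core)
SELECT core * 9 + m + 1,
       timestamp '2019-01-01 00:00' + m * interval '1 minute',
       CASE m WHEN 2 THEN 4 WHEN 8 THEN 6 ELSE 1 END,  -- spikes at 00:02 and 00:08
       core
FROM generate_series(0, 1) AS core
CROSS JOIN generate_series(0, 8) AS m;
```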

I don't know how to do this, because it sounds like it needs some kind of nested join.

Can anyone help me with this?

【Question comments】:

What is "it"? The first timestamp or the later one with the larger value?
The first timestamp

Tags: sql postgresql time-series

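【Solution 1】:

This answers the original version of the question.

Just use window functions. Assuming you want the row with the larger value, you want to look back rather than forward: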

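select t.*
from (select t.*,
             -- highest value in the 3 minutes before this row (the row itself is excluded)
             max(cpu_value) over (order by timestamp
                                  range between interval '3 minute' preceding and interval '1 second' preceding
                                 ) as previous_max
      from t
     ) t
-- keep rows that are more than 3% above that earlier maximum
where previous_max * 1.03 < cpu_value;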

EDIT:

To look forward instead, this would be:

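select t.*
from (select t.*,
             -- highest value in the 3 minutes after this row (the row itself is excluded)
             max(cpu_value) over (order by timestamp
                                  range between interval '1 second' following and interval '3 minute' following
                                 ) as next_max
      from t
     ) t
-- keep starting rows that are followed by a value at least 3% higher
where next_max >= cpu_value * 1.03;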

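【Comments】:

Thanks. I do want to look forward, because I am trying to take the starting data and see whether I can predict CPU usage. So I need the start time before the spike happens, not the maximum. Can window functions do that?
@jas . . . It is the same idea. I added that version to the answer as well.
I realized I left out a piece of information. I forgot the cpu_core column, so I am trying to get this data for each CPU core. I don't know how to modify the query to make it work with cpu_core. Is that possible?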
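
The last comment is not answered in the thread. One possible approach, sketched here rather than taken from the answer, is to add PARTITION BY cpu_core to the window so that each core is only compared with its own later readings. The sketch assumes the question's table cpu_usages with columns at, cpu_usage and cpu_core, and PostgreSQL 11 or later (needed for RANGE frames with an interval offset). It uses max over the following 3 minutes, which is one reading of "any later value at least 3% higher".

```sql
-- Sketch, not the answer author's query: forward-looking spike check per CPU core.
-- Assumes the question's schema; "at" is quoted because AT is also an SQL keyword.
SELECT s.id, s."at", s.cpu_usage, s.cpu_core
FROM (
    SELECT u.*,
           -- highest reading on the same core in the next 3 minutes,
           -- excluding the current row; NULL when no later rows exist
           max(u.cpu_usage) OVER (
               PARTITION BY u.cpu_core
               ORDER BY u."at"
               RANGE BETWEEN interval '1 second' FOLLOWING
                         AND interval '3 minutes' FOLLOWING
           ) AS next_max
    FROM cpu_usages AS u
) AS s
-- keep starting rows followed by a value at least 3% higher
WHERE s.next_max >= s.cpu_usage * 1.03
ORDER BY s.id;
```

On the sample rows from the question this should select ids 1, 2, 6, 7, 8 for core 0 and 10, 11, 15, 16, 17 for core 1. Rows whose window is empty (the last reading of each core) get a NULL next_max and are filtered out by the WHERE clause.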