【Posted】: 2021-09-28 17:47:03
【Question】:
In a PostgreSQL database, I have a measurements table that looks like this:
| sensor_group_id | ts | value_1 | value_2 | etc... |
|-----------------|---------------------------|---------|---------|--------|
| 1 | 2021-07-21T00:20:00+00:00 | 15 | NULL | |
| 1 | 2021-07-15T00:20:00+00:00 | NULL | 23 | |
| 2 | 2021-07-17T00:20:00+00:00 | NULL | 11 | |
| 1 | 2021-07-13T00:20:00+00:00 | 9 | 4 | |
| 2 | 2021-07-10T00:20:00+00:00 | 99 | 36 | |
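This table has many columns holding different types of measurements. Each sensor group produces several types of measurements at the same time, but not always all of them, so we end up with partially populated rows.
What I want to do:
- for each distinct sensor_group_id
- for each distinct column (measurement type)
- get the latest timestamp at which that column is not NULL, together with the value of that measurement at that timestamp
My current solution, which seems cumbersome: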
WITH
latest_value_1 AS (
    SELECT DISTINCT ON (sensor_group_id) sensor_group_id, ts, value_1
    FROM measurements
    WHERE value_1 IS NOT NULL
    ORDER BY sensor_group_id, ts DESC
),
latest_value_2 AS (
    SELECT DISTINCT ON (sensor_group_id) sensor_group_id, ts, value_2
    FROM measurements
    WHERE value_2 IS NOT NULL
    ORDER BY sensor_group_id, ts DESC
),
latest_value_3 AS (
    SELECT DISTINCT ON (sensor_group_id) sensor_group_id, ts, value_3
    FROM measurements
    WHERE value_3 IS NOT NULL
    ORDER BY sensor_group_id, ts DESC
),
-- etc...
SELECT latest_value_1.sensor_group_id,
       latest_value_1.ts AS latest_value_1_ts,
       value_1,
       latest_value_2.ts AS latest_value_2_ts,
       value_2,
       latest_value_3.ts AS latest_value_3_ts,
       value_3,
       -- etc...
FROM latest_value_1
JOIN latest_value_2
  ON latest_value_1.sensor_group_id = latest_value_2.sensor_group_id
JOIN latest_value_3
  ON latest_value_1.sensor_group_id = latest_value_3.sensor_group_id
-- etc...
This produces the following result:
| sensor_group_id | latest_value_1_ts | value_1 | latest_value_2_ts | value_2 | etc... |
|---|---|---|---|---|---|
| 1 | 2021-07-21T00:20:00+00:00 | 15 | 2021-07-15T00:20:00+00:00 | 23 | |
| 2 | 2021-07-10T00:20:00+00:00 | 99 | 2021-07-17T00:20:00+00:00 | 11 | |
This seems unreasonably complicated, but I'm not sure whether there is a better way. Any help would be much appreciated!
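For comparison, here is a minimal sketch of one possible single-pass alternative. It assumes the table really is named measurements and has the columns shown above; it is an illustration of the idea, not a tested or tuned solution. Instead of one CTE per column, it groups once by sensor_group_id and uses the aggregate FILTER clause so that each column only considers its own non-NULL rows.

```sql
-- Sketch only: assumes a table "measurements" with the columns from the
-- example above. Add one pair of expressions per measurement column.
SELECT sensor_group_id,
       -- latest timestamp at which value_1 is present
       max(ts) FILTER (WHERE value_1 IS NOT NULL) AS latest_value_1_ts,
       -- non-NULL values sorted newest first; [1] picks the newest one
       (array_agg(value_1 ORDER BY ts DESC)
            FILTER (WHERE value_1 IS NOT NULL))[1] AS value_1,
       max(ts) FILTER (WHERE value_2 IS NOT NULL) AS latest_value_2_ts,
       (array_agg(value_2 ORDER BY ts DESC)
            FILTER (WHERE value_2 IS NOT NULL))[1] AS value_2
FROM measurements
GROUP BY sensor_group_id
ORDER BY sensor_group_id;
```

Two differences from the CTE version are worth keeping in mind. A sensor group that has never reported one of the measurements still appears here, with NULL in that column pair, whereas the inner joins above would drop it. Also, array_agg builds an array of every non-NULL value per group, so on a large table this may well be slower or more memory-hungry than DISTINCT ON backed by a suitable index.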
【Comments】:
Tags: sql postgresql greatest-n-per-group