【Posted】: 2019-05-16 16:15:43
【Question】:
I have a large table of timestamped positions in Postgres 9.4.5:
CREATE TABLE vessel_position (
   posid       serial NOT NULL,
   mmsi        integer NOT NULL,
   "timestamp" timestamp with time zone,
   the_geom    geometry(PointZ,4326),
   CONSTRAINT "PK_posid_mmsi" PRIMARY KEY (posid, mmsi)
);
Additional index:
CREATE INDEX vessel_position_timestamp_idx ON vessel_position ("timestamp");
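I want to extract every row whose timestamp is at least x minutes after the previous row. I tried several different SELECT statements using LAG(); they all more or less worked, but none gave me exactly the result I need. The function below gives me what I need, but I feel it could be faster: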
CREATE OR REPLACE FUNCTION _getVesslTrackWithInterval(mmsi integer, startTime character varying (25), endTime character varying (25), interval_min integer)
  RETURNS SETOF vessel_position AS
$func$
DECLARE
   count  integer DEFAULT 0;   -- seconds accumulated since the last kept row
   posids varchar DEFAULT '';  -- comma-separated list of the posids to return
   tbl CURSOR FOR
      SELECT
         posID
        ,EXTRACT(EPOCH FROM (timestamp - lag(timestamp) OVER (ORDER BY posid asc)))::int as diff
      FROM vessel_position vp WHERE vp.mmsi = $1 AND vp.timestamp BETWEEN $2::timestamp AND $3::timestamp;
BEGIN
   FOR row IN tbl
   LOOP
      -- diff is NULL for the first row, so it counts as 0 and the row is kept
      count := coalesce(row.diff,0) + count;
      IF count >= $4*60 OR count = 0 THEN
         posids:= posids || row.posid || ',';
         count:= 0;
      END IF;
   END LOOP;
   -- second pass over the table: fetch the full rows for the collected posids
   RETURN QUERY EXECUTE 'SELECT * from vessel_position where posid in (' || TRIM(TRAILING ',' FROM posids) || ')';
END
$func$ LANGUAGE plpgsql;
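For contrast, the kind of single-statement LAG() query mentioned earlier can be sketched as follows. This is a hypothetical reconstruction, not one of the statements actually tried; the mmsi value, the time range and the 10-minute threshold are placeholders. It compares each row only with its immediate predecessor, so on a dense track where every consecutive gap is shorter than the threshold it returns only the first row, whereas the function above accumulates the gaps since the last kept row.

```sql
-- Hypothetical LAG()-only attempt (placeholder literals throughout).
SELECT *
FROM  (
   SELECT vp.*,
          -- gap to the immediately preceding row; NULL for the first row
          vp."timestamp" - lag(vp."timestamp") OVER (ORDER BY vp."timestamp") AS gap
   FROM   vessel_position vp
   WHERE  vp.mmsi = 123456789
   AND    vp."timestamp" BETWEEN '2019-05-01 00:00+00' AND '2019-05-02 00:00+00'
   ) sub
WHERE  gap IS NULL                     -- always keep the first row
   OR  gap >= interval '1 min' * 10;   -- keep rows at least 10 minutes after their predecessor
```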
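I can't help thinking that collecting all the posids into one string and then selecting them again at the end slows things down.
Inside the IF statement I already have access to every row I want to keep, so I could store them in a temporary table and return that temporary table once the loop has finished.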
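A minimal, untested sketch of that idea, using RETURN NEXT instead of a temporary table: in PL/pgSQL, RETURN NEXT appends a row to the function's result set, so the kept rows can be emitted from inside the loop without a second SELECT. The function name, the parameter names and the timestamptz parameter types are assumptions made for this sketch. It also orders by "timestamp" rather than posid and compares against the timestamp of the last kept row instead of summing integer seconds, so a row with exactly the same timestamp as the last kept row is dropped here, while the original function keeps it.

```sql
-- Sketch only: names and parameter types are hypothetical, not the original signature.
CREATE OR REPLACE FUNCTION get_vessel_track_with_interval_sketch(
      _mmsi integer, _start timestamptz, _end timestamptz, _interval_min integer)
  RETURNS SETOF vessel_position AS
$func$
DECLARE
   r       vessel_position;   -- current row of the loop
   last_ts timestamptz;       -- timestamp of the last row returned; NULL before the first
BEGIN
   FOR r IN
      SELECT *
      FROM   vessel_position vp
      WHERE  vp.mmsi = _mmsi
      AND    vp."timestamp" BETWEEN _start AND _end
      ORDER  BY vp."timestamp"
   LOOP
      -- keep the first row, then every row at least _interval_min minutes after the last kept one
      IF last_ts IS NULL
         OR r."timestamp" >= last_ts + interval '1 min' * _interval_min THEN
         RETURN NEXT r;
         last_ts := r."timestamp";
      END IF;
   END LOOP;
END
$func$ LANGUAGE plpgsql STABLE;
```

A call would look like SELECT * FROM get_vessel_track_with_interval_sketch(123456789, '2019-05-01 00:00+00', '2019-05-02 00:00+00', 10); where all argument values are placeholders.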
Can this function be optimized, in particular to improve performance?
【Comments】:
- About the outdated Postgres version 9.4.5: We always recommend that all users run the latest available minor release for whatever major version is in use.
Tags: sql postgresql plpgsql window-functions gaps-and-islands