Most efficient way to get a subset of values from a table? (Answers)

[Question title]: Most efficient way to get a subset of values from a table?
[Posted]: 2021-11-04 03:05:43
[Question description]:

I have the following table:

CREATE TABLE table1 
(
    accountNumber INT,
    event VARCHAR(50), -- can be anything
    day_id INT         -- date stored as YYYYMMDD
);

INSERT INTO table1 
VALUES (123, 'start', 20211010),
       (123, 'finish', 20211010),
       (123, 'finish', 20211010),
       (123, 'jump', 20211010),
       (124, 'run', 20211011),
       (155, 'skip', 20211010);

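Any combination of duplicates can occur.

I want to get a subset from the table: the list of all accountNumbers that DID start and finish, but did not jump. For that, I have the following: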

with cte as (select accountNumber from table1 where event = 'start'), --cte with all accountNumbers that had 'start' at one point
cte2 as (select accountNumber from table1 where event = 'finish'), --cte with all accountNumbers that had 'finish' at one point
cte3 as (select accountNumber from table1 where event = 'jump') --cte with all accountNumbers that jumped

select distinct accountNumber from table1
where accountNumber in (select * from cte)
and accountNumber in (select * from cte2)
and accountNumber not in (select * from cte3);

This is slow, especially if I scale it up to include more conditions. Is there a more efficient way to do this?

[Question discussion]:

Tags: sql performance dbeaver

[Solution 1]:

You want all accountNumbers that have start and finish events but no jump event. So just try this:

select Distinct accountNumber from table1 t1
where (event = 'start' or event = 'finish') 
And not Exists (select accountNumber from table1 t2 where event = 'jump' and t1.accountNumber = t2.accountNumber)

On the sample data, this returns the same result as your query.
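
One caveat: the condition event = 'start' or event = 'finish' matches accounts that have either event, so an account with only a start (and no jump) would also be returned, whereas the question asks for accounts that have both. A possible alternative, offered as a sketch rather than a definitive answer, is conditional aggregation with GROUP BY and HAVING. It uses only the table and column names from the question, reads the table once, and takes one extra HAVING line per additional condition. Whether it is actually faster depends on the database and its indexes, so it is worth checking the execution plan.

```sql
-- Sketch: one group per account, one HAVING line per condition.
-- Assumes table1(accountNumber, event, day_id) as defined in the question.
SELECT accountNumber
FROM table1
WHERE event IN ('start', 'finish', 'jump')  -- skip rows the conditions never look at
GROUP BY accountNumber
HAVING SUM(CASE WHEN event = 'start'  THEN 1 ELSE 0 END) > 0   -- at least one start
   AND SUM(CASE WHEN event = 'finish' THEN 1 ELSE 0 END) > 0   -- at least one finish
   AND SUM(CASE WHEN event = 'jump'   THEN 1 ELSE 0 END) = 0;  -- no jump at all
```

On the sample rows from the question this returns no rows, because account 123 (the only one with both start and finish) also has a jump event. GROUP BY already yields one row per account, so DISTINCT is not needed.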

[Discussion]: