PostgreSQL: Select only the first record per id based on sort order

【Question title】: PostgreSQL: Select only the first record per id based on sort order
【Posted】: 2022-01-08 08:59:38
【Question description】:

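For the following query, I need to select only the first record for each geo_id, namely the one with the lowest shape_type value (values range from 1 to 10). If you know an easy way to do this in PostgreSQL, please help. Thanks for your time.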

select g.geo_id, gs.shape_type
from schema.geo g   
join schema.geo_shape gs on (g.geo_id=gs.geo_id)  
order by gs.shape_type asc;

【Question comments】:

Tags: sql postgresql distinct greatest-n-per-group distinct-on

【Solution 1】:

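PostgreSQL has a very nice syntax for this kind of query - distinct on:

"SELECT DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. The DISTINCT ON expressions are interpreted using the same rules as for ORDER BY (see above). Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first."

So your query becomes: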

select distinct on(g.geo_id)
    g.geo_id, gs.shape_type
from schema.geo g   
    join schema.geo_shape gs on (g.geo_id=gs.geo_id)  
order by g.geo_id, gs.shape_type asc;
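
To make the behaviour of the query above easier to check, here is a minimal self-contained sketch. The table and column names follow the question, but the temp tables and the rows in them are made up for illustration (and the schema qualifier is dropped so it can run in a scratch session).

```sql
-- Hypothetical sample data: names follow the question, rows are invented.
create temp table geo (geo_id int primary key);
create temp table geo_shape (geo_id int references geo, shape_type int);

insert into geo values (1), (2), (3);
insert into geo_shape values
    (1, 5), (1, 2), (1, 9),
    (2, 7), (2, 3),
    (3, 10);

-- One row per geo_id; ORDER BY decides which row counts as "first".
select distinct on (g.geo_id)
    g.geo_id, gs.shape_type
from geo g
    join geo_shape gs on (g.geo_id = gs.geo_id)
order by g.geo_id, gs.shape_type asc;
-- geo_id | shape_type
--      1 |          2
--      2 |          3
--      3 |         10
```

Note that the ORDER BY has to start with the DISTINCT ON expression (g.geo_id here); the columns after it only decide which row wins inside each group.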

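The general ANSI-SQL syntax (it works in any RDBMS that has window functions and common table expressions, and the CTE can be swapped for a subquery) would be: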

with cte as (
    select
        row_number() over(partition by g.geo_id order by gs.shape_type) as rn,
        g.geo_id, gs.shape_type
    from schema.geo g   
        join schema.geo_shape gs on (g.geo_id=gs.geo_id)  
)
select
    geo_id, shape_type
from cte
where rn = 1;
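
For engines without CTE support, the same row_number() idea can be written with a derived table. This is only a sketch: the inline VALUES list is invented data standing in for the geo / geo_shape join, and the alias "ranked" is an arbitrary name.

```sql
-- Subquery (derived table) form of the row_number() approach.
select geo_id, shape_type
from (
    select
        row_number() over (partition by gs.geo_id order by gs.shape_type) as rn,
        gs.geo_id, gs.shape_type
    from (values (1, 5), (1, 2), (2, 7), (2, 3), (3, 10))
        as gs (geo_id, shape_type)  -- made-up rows standing in for the join
) ranked
where rn = 1
order by geo_id;
-- geo_id | shape_type
--      1 |          2
--      2 |          3
--      3 |         10
```

If two rows of the same geo_id tie on the lowest shape_type, row_number() keeps just one of them and which one is not defined; adding a tie-breaking column to the window's ORDER BY makes the choice deterministic.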

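【Comments】:

This is great. I couldn't figure out how to do it by two fields, so I simply concatenated the two fields and used the result as the key in the distinct on part of the query.
@theStud54 you can distinct on a record type: distinct on ((g.geo_id, g.geo_id2)) - see a simple db fiddle example - dbfiddle.uk/…
Ah, thanks for the comment, @roman_pekar. This helped me fix another project besides this one! It cut 40% off my query time.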
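
On the two-field case raised in the comments: besides the record form, DISTINCT ON also accepts a plain comma-separated list of expressions, which avoids concatenating the fields. A minimal sketch follows; geo_id2 is the column name used in the comment, while the rows and the alias "t" are invented for illustration.

```sql
-- One row per (geo_id, geo_id2) pair, keeping the lowest shape_type.
select distinct on (geo_id, geo_id2)
    geo_id, geo_id2, shape_type
from (values
        (1, 'a', 5), (1, 'a', 2),
        (1, 'b', 4),
        (2, 'a', 7), (2, 'a', 3))
    as t (geo_id, geo_id2, shape_type)  -- hypothetical rows
order by geo_id, geo_id2, shape_type;
-- geo_id | geo_id2 | shape_type
--      1 | a       |          2
--      1 | b       |          4
--      2 | a       |          3
```

Compared with concatenation, this keeps each column's own type and cannot confuse pairs such as ('ab', 'c') and ('a', 'bc').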