How can I improve the select speed of my postgres query?

【Question title】: How can I improve the select speed of my postgres query?
【Posted】: 2021-04-14 10:35:54
【Question description】:

In my project I have a very simple table that looks like this:

create table entity
(
    id integer default 1,
    session_id varchar not null,
    type integer not null,
    category integer not null,
    created timestamp default now() not null
)
with (autovacuum_enabled=false);


create index created_index
    on entity (created);

I also have a view that selects grouped results over the entries from the last 30 seconds, like this:

create view list(type, category, counter) as
    SELECT 
        type,
        category, 
        count(entity.id) AS counter
    FROM entity
    WHERE entity.created >= (now() - '00:00:30'::interval)
    GROUP BY entity.type, entity.category;

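Since no updates or deletes ever happen on the table, I have set it to unlogged and disabled autovacuum.

The table now holds about 20 million entries, and the average time for SELECT type, category, counter FROM list is about 2 seconds.

Is there anything I can optimize to speed up the select? Or is the current speed already the best one can get from a table of this size?

Edit:

Here is the EXPLAIN output: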

Subquery Scan on list  (cost=9.37..9.73 rows=18 width=16) (actual time=425.268..425.278 rows=24 loops=1)
  Output: list.type, list.category, list.counter
  Buffers: shared hit=169485
  ->  HashAggregate  (cost=9.37..9.55 rows=18 width=16) (actual time=425.267..425.272 rows=24 loops=1)
        Output: entity.type, entity.category, count(entity.id)
        Group Key: entity.type, entity.category
        Buffers: shared hit=169485
        ->  Index Scan using created_index on entity  (cost=0.57..9.13 rows=32 width=12) (actual time=0.050..228.416 rows=165470 loops=1)
              Output: entity.id, entity.session_id, entity.type, entity.category, entity.created
              Index Cond: (entity.created >= (now() - '00:00:30'::interval))
              Buffers: shared hit=169485
Planning Time: 0.204 ms
Execution Time: 425.327 ms

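The execution time looks fine, but this was run while the system was idle. Normally there are about 1000 inserts per second into the table.

As for autovacuum, disabling it was a desperate attempt to see whether it would improve anything. Should I re-enable it?

【Question comments】:

Could you show us the result of EXPLAIN(ANALYZE, VERBOSE, BUFFERS) for the query? Do you have indexes on entity.type and entity.category?
with (autovacuum_enabled=false) - why, oh why? That is a really bad idea.
I added the information above to the post
@FrankHeikens I only have the index on created
create index on entity (created, type, category); might help as well

Tags: postgresql indexing query-optimization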

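【Solution 1】:

This is a job for a covering index. If you create a compound index that can satisfy the whole query, you give the planner a chance to carry out the expensive HashAggregate in a different way.

A covering index is usually tailored to a limited set of queries. Here is one for yours.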

CREATE INDEX entity_cr_ty_ca_id ON entity(created, type, category) INCLUDE (id);

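This works well because the query can ....

Random-access the index to the first eligible created value.
Scan the index sequentially. It is a B-TREE index, so the type and category values come out in a useful order.
Pull the id value out of the index to check that it is not null before counting it, as COUNT(entity.id) requires.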
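
One way to check whether the planner really takes this path is to look at the plan again. The sketch below assumes the entity_cr_ty_ca_id index above has been created. Re-enabling autovacuum is an assumption added here, not something this answer states: PostgreSQL can skip the heap only for pages marked all-visible in the visibility map, and that map is maintained by VACUUM, so on a table that is never vacuumed an Index Only Scan still has to visit the heap.

```sql
-- Assumption: turn autovacuum back on so the visibility map stays current.
ALTER TABLE entity SET (autovacuum_enabled = true);

-- One manual pass to set visibility-map bits and refresh planner statistics
-- (the plan in the question estimated 32 rows where 165470 were read).
VACUUM (ANALYZE) entity;

-- Look for "Index Only Scan using entity_cr_ty_ca_id" and the
-- "Heap Fetches" figure reported underneath it.
EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT type, category, counter FROM list;
```

Because the view reads only rows inserted in the last 30 seconds, many of them may sit on pages that have not been vacuumed yet, so some heap fetches are to be expected even with autovacuum running.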

If you know the id value is never null, you can simplify this. Use COUNT(*) instead of COUNT(entity.id), leave id out of the index, and create the index like this instead.

CREATE INDEX entity_cr_ty_ca ON entity(created, type, category);
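
With COUNT(*), the view from the question could be redefined roughly as follows. This is a sketch that reuses the view and column names from the question. The null check in front is an optional extra, not part of this answer, and it reads the whole table.

```sql
-- Optional sanity check (scans all rows): should return 0
-- before COUNT(entity.id) is swapped for COUNT(*).
SELECT count(*) FROM entity WHERE id IS NULL;

-- Same view as in the question, counting rows instead of non-null ids.
-- The output column names and types are unchanged, so OR REPLACE is allowed.
CREATE OR REPLACE VIEW list(type, category, counter) AS
    SELECT
        type,
        category,
        count(*) AS counter
    FROM entity
    WHERE entity.created >= (now() - '00:00:30'::interval)
    GROUP BY entity.type, entity.category;
```

Both counts give the same result only while id is never null; the table definition in the question does not declare id as NOT NULL, so that remains an assumption about the data.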

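And it must be said: even if you get your dbms to produce a large result set quickly, that result set still has to be transferred to the program that requested it and interpreted there. No amount of index magic will make that part faster.

【Comments】: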