【问题标题】:PostgreSQL: No partition pruning when initializing query plan with LATERAL JOIN and aggregatesPostgreSQL:使用 LATERAL JOIN 和聚合初始化查询计划时没有分区修剪
【发布时间】:2020-05-07 08:39:59
【问题描述】:

我想知道为什么分区修剪不适用于 LATERAL JOIN 和聚合。 下面是我在 PostgreSQL 12.1 (Ubuntu 12.1-1.pgdg18.04+1) 上测试的示例:

CREATE TABLE demo(key BIGINT) PARTITION BY LIST (key);
CREATE TABLE demo_key_1 PARTITION OF demo FOR VALUES IN (1);
CREATE TABLE demo_key_2 PARTITION OF demo FOR VALUES IN (2);
INSERT INTO demo(key) VALUES (1), (2);
ANALYZE demo;



CREATE TABLE demo2(key BIGINT) PARTITION BY LIST (key);
CREATE TABLE demo2_key_1 PARTITION OF demo2 FOR VALUES IN (1);
CREATE TABLE demo2_key_2 PARTITION OF demo2 FOR VALUES IN (2);
INSERT INTO demo2(key) VALUES (1), (2);
ANALYZE demo2;

现在,如果 LATERAL JOIN 下的 SELECT 中没有聚合,则一切 按预期工作 - 每个表只扫描一个分区:

EXPLAIN ANALYZE
         SELECT * FROM demo
         JOIN LATERAL (
                  SELECT key AS key2
                  FROM demo2
                  WHERE demo2.key = demo.key
         ) d ON TRUE
         WHERE demo.key = 1;
                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..2.03 rows=1 width=16) (actual time=0.007..0.008
rows=1 loops=1)
   ->  Seq Scan on demo_key_1  (cost=0.00..1.01 rows=1 width=8) (actual
time=0.004..0.005 rows=1 loops=1)
         Filter: (key = 1)
   ->  Seq Scan on demo2_key_1  (cost=0.00..1.01 rows=1 width=8) (actual
time=0.001..0.001 rows=1 loops=1)
         Filter: (key = 1)
 Planning Time: 0.191 ms
 Execution Time: 0.025 ms
(7 rows)

但是,当我尝试包含聚合的非常相似的查询时 函数,demo2 的分区不会从查询计划中删除:

EXPLAIN ANALYZE
         SELECT * FROM demo
         JOIN LATERAL (
                  SELECT sum(demo2.key) AS sum2
                  FROM demo2
                  WHERE demo2.key = demo.key
         ) d ON TRUE
         WHERE demo.key = 1;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=2.03..3.07 rows=1 width=40) (actual time=0.018..0.018
rows=1 loops=1)
   ->  Seq Scan on demo_key_1  (cost=0.00..1.01 rows=1 width=8) (actual
time=0.005..0.005 rows=1 loops=1)
         Filter: (key = 1)
   ->  Aggregate  (cost=2.03..2.04 rows=1 width=32) (actual
time=0.011..0.011 rows=1 loops=1)
         ->  Append  (cost=0.00..2.03 rows=2 width=8) (actual
time=0.004..0.005 rows=1 loops=1)
               ->  Seq Scan on demo2_key_1  (cost=0.00..1.01 rows=1
width=8) (actual time=0.002..0.002 rows=1 loops=1)
                     Filter: (key = demo_key_1.key)
               ->  Seq Scan on demo2_key_2  (cost=0.00..1.01 rows=1
width=8) (never executed)
                     Filter: (key = demo_key_1.key)
 Planning Time: 0.174 ms
 Execution Time: 0.082 ms
(11 rows)

当然,demo2_key_2 上的 Seq Scan 从未执行过,但为什么不执行 从查询计划中删减?更复杂的查询,包含数百个 分区会受到严重影响。

解决方法是在子查询中添加一个冗余条件:

EXPLAIN ANALYZE
         SELECT * FROM demo
         JOIN LATERAL (
                  SELECT sum(demo2.key) AS sum2
                  FROM demo2
                  WHERE demo2.key = 1 AND demo2.key = demo.key
         ) d ON TRUE
         WHERE demo.key = 1;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=1.01..2.05 rows=1 width=40) (actual time=0.011..0.011
rows=1 loops=1)
   ->  Seq Scan on demo_key_1  (cost=0.00..1.01 rows=1 width=8) (actual
time=0.004..0.004 rows=1 loops=1)
         Filter: (key = 1)
   ->  Aggregate  (cost=1.01..1.02 rows=1 width=32) (actual
time=0.006..0.006 rows=1 loops=1)
         ->  Result  (cost=0.00..1.01 rows=1 width=8) (actual
time=0.002..0.002 rows=1 loops=1)
               One-Time Filter: (demo_key_1.key = 1)
               ->  Seq Scan on demo2_key_1  (cost=0.00..1.01 rows=1
width=8) (actual time=0.001..0.002 rows=1 loops=1)
                     Filter: (key = 1)
 Planning Time: 0.079 ms
 Execution Time: 0.031 ms
(10 rows)

【问题讨论】:

    标签: sql postgresql aggregate-functions partitioning lateral-join


    【解决方案1】:

    优化器不够聪明,无法推断它可以跳过分区。但由于 v12 中有运行时分区修剪,因此执行器会跳过扫描分区。

    这应该不会对查询性能产生明显的负面影响。

    优化器的智能是有限度的。如果要进行分区修剪,最好使用简单条件。

    【讨论】:

      猜你喜欢
      • 2015-04-17
      • 2020-01-28
      • 2016-03-24
      • 2013-07-22
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      相关资源
      最近更新 更多