Update query optimization in PostgreSQL: answers

【Question title】: update query optimization postgresql
【Posted】: 2021-07-10 06:55:02
【Question】:

I am migrating a database from Oracle to PostgreSQL, and I have run into a performance problem with an update query.

explain  update airepp.EQP_CALC_STAT_EVENEMENT e  set EVT_COM_POPULATION=(
        SELECT    f.COM_RECENSEMENT_DER_POPULATION
        FROM      airepp.EQP_FOURNISSEUR f where e.EVT_LIEU_LOC_CODE = f.COM_CODE)
     ;

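In Oracle this query takes about 5 minutes; in PostgreSQL it takes 55 minutes. Both databases have the same indexes and exactly the same columns. Here are the PostgreSQL and Oracle plans for this query.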

Oracle plan:

I also tried this, but it was worse: 66 minutes.

explain update airepp.EQP_CALC_STAT_EVENEMENT e
    set EVT_COM_POPULATION = f.COM_RECENSEMENT_DER_POPULATION    
    FROM airepp.EQP_FOURNISSEUR f 
    where e.EVT_LIEU_LOC_CODE = f.COM_CODE;

Is there another, more efficient way to write this query so that it performs the same as in Oracle, or close to it?

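【Question comments】:

What is the cardinality of the reference table? 37 million index scans are not great in any database.
How can I check that?
Are you sure both databases hold the same data? (You don't have one with 40M rows and one with 10M?) And that you have the same hardware and configuration? (Not spinning disks on one and SSDs on the other, or different partitioning, or log files on different drives, or differences in CPU or memory, and so on.) You may need to ask your DBA for a full resource description of each server.
They are exactly the same database, with the same resources.
@astentx Actually, the PostgreSQL table contains 36M rows while the Oracle one has 10M. But can that explain an execution time of 55 minutes in PostgreSQL versus 5 minutes in Oracle?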

Tags: sql postgresql join sql-update

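【Solution 1】:

You are probably hitting this problem because of the large number of lookups. When the reference table has a much lower cardinality than the main table, you can try update ... from ..., which can take advantage of a hash join.

An example: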

-- Small reference table: 20000 distinct codes
create table t
as
select v::text as some_code, random() as some_value
from generate_series(1, 20000) as v;

create unique index t_idx on t(some_code);

-- Large table: 100000 rows, roughly 10 rows per code
create table t_big
as
select
  v as id,
  trunc(v / 10)::text as some_code,
  null as some_value
from generate_series(1, 100000) v;

explain analyze
update t_big
set some_value = t.some_value
from t
where t_big.some_code = t.some_code;

QUERY PLAN
Update on t_big  (cost=559.00..1885.53 rows=45135 width=80) (actual time=3908.306..3908.309 rows=0 loops=1)
  ->  Hash Join  (cost=559.00..1885.53 rows=45135 width=80) (actual time=7.715..1709.967 rows=99991 loops=1)
        Hash Cond: (t_big.some_code = t.some_code)
        ->  Seq Scan on t_big  (cost=0.00..982.35 rows=45135 width=42) (actual time=0.048..390.277 rows=100000 loops=1)
        ->  Hash  (cost=309.00..309.00 rows=20000 width=46) (actual time=7.564..7.565 rows=20000 loops=1)
              Buckets: 32768  Batches: 1  Memory Usage: 1261kB
              ->  Seq Scan on t  (cost=0.00..309.00 rows=20000 width=46) (actual time=0.017..3.439 rows=20000 loops=1)
Planning Time: 0.678 ms
Execution Time: 3908.437 ms

explain analyze
update t_big
set some_value = (
  select some_value from t where t.some_code = t_big.some_code
);

QUERY PLAN
Update on t_big  (cost=0.00..1054578.00 rows=126600 width=46) (actual time=7759.679..7759.680 rows=0 loops=1)
  ->  Seq Scan on t_big  (cost=0.00..1054578.00 rows=126600 width=46) (actual time=0.086..5146.080 rows=100000 loops=1)
        SubPlan 1
          ->  Index Scan using t_idx on t  (cost=0.29..8.30 rows=1 width=8) (actual time=0.027..0.028 rows=1 loops=100000)
                Index Cond: (some_code = t_big.some_code)
Planning Time: 0.217 ms
Execution Time: 7759.737 ms
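
One detail in the two plans is worth a note: the hash join feeds 99991 rows into the update, while the subquery form touches all 100000. The two statements are therefore not strictly equivalent. update ... from only changes rows that have a match in t, whereas the correlated subquery also writes NULL into rows without a match. A minimal check, assuming t and t_big exist exactly as created in this example:

```sql
-- Count rows of t_big whose code has no match in t.
-- UPDATE ... FROM leaves these rows untouched;
-- the correlated-subquery form sets their some_value to NULL.
select count(*) as unmatched_rows
from t_big b
where not exists (
  select 1 from t where t.some_code = b.some_code
);
```

With the sample data this should account for the gap between the two row counts. On real tables, whether unmatched rows need to be reset to NULL decides which form is the correct one to use, independent of speed.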

db<>fiddle here
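
Because explain analyze actually executes the statement, running it against a real table modifies the data. A common pattern, sketched here with the table and column names from the question, is to wrap the measurement in a transaction and roll it back. The buffers option is a standard EXPLAIN option that adds block read and write counts; whether it points at the bottleneck in this particular case is an assumption.

```sql
begin;

-- Executes the update and reports real timings plus buffer usage
explain (analyze, buffers)
update airepp.EQP_CALC_STAT_EVENEMENT e
set EVT_COM_POPULATION = f.COM_RECENSEMENT_DER_POPULATION
from airepp.EQP_FOURNISSEUR f
where e.EVT_LIEU_LOC_CODE = f.COM_CODE;

-- Discard the changes made by the measured update
rollback;
```

Note that a rolled-back update still leaves dead row versions behind, so repeated test runs on a large table can bloat it until it is vacuumed.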

【Comments】:

Thanks for your reply, but I have already tried the query with a join, and in my case it takes longer to execute.