【Question title】: Optimizing postgres query
【Posted】: 2012-04-16 10:32:17
【Question description】:
                                 QUERY PLAN                                   
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=32164.87..32164.89 rows=1 width=44) (actual time=221552.831..221552.831 rows=0 loops=1)
   ->  Sort  (cost=32164.87..32164.87 rows=1 width=44) (actual time=221552.827..221552.827 rows=0 loops=1)
         Sort Key: t.date_effective, t.acct_account_transaction_id, p.method, t.amount, c.business_name, t.amount
         ->  Nested Loop  (cost=22871.67..32164.86 rows=1 width=44) (actual time=221552.808..221552.808 rows=0 loops=1)
               ->  Nested Loop  (cost=22871.67..32160.37 rows=1 width=52) (actual time=221431.071..221546.619 rows=670 loops=1)
                     ->  Nested Loop  (cost=22871.67..32157.33 rows=1 width=43) (actual time=221421.218..221525.056 rows=2571 loops=1)
                           ->  Hash Join  (cost=22871.67..32152.80 rows=1 width=16) (actual time=221307.382..221491.019 rows=2593 loops=1)
                                 Hash Cond: ("outer".acct_account_id = "inner".acct_account_fk)
                                 ->  Seq Scan on acct_account a  (cost=0.00..7456.08 rows=365008 width=8) (actual time=0.032..118.369 rows=61295 loops=1)
                                 ->  Hash  (cost=22871.67..22871.67 rows=1 width=16) (actual time=221286.733..221286.733 rows=2593 loops=1)
                                       ->  Nested Loop Left Join  (cost=0.00..22871.67 rows=1 width=16) (actual time=1025.396..221266.357 rows=2593 loops=1)
                                             Join Filter: ("inner".orig_acct_payment_fk = "outer".acct_account_transaction_id)
                                             Filter: ("inner".link_type IS NULL)
                                             ->  Seq Scan on acct_account_transaction t  (cost=0.00..18222.98 rows=1 width=16) (actual time=949.081..976.432 rows=2596 loops=1)
                                                   Filter: ((("type")::text = 'debit'::text) AND ((transaction_status)::text = 'active'::text) AND (date_effective >= '2012-03-01'::date) AND (date_effective < '2012-04-01 00:00:00'::timestamp without time zone))
                                             ->  Seq Scan on acct_payment_link l  (cost=0.00..4648.68 rows=1 width=15) (actual time=1.073..84.610 rows=169 loops=2596)
                                                   Filter: ((link_type)::text ~~ 'return_%'::text)
                           ->  Index Scan using contact_pk on contact c  (cost=0.00..4.52 rows=1 width=27) (actual time=0.007..0.008 rows=1 loops=2593)
                                 Index Cond: (c.contact_id = "outer".contact_fk)
                     ->  Index Scan using acct_payment_transaction_fk on acct_payment p  (cost=0.00..3.02 rows=1 width=13) (actual time=0.005..0.005 rows=0 loops=2571)
                           Index Cond: (p.acct_account_transaction_fk = "outer".acct_account_transaction_id)
                           Filter: ((method)::text <> 'trade'::text)
               ->  Index Scan using contact_role_pk on contact_role  (cost=0.00..4.48 rows=1 width=4) (actual time=0.007..0.007 rows=0 loops=670)
                     Index Cond: ("outer".contact_id = contact_role.contact_fk)
                     Filter: (exchange_fk = 74)
Total runtime: 221553.019 ms

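【Comments】:

  • Here is a more readable version of the plan: explain.depesz.com/s/12r
  • You should rewrite your SQL query to use explicit JOIN syntax. You are mixing implicit and explicit joins, which is a bad idea.
  • FROM ... ,acct_account a, acct_payment p, ... I don't see any join fields for these tables. This could produce a Cartesian product.
  • It is not a Cartesian join, unless the tables are empty (actual rows=0).

Tags: sql postgresql sql-execution-plan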


【Answer 1】:

Your problem is here:

->  Nested Loop Left Join  (cost=0.00..22871.67 rows=1 width=16) (actual time=1025.396..221266.357 rows=2593 loops=1)
    Join Filter: ("inner".orig_acct_payment_fk = "outer".acct_account_transaction_id)
    Filter: ("inner".link_type IS NULL)
        ->  Seq Scan on acct_account_transaction t  (cost=0.00..18222.98 rows=1 width=16) (actual time=949.081..976.432 rows=2596 loops=1)
                Filter: ((("type")::text = 'debit'::text) AND ((transaction_status)::text = 'active'::text) AND (date_effective >= '2012-03-01'::date) AND (date_effective < '2012-04-01 00:00:00'::timestamp without time zone))
        ->  Seq Scan on acct_payment_link l  (cost=0.00..4648.68 rows=1 width=15) (actual time=1.073..84.610 rows=169 loops=2596)
                Filter: ((link_type)::text ~~ 'return_%'::text)

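The planner expects to find 1 row in acct_account_transaction but actually finds 2596, and the same is true for the other table (estimated 1 row, actual 169 rows per loop).

You didn't mention your postgres version (could you?), but this should do the trick: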

SELECT DISTINCT
    t.date_effective,
    t.acct_account_transaction_id,
    p.method,
    t.amount,
    c.business_name,
    t.amount
FROM
    contact c inner join contact_role on (c.contact_id=contact_role.contact_fk and contact_role.exchange_fk=74),
    acct_account a, acct_payment p,
    acct_account_transaction t
WHERE
    p.acct_account_transaction_fk=t.acct_account_transaction_id
    and t.type = 'debit'
    and transaction_status = 'active'
    and p.method != 'trade'
    and t.date_effective >= '2012-03-01'
    and t.date_effective < (date '2012-03-01' + interval '1 month')
    and c.contact_id=a.contact_fk and a.acct_account_id = t.acct_account_fk
    and not exists(
         select * from acct_payment_link l 
           where l.orig_acct_payment_fk = t.acct_account_transaction_id 
           and l.link_type like 'return_%'
    )
ORDER BY
    t.date_effective DESC

Also, try setting appropriate statistics targets for the relevant columns. Link to the fine manual: http://www.postgresql.org/docs/current/static/sql-altertable.html
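
As a rough sketch of what raising the statistics targets could look like: the columns below are the ones that appear in the misestimated filters of the plan above, and the target value 1000 is only an illustrative assumption, not a value taken from the question or the manual.

```sql
-- Raise the per-column statistics target (1000 is an assumed, illustrative value).
ALTER TABLE acct_account_transaction ALTER COLUMN "type" SET STATISTICS 1000;
ALTER TABLE acct_account_transaction ALTER COLUMN transaction_status SET STATISTICS 1000;
ALTER TABLE acct_account_transaction ALTER COLUMN date_effective SET STATISTICS 1000;
ALTER TABLE acct_payment_link ALTER COLUMN link_type SET STATISTICS 1000;

-- The new targets only take effect once statistics are gathered again.
ANALYZE acct_account_transaction;
ANALYZE acct_payment_link;
```

Re-running EXPLAIN ANALYZE afterwards shows whether the estimated row counts (rows=1) have moved closer to the actual ones.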

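【Comments】:

    【Answer 2】:

    What are your indexes, and have you analyzed recently? It is doing a table scan on acct_account_transaction even though there are several conditions on that table:

    • type
    • date_effective

    If there are no indexes on these columns, a composite index on (type, date_effective) might help (assuming that many rows do not match the conditions on these columns).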
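
    A minimal sketch of such a composite index, assuming none exists yet. The index name is made up for illustration, and "type" is quoted only because that is how the plan prints it:

    ```sql
    -- Hypothetical index name; the column order follows the suggestion above.
    CREATE INDEX acct_account_transaction_type_date_idx
        ON acct_account_transaction ("type", date_effective);

    -- Refresh the table statistics afterwards, since the answer also asks
    -- whether the table has been analyzed recently.
    ANALYZE acct_account_transaction;
    ```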

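    【Comments】:

      【Answer 3】:

      I deleted my first suggestion because it changed the nature of the query.

      I see that too much time is being spent on the LEFT JOIN.

      1. First, try to scan the acct_payment_link table only once. Could you try rewriting your query as: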

        ... LEFT JOIN (SELECT * FROM acct_payment_link
                       WHERE link_type LIKE 'return_%') AS l ...
        
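      2. You should check your statistics, because the planned row counts differ from the rows actually returned.

      3. You haven't included the table and index definitions; it would be good to see them.

      4. You might also want to build an index on acct_payment_link.link_type using the contrib/pg_trgm extension, but I would keep that as the last option to try.

      By the way, which PostgreSQL version are you using?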
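
      To make points 1 and 4 more concrete, here is a minimal sketch. The reduced query only illustrates the shape of the derived-table join, not the full original statement; the join columns are taken from the Join Filter in the plan, and the index name is made up. CREATE EXTENSION and trigram index support for LIKE assume PostgreSQL 9.1 or later, which may not match the asker's version.

      ```sql
      -- Point 1: filter acct_payment_link inside a derived table, then keep
      -- only the transactions that have no matching 'return_%' link.
      SELECT t.acct_account_transaction_id
      FROM   acct_account_transaction t
      LEFT   JOIN (SELECT orig_acct_payment_fk, link_type
                   FROM   acct_payment_link
                   WHERE  link_type LIKE 'return_%') AS l
             ON l.orig_acct_payment_fk = t.acct_account_transaction_id
      WHERE  l.link_type IS NULL;

      -- Point 4: trigram index on link_type (hypothetical index name).
      CREATE EXTENSION IF NOT EXISTS pg_trgm;
      CREATE INDEX acct_payment_link_link_type_trgm_idx
          ON acct_payment_link USING gin (link_type gin_trgm_ops);
      ```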

      【Comments】:

      • But this would effectively turn the left join into an inner join
      【Answer 4】:

      Your statement, rewritten and formatted:

      SELECT DISTINCT
             t.date_effective,
             t.acct_account_transaction_id,
             p.method,
             t.amount,
             c.business_name,
             t.amount
      FROM   contact                  c
      JOIN   contact_role            cr ON cr.contact_fk = c.contact_id
      JOIN   acct_account             a ON a.contact_fk = c.contact_id 
      JOIN   acct_account_transaction t ON t.acct_account_fk = a.acct_account_id 
      JOIN   acct_payment             p ON p.acct_account_transaction_fk
                                         = t.acct_account_transaction_id
      LEFT   JOIN acct_payment_link   l ON orig_acct_payment_fk
                                         = acct_account_transaction_id
                                              -- missing table-qualification!
                                       AND link_type like 'return_%'
                                              -- missing table-qualification!
      WHERE  transaction_status = 'active'    -- missing table-qualification!
      AND    cr.exchange_fk = 74
      AND    t.type = 'debit'
      AND    t.date_effective >= '2012-03-01'
      AND    t.date_effective <  (date '2012-03-01' + interval '1 month')
      AND    p.method != 'trade'
      AND    l.link_type IS NULL
      ORDER  BY t.date_effective DESC;
      
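      • It is better to use explicit JOIN syntax. I reordered your tables according to your JOIN logic.

      • Why (date '2012-03-01' + interval '1 month') instead of date '2012-04-01'?

      • Some table qualifications are missing. In a complex statement like this, that is bad style, and it might hide a bug.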
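
      On the second bullet, a quick way to see what the expression evaluates to: adding an interval to a date yields a timestamp, which matches the '2012-04-01 00:00:00'::timestamp without time zone shown in the plan, whereas a plain date literal stays a date. The column aliases are only illustrative.

      ```sql
      SELECT date '2012-03-01' + interval '1 month' AS computed_upper_bound,  -- timestamp 2012-04-01 00:00:00
             date '2012-04-01'                      AS literal_upper_bound;   -- date 2012-04-01
      ```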

      The keys to performance are proper indexes, a proper PostgreSQL configuration, and accurate statistics.

      General advice on performance tuning in the PostgreSQL wiki.

      【Comments】:
