【Question title】: Oracle query optimization recommendation
【Posted】: 2020-11-21 12:22:36
【Question description】:

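The query below takes a very long time to run. The predicates shown here are only used to pick the unique (latest) record, so I would like to know whether the same query can be rewritten differently, without calling these predicates multiple times to get the unique ID.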

(select max(c.id) from plocation c where c.ids = y.ids and c.idc = y.idc)
(select max(cr.id) from plocation_log cr where cr.ids = yt.ids and cr.idc = yt.idc)
(select max(pr.id) from patent pr where pr.ids = p.ids and pr.idc = p.idc)

My full sample query:

SELECT to_char(p.pid) AS patentid,
       p.name,
       x.dept,
       y.location
  FROM patent p
  JOIN pdetails x ON p.pid = x.pid  AND x.isactive = 1
  JOIN plocation y
            ON y.idr = p.idr
           AND y.idc = p.idc
           AND y.id = (select max(c.id) from plocation c where c.ids = y.ids and c.idc = y.idc)
           AND y.idopstype in (36, 37)
   JOIN plocation_log yt
            ON yt.idr = y.idr
           AND yt.idc= y.idc
           AND yt.id = (select max(cr.id) from plocation_log cr where cr.ids = yt.ids and cr.idc = yt.idc)
           AND yt.idopstype in (36,37)
WHERE
      p.idp IN (10,20,30)
   AND p.id = (select max(pr.id) from patent pr where pr.ids = p.ids and pr.idc = p.idc)
   AND p.idopstype in (36,37)

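【Question discussion】:

  • Could you describe the table definitions in more detail?
  • The query will run faster if you rewrite it with window functions instead of correlated subqueries. Does your Oracle version support window functions (the OVER clause)?
  • Oracle has supported analytic (window) functions since 8i :) I don't think anyone is still using an older version.
  • Show us the indexes on the tables.
  • @TheImpaler Thanks. I like the suggestion. Do you happen to have an example that uses window functions instead of correlated subqueries, for my case, to get the unique ID?

Tags: sql oracle query-optimization greatest-n-per-group window-functions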


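【Solution 1】:

As The Impaler commented, one option is to use analytic functions instead of correlated subqueries. The idea is to rank the records with RANK() inside a subquery, ordering by id descending so that the highest id in each group gets rank 1, and then filter on that rank in the outer query (in a join condition or in the WHERE clause).

Consider: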

SELECT to_char(p.pid) AS patentid,
       p.name,
       x.dept,
       y.location
  FROM (SELECT p.*, RANK() OVER(PARTITION BY ids, idc ORDER BY id DESC) rn FROM patent p) p
  JOIN pdetails x ON p.pid = x.pid  AND x.isactive = 1
  JOIN (SELECT y.*, RANK() OVER(PARTITION BY ids, idc ORDER BY id DESC) rn FROM plocation y) y
            ON y.idr = p.idr
           AND y.idc = p.idc
           AND y.idopstype in (36, 37)
           AND y.rn = 1
   JOIN (SELECT yt.*, RANK() OVER(PARTITION BY ids, idc ORDER BY id DESC) rn FROM plocation_log yt) yt
            ON yt.idr = y.idr
           AND yt.idc= y.idc
           AND yt.idopstype in (36,37)
           AND yt.rn = 1
WHERE
   p.idp IN (10,20,30)
   AND p.idopstype in (36,37)
   AND p.rn = 1
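
As a minimal sketch of why the ranking has to run from the highest id downward, the following self-contained query can be tried on its own. The `sample_loc` rows and their `location` values are invented for illustration and are not taken from the question; they only stand in for `plocation`.

```sql
-- Hypothetical sample data standing in for plocation; all values are invented.
WITH sample_loc AS (
  SELECT 1 AS id, 10 AS ids, 100 AS idc, 'A' AS location FROM dual UNION ALL
  SELECT 2, 10, 100, 'B' FROM dual UNION ALL
  SELECT 3, 20, 100, 'C' FROM dual UNION ALL
  SELECT 4, 20, 100, 'D' FROM dual
),
ranked AS (
  -- ORDER BY id DESC gives rank 1 to the largest id in each (ids, idc) group
  SELECT s.*,
         RANK() OVER (PARTITION BY ids, idc ORDER BY id DESC) AS rn
    FROM sample_loc s
)
SELECT id, ids, idc, location
  FROM ranked
 WHERE rn = 1      -- keep only the latest row per group
 ORDER BY id;
```

On this sample data the query keeps ids 2 and 4, the same rows the correlated `MAX(id)` predicate would select. If `id` were not unique within a group, RANK() would return every tied row, whereas ROW_NUMBER() would return exactly one.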
   

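【Discussion】:

    【Solution 2】:

    Consider joining to aggregate CTEs so that the MAX for each group is computed once, rather than once for every row of the outer query. Also, prefer more informative table aliases over the a, b, c or x, y, z style.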

    WITH loc_max AS
      (select ids, idc, max(id) as max_id from plocation group by ids, idc)
     ,   log_max AS    
      (select ids, idc, max(id) as max_id from plocation_log group by ids, idc)
     ,   pat_max AS
      (select ids, idc, max(id) as max_id from patent pr group by ids, idc)
    
    SELECT to_char(pat.pid) AS patentid
           , pat.name
           , det.dept
           , loc.location
      FROM patent pat
      JOIN pdetails det
        ON pat.pid  = det.pid  
        AND det.isactive = 1
      JOIN plocation loc
        ON  loc.idr = pat.idr
        AND loc.idc = pat.idc
        AND loc.idopstype IN (36, 37)
      JOIN loc_max                              -- ADDED CTE JOIN
        ON  loc.id  = loc_max.max_id
        AND loc.ids = loc_max.ids 
        AND loc.idc = loc_max.idc
       
      JOIN plocation_log log
        ON  log.idr = loc.idr
        AND log.idc = loc.idc
        AND log.idopstype in (36,37)
      JOIN log_max                              -- ADDED CTE JOIN
        ON  log.id  = log_max.max_id
        AND log.ids = log_max.ids
        AND log.idc = log_max.idc
    
      JOIN pat_max                              -- ADDED CTE JOIN
        ON  pat.id  = pat_max.max_id
        AND pat.ids = pat_max.ids 
        AND pat.idc = pat_max.idc
    
    WHERE pat.idp IN (10, 20, 30)
      AND pat.idopstype IN (36, 37)
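
The aggregate-then-join pattern above can be checked in isolation with a small sketch like this one. The `sample_loc` rows are invented for illustration and only stand in for `plocation`; the CTE name `loc_max` mirrors the answer.

```sql
-- Hypothetical sample data standing in for plocation; all values are invented.
WITH sample_loc AS (
  SELECT 1 AS id, 10 AS ids, 100 AS idc, 'A' AS location FROM dual UNION ALL
  SELECT 2, 10, 100, 'B' FROM dual UNION ALL
  SELECT 3, 20, 100, 'C' FROM dual UNION ALL
  SELECT 4, 20, 100, 'D' FROM dual
),
loc_max AS (
  -- One aggregated row per (ids, idc) group, computed a single time
  SELECT ids, idc, MAX(id) AS max_id
    FROM sample_loc
   GROUP BY ids, idc
)
SELECT loc.id, loc.ids, loc.idc, loc.location
  FROM sample_loc loc
  JOIN loc_max
    ON  loc.id  = loc_max.max_id   -- keep only the row holding the group maximum
    AND loc.ids = loc_max.ids
    AND loc.idc = loc_max.idc
 ORDER BY loc.id;
```

On this sample data the join keeps ids 2 and 4. The extra join is the price of computing each maximum once, so on real tables it is worth timing this form against the correlated version rather than relying on plan cost alone.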
    

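    【Discussion】:

    • I like this approach. However, the explain plan shows a somewhat higher cost than my previous query. I'm not sure why, but your approach does make sense.
    • Consider indexing the JOIN columns. Setting aside the explain plan, which may show more steps because of the CTEs and the new JOINs, does the query actually run slower overall?
    • Yes, it is indeed slower. I ended up rewriting the query in a different way and it seems to work. Thanks for all the help.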
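
The indexing suggestion in the comments might look like the sketch below. The index names are hypothetical, and the column order (grouping columns first, then `id`) is an assumption about what would help the per-group MAX; the thread never shows the existing indexes, so some of these may already exist in another form. Whether the optimizer uses them depends on the data and statistics.

```sql
-- Hypothetical index names; column order is an assumption, not from the thread.
CREATE INDEX plocation_grp_ix     ON plocation     (ids, idc, id);
CREATE INDEX plocation_log_grp_ix ON plocation_log (ids, idc, id);
CREATE INDEX patent_grp_ix        ON patent        (ids, idc, id);

-- Inspect the plan for one of the aggregates to see whether the index is used
EXPLAIN PLAN FOR
SELECT ids, idc, MAX(id) AS max_id
  FROM plocation
 GROUP BY ids, idc;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Comparing the plan and the actual elapsed time before and after creating the indexes is more informative than the cost figure by itself.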