【Question title】: SQL (NOT IN) query takes forever to execute
【Posted】: 2013-11-08 12:47:48
【Question】:

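I have a query that takes more than 2 hours to run against millions of records, and I'm not sure how to optimize it so that it runs faster. Here is a SQL Fiddle with the table and the query.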

WITH frm 
     AS (SELECT product_id                               AS PId, 
                Min(Cast(product_startdate AS DATETIME)) AS PStartDate 
         FROM   products 
         WHERE  product_status IN ( 'F', 'R', 'M' ) 
         GROUP  BY product_id), 
     firstcount 
     AS (SELECT pid, 
                pstartdate, 
                (SELECT Count(*) 
                 FROM   products 
                 WHERE  product_id IN ((SELECT product_id 
                                        FROM   products 
                                        WHERE  product_status IN ( 'OR', 'OP' ) 
                                               AND product_comments LIKE 
                                                   '%CANCELLED%' 
                                        EXCEPT 
                                        (SELECT product_id 
                                         FROM   products 
                                         WHERE  product_status = 'DE' 
                                         UNION 
                                         SELECT product_id 
                                         FROM   products 
                                         WHERE  product_status = 'OR' 
                                                AND product_comments NOT LIKE 
                                                    '%CANCELLED%')) 
                                       EXCEPT 
                                       (SELECT product_id 
                                        FROM   products 
                                        WHERE  product_status IN ( 
                                               'RE', 'C', 'S', 'D' ) 
                                       )) 
                        AND product_id = pid) AS v_count 
         FROM   frm), 
     secondcount 
     AS (SELECT pid,
                pstartdate,
                CASE
                  WHEN v_count = 0 THEN
                    (SELECT Count(*)
                     FROM   products
                     WHERE  product_id IN
                            (SELECT product_id
                             FROM   [dbo].products
                             WHERE  product_status IN ( 'F', 'R', 'M' )
                                    AND product_startdate != '.'
                             EXCEPT
                             (SELECT product_id
                              FROM   products
                              WHERE  product_status = 'DE'
                              UNION
                              SELECT product_id
                              FROM   products
                              WHERE  product_status = 'OR'
                                     AND product_comments NOT LIKE '%CANCELLED%')
                             EXCEPT
                             (SELECT product_id
                              FROM   products
                              WHERE  product_status IN ( 'OR', 'OP' )
                                     AND product_comments LIKE '%CANCELLED%'
                              EXCEPT
                              (SELECT product_id
                               FROM   products
                               WHERE  product_status = 'DE'
                               UNION
                               SELECT product_id
                               FROM   products
                               WHERE  product_status = 'OR'
                                      AND product_comments NOT LIKE '%CANCELLED%'))
                             EXCEPT
                             (SELECT product_id
                              FROM   products
                              WHERE  product_status IN ( 'RE', 'C', 'S', 'D' )))
                            AND product_id = pid)
                  ELSE v_count
                END AS v_count
         FROM   firstcount)
INSERT INTO products_del 
            (product_id, 
             product_startdate, 
             productdel_status) 
SELECT pid, 
       pstartdate, 
       CASE 
         WHEN v_count != 0 THEN 'UNKNOWN' 
         ELSE NULL 
       END 
FROM   secondcount 

SELECT * 
FROM   products_del 

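【Comments】:

  • Put some indexes on that table. Look at the execution plan: it is table scan after table scan.
  • It would probably help if you explained your table structure and what you are trying to achieve. It is up to you to make us understand your goal without our having to reverse-engineer your query.
  • Common table expressions are very handy, but they can slow down complex queries, especially large ones. Try selecting the intermediate results into temp tables (or table variables) instead of nesting common table expressions, and see whether that helps.
  • Even if the tables are messy or temporary, indexes should speed up the query. If it takes two hours, the time spent building them is probably a good trade-off against the long query time.
  • Consider adding join hints to your EXCEPTs to guide the query plan, as described in the first answer at this SO link: stackoverflow.com/questions/16084350/sql-except-performance/…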
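
The index and temp-table suggestions in the comments above could be sketched roughly as follows. This is only an illustration under assumptions: the index names and the #excluded_ids temp table are hypothetical, and it assumes product_id and product_status have data types that are allowed as index key columns. Whether it helps should be checked against the actual execution plan.

```sql
-- Hypothetical index name; assumes product_id and product_status are
-- short columns that can be used as index keys.
CREATE NONCLUSTERED INDEX ix_products_id_status
    ON products (product_id, product_status);

-- Materialize one of the exclusion sets once, instead of repeating the
-- same subquery in several places. #excluded_ids is a hypothetical name.
SELECT DISTINCT product_id
INTO   #excluded_ids
FROM   products
WHERE  product_status IN ('RE', 'C', 'S', 'D');

-- Index the temp table so lookups by product_id are seeks, not scans.
CREATE CLUSTERED INDEX ix_excluded_ids
    ON #excluded_ids (product_id);
```

The remaining exclusion sets in the query (the 'DE' rows and the 'OR' rows without CANCELLED in the comments) could be materialized the same way.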

Tags: sql sql-server query-optimization


【Answer 1】:

Please try the following. I simplified the inner queries by eliminating the unnecessary IN, UNION, and EXCEPT clauses.

WITH frm 
     AS (SELECT product_id                               AS PId, 
                Min(Cast(product_startdate AS DATETIME)) AS PStartDate 
         FROM   products 
         WHERE  product_status IN ( 'F', 'R', 'M' ) 
         GROUP  BY product_id), 
     firstcount 
     AS (SELECT pid, 
                pstartdate, 
                (SELECT Count(*) 
                 FROM   products 
                 WHERE  product_status IN ( 'OR', 'OP' ) 
                   AND product_comments LIKE '%CANCELLED%' 
                   AND product_id = pid) AS v_count 
         FROM   frm), 
     secondcount 
     AS (SELECT pid, 
                pstartdate, 
                CASE 
                  WHEN v_count = 0 THEN (SELECT Count(*) 
                                         FROM   products 
                                         WHERE  product_status IN ( 'F', 'R', 'M' ) 
                                                AND product_startdate != '.' 
                                                AND product_id = pid) 
                  ELSE v_count 
                END AS v_count 
         FROM   firstcount) 
INSERT INTO products_del 
            (product_id, 
             product_startdate, 
             productdel_status) 
SELECT pid, 
       pstartdate, 
       CASE 
         WHEN v_count != 0 THEN 'UNKNOWN' 
         ELSE NULL 
       END 
FROM   secondcount 

SELECT * 
FROM   products_del 

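【Comments】:

  • This seems to work if the table has only one row per product (the typical case for a products table). However, if the table allows multiple rows per product, don't you need to code for the exclusions somehow, such as WHERE product_status = 'DE' and WHERE product_status IN ('RE', 'C', 'S', 'D')?
  • ProductId is not unique here, so we need to check all the statuses, including 'DE' and ('RE', 'C', 'S', 'D'): sqlfiddle.com/#!6/b14eb/56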
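
Since product_id is not unique, one way to keep the exclusions without the nested EXCEPT/UNION chains is to test them per product with EXISTS and NOT EXISTS. The sketch below is not from the thread. It rests on one reading of the original query: a product ends up as 'UNKNOWN' when it has a cancelled 'OR'/'OP' row or an 'F'/'R'/'M' row with a start date other than '.', and it has no row in any excluded status and no 'OR' row without CANCELLED in the comments. Treat that equivalence as an assumption and compare the results against the original query on the fiddle data before relying on it.

```sql
WITH frm
     AS (SELECT product_id                               AS pid,
                Min(Cast(product_startdate AS DATETIME)) AS pstartdate
         FROM   products
         WHERE  product_status IN ( 'F', 'R', 'M' )
         GROUP  BY product_id)
INSERT INTO products_del
            (product_id,
             product_startdate,
             productdel_status)
SELECT f.pid,
       f.pstartdate,
       CASE
         -- The product has at least one qualifying row ...
         WHEN EXISTS (SELECT 1
                      FROM   products a
                      WHERE  a.product_id = f.pid
                             AND ( ( a.product_status IN ( 'OR', 'OP' )
                                     AND a.product_comments LIKE '%CANCELLED%' )
                                    OR ( a.product_status IN ( 'F', 'R', 'M' )
                                         AND a.product_startdate != '.' ) ))
              -- ... and no row in any of the excluded states.
              AND NOT EXISTS (SELECT 1
                              FROM   products x
                              WHERE  x.product_id = f.pid
                                     AND ( x.product_status IN ( 'DE', 'RE', 'C', 'S', 'D' )
                                            OR ( x.product_status = 'OR'
                                                 AND x.product_comments NOT LIKE '%CANCELLED%' ) ))
           THEN 'UNKNOWN'
         ELSE NULL
       END
FROM   frm f;
```

Each correlated subquery filters on product_id first, so an index that leads with product_id should let SQL Server answer both checks with seeks rather than repeated table scans.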