【Question title】: Counting items that match within a SELECT statement
【Posted】: 2013-09-03 08:50:32
【Question description】:

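We have a stored procedure that runs against some fairly large tables, and while joining to an even larger table it also records how many records match the corresponding Batch_ID. What I am trying to figure out is whether I can improve this with a counting function or some other approach; I would like to get rid of the nested SELECT COUNT(*) statement. The CCTransactions table has 1.4 million rows and BatchItems has 6.6 million rows.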

SELECT  a.ItemAuthID, a.FeeAuthID, a.Batch_ID, a.ItemAuthCode, 
        a.FeeAuthCode, b.Amount, b.Fee, 
  (SELECT COUNT(*) FROM BatchItems WHERE Batch_ID = a.Batch_ID) AS BatchCount,
  ItemBillDate, FeeBillDate, b.AccountNumber, 
  b.Itemcode, ItemAuthToken, FeeAuthToken,
  cc.ItemMerchant, cc.FeeMerchant
  FROM CCTransactions a WITH(NOLOCK)
        INNER JOIN BatchItems b WITH(NOLOCK)
              ON a.Batch_ID = b.Batch_ID
        INNER JOIN CCConfig cc WITH(NOLOCK)
              ON a.ClientCode = cc.ClientCode
  WHERE ((ItemAuthCode > '' AND ItemBillDate IS NULL)
              OR (FeeAuthCode > '' AND FeeBillDate IS NULL))
              AND TransactionDate BETWEEN DATEADD(d,-7,GETDATE()) 
              AND convert(char(20),getdate(),101)  + ' ' +   @Cutoff
  ORDER BY TransactionDate

【Question discussion】:

    Tags: sql select count large-data


    【Solution 1】:

    If your DBMS supports windowed aggregate functions, you can rewrite the scalar subquery as

    COUNT(*) OVER (PARTITION BY Batch_ID)
    

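    Of course, this only counts the rows per Batch_ID that the SELECT actually returns. If the inner joins (or the WHERE clause) reduce the number of rows, it will not match the count from the original subquery, which counts every BatchItems row for the batch.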
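
    The difference can be seen in a small sketch. The table name Items, its columns, and the sample values are hypothetical, and a WHERE filter stands in for any join that removes rows. The VALUES-based CTE is written with SQL Server syntax in mind; other DBMSs may need minor changes.

    ```sql
    -- Hypothetical sample data: batch 1 has three items, batch 2 has one.
    WITH Items AS (
        SELECT *
        FROM (VALUES (1, 10), (1, 20), (1, 30), (2, 40)) AS v(Batch_ID, Amount)
    )
    SELECT  i.Batch_ID, i.Amount,
            -- Window aggregate: evaluated after WHERE, so it counts
            -- only the rows of the batch that survive the filter.
            COUNT(*) OVER (PARTITION BY i.Batch_ID) AS WindowCount,
            -- Scalar subquery: counts every row of the batch,
            -- regardless of the outer filter.
            (SELECT COUNT(*) FROM Items x WHERE x.Batch_ID = i.Batch_ID) AS FullCount
    FROM Items i
    WHERE i.Amount >= 20;   -- removes one row of batch 1
    ```

    For batch 1 the two columns disagree (WindowCount is 2, FullCount is 3), while for batch 2 both are 1. The windowed form is therefore a drop-in replacement only when the query returns every BatchItems row of each batch exactly once.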

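    If the count must cover every row in BatchItems, rewriting the scalar subquery as a join against a pre-aggregated derived table may be more efficient (depending on your DBMS):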

    SELECT  a.ItemAuthID, a.FeeAuthID, a.Batch_ID, a.ItemAuthCode, 
            a.FeeAuthCode, b.Amount, b.Fee, 
      dt.BatchCount,
      ItemBillDate, FeeBillDate, b.AccountNumber, 
      b.Itemcode, ItemAuthToken, FeeAuthToken,
      cc.ItemMerchant, cc.FeeMerchant
      FROM CCTransactions a WITH(NOLOCK)
            INNER JOIN BatchItems b WITH(NOLOCK)
                  ON a.Batch_ID = b.Batch_ID
            INNER JOIN CCConfig cc WITH(NOLOCK)
                  ON a.ClientCode = cc.ClientCode
            INNER JOIN
              ( 
                SELECT Batch_ID, COUNT(*) AS BatchCount
                FROM BatchItems 
                GROUP BY Batch_ID
              ) AS dt ON a.Batch_ID = dt.Batch_ID
      WHERE ((ItemAuthCode > '' AND ItemBillDate IS NULL)
                  OR (FeeAuthCode > '' AND FeeBillDate IS NULL))
                  AND TransactionDate BETWEEN DATEADD(d,-7,GETDATE()) 
                  AND convert(CHAR(20),getdate(),101)  + ' ' +   @Cutoff
      ORDER BY TransactionDate
    

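    【Discussion】:

    • I did test COUNT(*) OVER; although it reduced the scan count reported by SET STATISTICS IO ON, it increased the plan's subtree cost. So now let me test the INNER JOIN to the subquery and see how it performs.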