【Question title】: Select N rows in aggregate functions SQL Server
【Posted】: 2018-05-12 00:54:42
【Question】:

I have a table that looks like this:

+--------+----------+--------+------------+-------+
|   ID   | CHANNEL  | VENDOR | num_PERIOD | SALES |
+--------+----------+--------+------------+-------+
| 000001 | Business | Shop   |          1 | 40    |
| 000001 | Business | Shop   |          2 | 60    |
| 000001 | Business | Shop   |          3 | NULL  |
+--------+----------+--------+------------+-------+

Over time (num_PERIOD), there are many sales records for each combination of ID, CHANNEL, and VENDOR.

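The idea is to get a new column that returns the number of NULLs in the SALES column, but only within the first 111 records of each group, ordered by the num_PERIOD column.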

I have been trying something like this:

SELECT ID,
       CHANNEL,
       VENDOR,
       sum(CASE
               WHEN SALES IS NULL THEN 1
               ELSE 0
           END) OVER (PARTITION BY ID,
                                   CHANNEL,
                                   VENDOR
                      ORDER BY num_PERIOD ROWS BETWEEN UNBOUNDED PRECEDING AND 111 FOLLOWING) AS NULL_SALES_SET
FROM TABLE
GROUP BY ID,
         CHANNEL,
         VENDOR

But I am not getting what I am looking for.

The goal is a table like this:

+--------+--------------+--------+----------------+
|   ID   |   CHANNEL    | VENDOR | NULL_SALES_SET |
+--------+--------------+--------+----------------+
| 000001 | Business     | Shop   |              1 |
| 000002 | Business     | Market |              0 |
| 000002 | Non Business | Shop   |              3 |
+--------+--------------+--------+----------------+

I am having trouble selecting the first 111 rows, ordered by num_PERIOD, for each ID, CHANNEL, VENDOR combination.

【Comments】:

    Tags: sql sql-server window-functions partition


    【Solution 1】:

    Use a CTE (common table expression) with the ROW_NUMBER window function and you should be set:

    ;WITH MyCTE AS
    (
        SELECT
            id,
            channel,
            vendor,
            sales,
            ROW_NUMBER() OVER (PARTITION BY id, channel, vendor ORDER BY num_period) AS row_num
        FROM
            MyTable
    )
    SELECT
        id,
        channel,
        vendor,
        SUM(CASE WHEN sales IS NULL THEN 1 ELSE 0 END) AS null_sales_set
    FROM
        MyCTE
    WHERE
        row_num <= 111
    GROUP BY
        id, channel, vendor
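
    To try the query above without touching a real table, the following sketch loads a few rows into a table variable and runs the same CTE against it. The table variable @MyTable, its column types, and the extra row for ID 000002 are assumptions made for illustration; they do not come from the question.

    ```sql
    -- Hypothetical sample data; column types are assumed, not taken from the question.
    DECLARE @MyTable TABLE
    (
        id         CHAR(6),
        channel    VARCHAR(20),
        vendor     VARCHAR(20),
        num_period INT,
        sales      INT NULL
    );

    INSERT INTO @MyTable (id, channel, vendor, num_period, sales)
    VALUES ('000001', 'Business', 'Shop',   1, 40),
           ('000001', 'Business', 'Shop',   2, 60),
           ('000001', 'Business', 'Shop',   3, NULL),
           ('000002', 'Business', 'Market', 1, 10);

    -- Number the rows per group, keep the first 111, then count the NULL sales.
    WITH MyCTE AS
    (
        SELECT id, channel, vendor, sales,
               ROW_NUMBER() OVER (PARTITION BY id, channel, vendor
                                  ORDER BY num_period) AS row_num
        FROM @MyTable
    )
    SELECT id, channel, vendor,
           SUM(CASE WHEN sales IS NULL THEN 1 ELSE 0 END) AS null_sales_set
    FROM MyCTE
    WHERE row_num <= 111
    GROUP BY id, channel, vendor;
    ```

    With this sample data, the group (000001, Business, Shop) should report 1 and (000002, Business, Market) should report 0. Lowering 111 to 2 is an easy way to confirm that rows beyond the cutoff are ignored.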
    

    【Discussion】:

    • I like the use of the CTE; it reduces the table reference to a single instance.
    【Solution 2】:

    Do you have to use a window function?

    SELECT ID
         , CHANNEL
         , VENDOR
         , NULL_SALES_SET = SUM(CASE WHEN SALES IS NULL THEN 1 ELSE 0 END)
      FROM Table
     WHERE num_PERIOD <= 111
     GROUP BY ID, CHANNEL, VENDOR
    

    Or are you looking for the first 111 num_PERIOD values per group, which allows for gaps in the num_PERIOD column?

    SELECT t.ID
         , t.CHANNEL
         , t.VENDOR
         , NULL_SALES_SET = SUM(CASE WHEN t.SALES IS NULL THEN 1 ELSE 0 END)
      FROM Table t
            INNER JOIN ( SELECT i.ID
                              , i.CHANNEL
                              , i.VENDOR
                              , i.num_PERIOD
                              , rowNum = ROW_NUMBER() OVER (PARTITION BY i.ID, i.CHANNEL, i.VENDOR ORDER BY i.num_PERIOD)
                           FROM Table i ) l
              ON t.ID = l.ID
             AND t.CHANNEL = l.CHANNEL
             AND t.VENDOR = l.VENDOR
             AND t.num_PERIOD = l.num_PERIOD
     WHERE l.rowNum <= 111
     GROUP BY t.ID, t.CHANNEL, t.VENDOR
    

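    Edit: Not sure how I overlooked it, but it is necessary to join on the num_PERIOD column as well.

    Edit: Adding a count of the distinct num_PERIOD values for each ID, CHANNEL, VENDOR without affecting NULL_SALES_SET: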

    SELECT t.ID
         , t.CHANNEL
         , t.VENDOR
           -- Counts the NULL Sales when the num_PERIOD is in the 
           -- first 111 num_PERIODs
         , NULL_SALES_SET = SUM(CASE WHEN l.rowNum IS NOT NULL AND t.SALES IS NULL 
                                       THEN 1 
                                     ELSE 0 END)
           -- Counts the distinct num_PERIOD values
         , PERIOD_COUNT = COUNT(DISTINCT t.num_PERIOD)
      FROM Table t
            LEFT OUTER JOIN ( SELECT i.ID
                                   , i.CHANNEL
                                   , i.VENDOR
                                   , i.num_PERIOD
                                   , rowNum = ROW_NUMBER() OVER (PARTITION BY i.ID,
                                                                              i.CHANNEL,
                                                                              i.VENDOR
                                                                 ORDER BY i.num_PERIOD)
                              FROM Table i ) l
              ON t.ID = l.ID
             AND t.CHANNEL = l.CHANNEL
             AND t.VENDOR = l.VENDOR
             AND t.num_PERIOD = l.num_PERIOD
             AND l.rowNum <= 111
     GROUP BY t.ID, t.CHANNEL, t.VENDOR
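
    A possible alternative for this last requirement, offered as a sketch rather than a tested replacement: number the rows once in a CTE and move the rowNum <= 111 condition out of the join or WHERE clause and into the CASE expression, so that the other aggregates still see every row of the group. MyTable is a placeholder table name (an assumption, as in Solution 1), and Numbered is a hypothetical CTE name.

    ```sql
    WITH Numbered AS
    (
        SELECT ID, CHANNEL, VENDOR, num_PERIOD, SALES,
               ROW_NUMBER() OVER (PARTITION BY ID, CHANNEL, VENDOR
                                  ORDER BY num_PERIOD) AS rowNum
        FROM MyTable  -- placeholder table name (assumption)
    )
    SELECT ID, CHANNEL, VENDOR,
           -- NULL sales are counted only within the first 111 rows of each group
           SUM(CASE WHEN rowNum <= 111 AND SALES IS NULL THEN 1 ELSE 0 END) AS NULL_SALES_SET,
           -- distinct periods are counted over all rows of the group
           COUNT(DISTINCT num_PERIOD) AS PERIOD_COUNT
    FROM Numbered
    GROUP BY ID, CHANNEL, VENDOR;
    ```

    Because nothing is filtered before grouping, further unrestricted aggregates can be added to the SELECT list alongside the restricted one, and the table is referenced only once. If the WITH is not the first statement in the batch, the preceding statement needs a terminating semicolon.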
    

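    【Discussion】:

    • This works very well for the main purpose of the question. However, I was wondering about the filter WHERE l.rowNum
    • @Also - I'm not following. Are you asking how to add a count of the distinct num_PERIOD values for a given ID, CHANNEL, VENDOR?
    • Sorry about the formatting of my comment. Yes, I would like to know how to carry out other calculations without applying the filter in the WHERE clause. One example would be counting the total distinct num_PERIOD values for a given ID, CHANNEL, VENDOR (not limited to the first 111 periods). Thanks!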