[Question title]: PARTITION BY with and without KEEP in Oracle
[Posted]: 2013-11-22 13:20:47
[Question]:

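I came across two queries that seem to produce the same result: both apply aggregate functions over a partition.

Is there any difference between these two queries?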

SELECT empno,
   deptno,
   sal,
   MIN(sal) OVER (PARTITION BY deptno) "Lowest",
   MAX(sal) OVER (PARTITION BY deptno) "Highest"
FROM empl

SELECT empno,
   deptno,
   sal,
   MIN(sal) KEEP (DENSE_RANK FIRST ORDER BY sal) OVER (PARTITION BY deptno) "Lowest",
   MAX(sal) KEEP (DENSE_RANK LAST ORDER BY sal) OVER (PARTITION BY deptno) "Highest"
FROM empl

The first version is more logical, but the second may be some kind of special case, perhaps a performance optimization.

[Comments]:

    Tags: sql oracle


    [Answer 1]:
    MIN(sal) KEEP (DENSE_RANK FIRST ORDER BY sal) OVER (PARTITION BY deptno)
    

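    The statement can be read (roughly) from right to left:

    • OVER (PARTITION BY deptno) partitions the rows into groups by deptno; then
    • ORDER BY sal sorts the rows within each partition by sal (implicitly in ASCending order); then
    • KEEP (DENSE_RANK FIRST assigns a (consecutive) rank to the ordered rows of each partition (rows with the same value for the ordering column get the same rank) and discards all rows that are not ranked first; finally
    • MIN(sal) returns the lowest salary among the remaining rows of each partition.

    In this case, MIN and DENSE_RANK FIRST both operate on the sal column, so they do the same thing and KEEP (DENSE_RANK FIRST ORDER BY sal) is redundant.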
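
As a rough illustration of the steps above, the KEEP expression can be approximated with an explicit DENSE_RANK() in an inline view. This is only a sketch, assuming the question's empl table (empno, deptno, sal); the rnk alias is introduced here for illustration and is not part of the original queries.

```sql
-- Sketch: spell out what KEEP (DENSE_RANK FIRST ORDER BY sal) does.
SELECT empno,
       deptno,
       sal,
       -- Rows that are not ranked first yield NULL, which MIN ignores,
       -- so only the first-ranked rows of each partition are aggregated.
       MIN(CASE WHEN rnk = 1 THEN sal END)
         OVER (PARTITION BY deptno) AS "Lowest"
FROM (
  SELECT e.empno,
         e.deptno,
         e.sal,
         -- Rank the rows of each department by salary; ties share a rank.
         DENSE_RANK() OVER (PARTITION BY e.deptno ORDER BY e.sal) AS rnk
  FROM empl e
);
```

Replacing sal inside the CASE expression with another column is what makes the ranking step matter, since the aggregate then sees only the rows tied for the lowest salary.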

    However, if you take the minimum of a different column, you can see the effect:

    SQL Fiddle

    Oracle 11g R2 Schema Setup:

    CREATE TABLE test (name, sal, deptno) AS
    SELECT 'a', 1, 1 FROM DUAL
    UNION ALL SELECT 'b', 1, 1 FROM DUAL
    UNION ALL SELECT 'c', 1, 1 FROM DUAL
    UNION ALL SELECT 'd', 2, 1 FROM DUAL
    UNION ALL SELECT 'e', 3, 1 FROM DUAL
    UNION ALL SELECT 'f', 3, 1 FROM DUAL
    UNION ALL SELECT 'g', 4, 2 FROM DUAL
    UNION ALL SELECT 'h', 4, 2 FROM DUAL
    UNION ALL SELECT 'i', 5, 2 FROM DUAL
    UNION ALL SELECT 'j', 5, 2 FROM DUAL;
    

    Query 1:

    SELECT DISTINCT
      MIN(sal) KEEP (DENSE_RANK FIRST ORDER BY sal) OVER (PARTITION BY deptno) AS min_sal_first_sal,
      MAX(sal) KEEP (DENSE_RANK FIRST ORDER BY sal) OVER (PARTITION BY deptno) AS max_sal_first_sal,
      MIN(name) KEEP (DENSE_RANK FIRST ORDER BY sal) OVER (PARTITION BY deptno) AS min_name_first_sal,
      MAX(name) KEEP (DENSE_RANK FIRST ORDER BY sal) OVER (PARTITION BY deptno) AS max_name_first_sal,
      MIN(name) KEEP (DENSE_RANK LAST ORDER BY sal) OVER (PARTITION BY deptno) AS min_name_last_sal,
      MAX(name) KEEP (DENSE_RANK LAST ORDER BY sal) OVER (PARTITION BY deptno) AS max_name_last_sal,
      deptno
    FROM test
    

    Results

    | MIN_SAL_FIRST_SAL | MAX_SAL_FIRST_SAL | MIN_NAME_FIRST_SAL | MAX_NAME_FIRST_SAL | MIN_NAME_LAST_SAL | MAX_NAME_LAST_SAL | DEPTNO |
    |-------------------|-------------------|--------------------|--------------------|-------------------|-------------------|--------|
    |                 1 |                 1 |                  a |                  c |                 e |                 f |      1 |
    |                 4 |                 4 |                  g |                  h |                 i |                 j |      2 |
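
Because Query 1 uses DISTINCT only to collapse the identical rows of each department, the same KEEP expressions can also be written as plain aggregates with GROUP BY. A sketch against the same test table, showing two of the columns:

```sql
-- Aggregate form of KEEP: one row per department, no DISTINCT needed.
SELECT deptno,
       -- Smallest name among the rows tied for the lowest salary.
       MIN(name) KEEP (DENSE_RANK FIRST ORDER BY sal) AS min_name_first_sal,
       -- Largest name among the rows tied for the highest salary.
       MAX(name) KEEP (DENSE_RANK LAST ORDER BY sal) AS max_name_last_sal
FROM test
GROUP BY deptno
ORDER BY deptno;
```

With the data above this should return a and f for department 1, and g and j for department 2, matching the corresponding columns in the results table.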
    

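    [Comments]:

      [Answer 2]:

      In your example there is no difference, because you aggregate on the same column you order by. The real point and power of KEEP shows when you aggregate and order by different columns. For example (borrowing the test table from the other answer):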

      SELECT deptno,
             MIN(name) KEEP (DENSE_RANK FIRST ORDER BY sal DESC, name),
             MAX(sal)
      FROM test
      GROUP BY deptno;

      This query returns the name of the highest-paid person in each department. Consider the alternative without the KEEP clause:

      SELECT deptno, name, sal
      FROM test t
      WHERE NOT EXISTS (SELECT 'person with higher salary in same department'
                        FROM test t2
                        WHERE t2.deptno = t.deptno
                        AND (t2.sal > t.sal
                             OR (t2.sal = t.sal AND t2.name < t.name)))
      

      The KEEP clause is simpler and more efficient (in this simple example, only 3 consistent gets versus 34 for the alternative).
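
Another common way to express the same "top row per group" requirement is ROW_NUMBER() in an inline view. The following is a sketch for comparison only, assuming the same test table; the rn alias is illustrative, and no performance measurements were made for it here.

```sql
-- Pick one row per department: highest salary, ties broken by name.
SELECT deptno, name, sal
FROM (
  SELECT t.deptno,
         t.name,
         t.sal,
         -- Number the rows of each department from highest salary down;
         -- name is the tie-breaker, so exactly one row gets rn = 1.
         ROW_NUMBER() OVER (PARTITION BY t.deptno
                            ORDER BY t.sal DESC, t.name) AS rn
  FROM test t
)
WHERE rn = 1;
```

Unlike the KEEP version, this form returns the whole winning row, which can be convenient when several columns of that row are needed.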

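      [Comments]:

      • Is KEEP documented anywhere? @matthew-mcpeak
      [Answer 3]:

      To elaborate on one difference mentioned in @MT0's answer: in your first query, the aggregate functions MIN and MAX do the work, whereas in the second, the actual rows are selected by FIRST, LAST and KEEP.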

      You can even replace MAX with MIN in the second example and it will still give the correct answer (the highest salary).
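
A small sketch of that point, assuming the question's empl table; the column aliases are made up for illustration. Because DENSE_RANK LAST ORDER BY sal keeps only the rows that share the highest salary, MIN and MAX over sal see identical values:

```sql
-- Both columns should be equal in every row: after KEEP, the only
-- salaries left in each partition are copies of the highest one.
SELECT DISTINCT
       deptno,
       MIN(sal) KEEP (DENSE_RANK LAST ORDER BY sal)
         OVER (PARTITION BY deptno) AS highest_via_min,
       MAX(sal) KEEP (DENSE_RANK LAST ORDER BY sal)
         OVER (PARTITION BY deptno) AS highest_via_max
FROM empl;
```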

      For more information, see the following article

      [Comments]:

        [Answer 4]:

        This is also helpful if you order by two columns and fetch one or both of them.

        CREATE TABLE test (name, sal, deptno) AS
        SELECT 'adam', 100, 1 FROM DUAL
        UNION ALL SELECT 'bravo', 500, 1 FROM DUAL
        UNION ALL SELECT 'coy', 456, 1 FROM DUAL
        UNION ALL SELECT 'david', 50, 1 FROM DUAL
        UNION ALL SELECT 'ethan', 50, 1 FROM DUAL
        UNION ALL SELECT 'feral', 300, 1 FROM DUAL;
        

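        Now suppose you want to select the employee with the lowest salary, along with that salary. The condition is that if two employees share the lowest salary, you take the one whose name comes first alphabetically.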

        SELECT o.deptno,
               MIN(o.sal) KEEP (DENSE_RANK FIRST ORDER BY o.sal, o.name) AS least_salary,
               MIN(o.name) KEEP (DENSE_RANK FIRST ORDER BY o.sal, o.name) AS least_salary_person
        FROM test o
        GROUP BY o.deptno;
        

        Output:

        DEPTNO  LEAST_SALARY    LEAST_SALARY_PERSON
        1        50             david
        

        [Comments]:

          [Answer 5]:

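          This query gets the name of the highest-paid person in each department.

          SELECT MIN(ename), sal, deptno
          FROM emp
          WHERE (deptno, sal) IN
             (
              -- pair each maximum with its department, so a salary only matches within its own department
              SELECT deptno, MAX(sal) FROM emp GROUP BY deptno
             )
          GROUP BY sal, deptno;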
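
For comparison with the rest of the thread, the same result can be sketched with KEEP in a single pass over the table. This assumes an emp table with ename, sal and deptno columns, as the query above implies:

```sql
-- One row per department: the highest salary and the alphabetically
-- first name among the employees who earn it.
SELECT deptno,
       MAX(sal) AS sal,
       MIN(ename) KEEP (DENSE_RANK LAST ORDER BY sal) AS ename
FROM emp
GROUP BY deptno;
```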
          

          [Comments]:
