【Question title】: How to group by value to identify min and max date, but in specific order?
【Posted】: 2021-11-14 05:14:57
【Question description】:

I searched here but did not find anything that matches my use case. I hope you can help me. First, here is the table I have:

STATION_NUMBER PART_NO BOOK_DATE
11111 A 2021-08-01 6:00:00
11111 A 2021-08-01 6:05:00
11111 A 2021-08-01 6:07:00
11111 A 2021-08-01 6:08:00
11111 B 2021-08-01 7:10:00
11111 B 2021-08-01 7:13:00
11111 B 2021-08-01 7:15:00
11111 B 2021-08-01 7:25:00
11111 A 2021-08-01 8:10:00
11111 A 2021-08-01 8:12:00
11111 A 2021-08-01 8:16:00
11111 A 2021-08-01 8:19:00
22222 A 2021-08-01 6:00:00
22222 A 2021-08-01 6:05:00
22222 A 2021-08-01 6:07:00
22222 A 2021-08-01 6:08:00
22222 B 2021-08-01 7:10:00
22222 B 2021-08-01 7:13:00
22222 B 2021-08-01 7:15:00
22222 B 2021-08-01 7:25:00
22222 A 2021-08-01 8:10:00
22222 A 2021-08-01 8:12:00
22222 A 2021-08-01 8:16:00
22222 A 2021-08-01 8:19:00

The result I want to get is the following:

STATION_NUMBER PART_NO START_BOOK_DATE END_BOOK_DATE
11111 A 2021-08-01 6:00:00 2021-08-01 6:08:00
11111 B 2021-08-01 7:10:00 2021-08-01 7:25:00
11111 A 2021-08-01 8:10:00 2021-08-01 8:19:00
22222 A 2021-08-01 6:00:00 2021-08-01 6:08:00
22222 B 2021-08-01 7:10:00 2021-08-01 7:25:00
22222 A 2021-08-01 8:10:00 2021-08-01 8:19:00

I tried to solve it with this query, but it does not give me what I expected:

SELECT PART_NO,
      STATION_NUMBER,
      GROUP_NUMBER,
      MIN(BOOK_DATE) START_BOOK_DATE,
      MAX(BOOK_DATE) END_BOOK_DATE
FROM(
    SELECT PART_NO,
           STATION_NUMBER,
           BOOK_DATE,
           IS_CHANGED,
           RANK() OVER (ORDER BY PART_NO,IS_CHANGED) GROUP_NUMBER
    FROM(
        SELECT PART_NO,
        STATION_NUMBER,
        BOOK_DATE,
        CASE 
            WHEN NOT LEAD(PART_NO, 1) OVER (ORDER BY BOOK_DATE) = PART_NO
            THEN ROWNUM
            ELSE 0
        END IS_CHANGED
        FROM PROD_DATA
        WHERE STATION_NUMBER in ('11111','22222')
        AND BOOK_DATE BETWEEN TO_TIMESTAMP('01.08.2021 05:00:00', 'DD.MM.YYYY HH24:MI:SS') and TO_TIMESTAMP('01.08.2021 12:00:00', 'DD.MM.YYYY HH24:MI:SS')
        ORDER BY BOOK_DATE
    )ORDER BY BOOK_DATE
) GROUP BY STATION_NUMBER, PART_NO, GROUP_NUMBER

I have to group by STATION_NUMBER and PART_NO, but I need the first and last BOOK_DATE from a chronological point of view. A change in PART_NO and/or STATION_NUMBER is my trigger for starting a new row.

【Comments】:

  • Thanks for the hint. I will update it.
  • Updated. It is my first try :)

Tags: sql oracle gaps-and-islands


【Solution 1】:

From Oracle 12 onwards, this is what MATCH_RECOGNIZE is for:

SELECT *
FROM   prod_date
MATCH_RECOGNIZE(
  PARTITION BY station_number
  ORDER     BY book_date
  MEASURES
    FIRST(part_no) AS part_no,
    FIRST(book_date) AS start_book_date,
    LAST(book_date) AS end_book_date
  ONE ROW PER MATCH
  PATTERN (same_part+)
  DEFINE
    same_part AS FIRST(part_no) = part_no
)

Which, for the sample data:

CREATE TABLE prod_date (STATION_NUMBER, PART_NO, BOOK_DATE) AS
SELECT 11111,   'A',    DATE '2021-08-01' + INTERVAL '6:00:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 11111,   'A',    DATE '2021-08-01' + INTERVAL '6:05:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 11111,   'A',    DATE '2021-08-01' + INTERVAL '6:07:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 11111,   'A',    DATE '2021-08-01' + INTERVAL '6:08:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 11111,   'B',    DATE '2021-08-01' + INTERVAL '7:10:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 11111,   'B',    DATE '2021-08-01' + INTERVAL '7:13:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 11111,   'B',    DATE '2021-08-01' + INTERVAL '7:15:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 11111,   'B',    DATE '2021-08-01' + INTERVAL '7:25:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 11111,   'A',    DATE '2021-08-01' + INTERVAL '8:10:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 11111,   'A',    DATE '2021-08-01' + INTERVAL '8:12:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 11111,   'A',    DATE '2021-08-01' + INTERVAL '8:16:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 11111,   'A',    DATE '2021-08-01' + INTERVAL '8:19:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'A',    DATE '2021-08-01' + INTERVAL '6:00:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'A',    DATE '2021-08-01' + INTERVAL '6:05:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'A',    DATE '2021-08-01' + INTERVAL '6:07:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'A',    DATE '2021-08-01' + INTERVAL '6:08:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'B',    DATE '2021-08-01' + INTERVAL '7:10:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'B',    DATE '2021-08-01' + INTERVAL '7:13:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'B',    DATE '2021-08-01' + INTERVAL '7:15:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'B',    DATE '2021-08-01' + INTERVAL '7:25:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'A',    DATE '2021-08-01' + INTERVAL '8:10:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'A',    DATE '2021-08-01' + INTERVAL '8:12:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'A',    DATE '2021-08-01' + INTERVAL '8:16:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'A',    DATE '2021-08-01' + INTERVAL '8:19:00' HOUR TO SECOND FROM DUAL;

Outputs:

STATION_NUMBER PART_NO START_BOOK_DATE END_BOOK_DATE
11111 A 2021-08-01 06:00:00 2021-08-01 06:08:00
11111 B 2021-08-01 07:10:00 2021-08-01 07:25:00
11111 A 2021-08-01 08:10:00 2021-08-01 08:19:00
22222 A 2021-08-01 06:00:00 2021-08-01 06:08:00
22222 B 2021-08-01 07:10:00 2021-08-01 07:25:00
22222 A 2021-08-01 08:10:00 2021-08-01 08:19:00

db<>fiddle here
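
A variation on the same idea, in case the island number or the row count per island is also useful. This is only a sketch against the prod_date sample table above: the pattern variable `strt` and the output columns `island_no` and `bookings` are names I made up for illustration, and the pattern is written with PREV instead of FIRST, which should match the same runs.

```sql
SELECT *
FROM   prod_date
MATCH_RECOGNIZE(
  PARTITION BY station_number
  ORDER     BY book_date
  MEASURES
    MATCH_NUMBER()   AS island_no,        -- 1, 2, 3 ... restarting for each station
    FIRST(part_no)   AS part_no,
    FIRST(book_date) AS start_book_date,
    LAST(book_date)  AS end_book_date,
    COUNT(*)         AS bookings          -- number of rows folded into this island
  ONE ROW PER MATCH
  -- strt has no DEFINE entry, so it matches any row and opens a new island
  PATTERN (strt same_part*)
  DEFINE
    same_part AS part_no = PREV(part_no)  -- extend the island while the part is unchanged
)
ORDER BY station_number, island_no;
```

On the sample data every island holds four bookings, so `bookings` should come back as 4 on each of the six rows.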

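【Comments】:

  • Hi @MT0, thank you for the dbfiddle. That is very helpful.
    I have Oracle 11g. Now I know what I can improve in my next post :)
  • @edding - Evidently, one thing you (and all other posters) can improve is to always state your Oracle version. Here you can see a perfect example of why that matters so much.
【Solution 2】: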

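Your problem belongs to the class of problems known as "gaps and islands" (search for that phrase if you want to research it further).

In Oracle 11 and earlier, you can use analytic functions to get the desired result. The approach is known as the "tabibitosan method" or the "fixed differences method".

The key step comes first (in the subquery in the with clause below): compute row numbers twice, once partitioned by station number only and once partitioned by station number and part number, then subtract one from the other. Within a run of consecutive rows with the same part number the difference is constant; when a new run starts, the difference jumps to a different value. The outer query then uses that difference for grouping.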

with 
  prep as (
    select pd.*,
           row_number() over (partition by station_number order by book_date)
         - row_number() over (partition by station_number, part_no
                                  order by book_date) as grp
    from   prod_date pd
  )
select station_number, part_no, min(book_date) as start_book_date,
       max(book_date) as end_book_date
from   prep
group  by station_number, part_no, grp
order  by station_number, start_book_date
;

STATION_NUMBER PART_NO START_BOOK_DATE     END_BOOK_DATE      
-------------- ------- ------------------- -------------------
         11111 A       2021-08-01 06:00:00 2021-08-01 06:08:00
         11111 B       2021-08-01 07:10:00 2021-08-01 07:25:00
         11111 A       2021-08-01 08:10:00 2021-08-01 08:19:00
         22222 A       2021-08-01 06:00:00 2021-08-01 06:08:00
         22222 B       2021-08-01 07:10:00 2021-08-01 07:25:00
         22222 A       2021-08-01 08:10:00 2021-08-01 08:19:00

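【Comments】:

    【Solution 3】:
    • In inline view t, the last row of each subgroup is marked in the gaps column.
    • In inline view tt, every row that has a null in the gaps column of inline view t is filled in using the first_value analytic function.
    • Finally, I group the rows of inline view tt by the STATION_NUMBER, PART_NO and GRP columns, then use the min and max aggregate functions to get the desired output.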
    select STATION_NUMBER, PART_NO, min(BOOK_DATE) START_BOOK_DATE, max(BOOK_DATE) END_BOOK_DATE
    from (
      select STATION_NUMBER, PART_NO, BOOK_DATE, GAPS
          , FIRST_VALUE(GAPS ignore nulls) 
              over( partition by STATION_NUMBER order by BOOK_DATE 
                    ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ) grp
      from (
        select 
          STATION_NUMBER, PART_NO, BOOK_DATE
          , case 
              when PART_NO != lead(PART_NO, 1, '-'||PART_NO)over(partition by STATION_NUMBER order by BOOK_DATE)
                then row_number()over(partition by STATION_NUMBER order by BOOK_DATE)
              else null
            end gaps
        from PROD_DATA
      )t
    )tt
    group by STATION_NUMBER, PART_NO, GRP
    order by STATION_NUMBER, GRP
    ;
    

    demo
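
If the three steps above are hard to picture, running just the two inline views without the final GROUP BY shows what they produce. This is only a sketch that reuses the inner part of the query above, with the same PROD_DATA table and column names; nothing new is introduced apart from the final ORDER BY.

```sql
-- Inline views t and tt only: show GAPS and GRP for every row.
select STATION_NUMBER, PART_NO, BOOK_DATE, GAPS
     -- look forward from the current row to the next non-null marker
     , first_value(GAPS ignore nulls)
         over (partition by STATION_NUMBER order by BOOK_DATE
               rows between current row and unbounded following) as GRP
from (
  select STATION_NUMBER, PART_NO, BOOK_DATE
       , case
           -- the default '-'||PART_NO never equals PART_NO, so the last row
           -- of each station is also marked as the end of a subgroup
           when PART_NO != lead(PART_NO, 1, '-'||PART_NO)
                             over (partition by STATION_NUMBER order by BOOK_DATE)
             then row_number() over (partition by STATION_NUMBER order by BOOK_DATE)
         end as GAPS
  from PROD_DATA
) t
order by STATION_NUMBER, BOOK_DATE;
```

With the sample data from the question, GAPS is null everywhere except on the 4th, 8th and 12th row of each station (values 4, 8 and 12), and GRP carries those values backwards, so the three runs get GRP 4, 8 and 12.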

    【Comments】:
