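【Question title】: Finding a 'run' of rows from an ordered result set
【Posted】: 2013-09-19 07:52:51
【Question】:

I'm trying to figure out a way to identify a "run" of results (successive rows, in order) that meet some condition. Currently I order a result set and scan it by eye for particular patterns. Here is an example: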

SELECT the_date, name
FROM orders
WHERE 
    the_date BETWEEN 
        to_date('2013-09-18',..) AND 
        to_date('2013-09-22', ..)
ORDER BY the_date

--------------------------------------
the_date            | name
--------------------------------------
2013-09-18 00:00:01 | John
--------------------------------------
2013-09-19 00:00:01 | James
--------------------------------------
2013-09-20 00:00:01 | John
--------------------------------------
2013-09-20 00:00:02 | John
--------------------------------------
2013-09-20 00:00:03 | John
--------------------------------------
2013-09-20 00:00:04 | John
--------------------------------------
2013-09-21 16:00:01 | Jennifer
--------------------------------------

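From this result set I'd like to extract all the rows attributed to John on 2013-09-20. More generally, I'm looking for a run of results from the same name, with 3 or more rows in a row. I'm using Oracle 11, but I'd like to know whether this can be done in strict SQL or whether some kind of analytic function is required.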

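【Comments】:

  • Could you please post the expected output? What do you mean by a run?
  • @realspirituals I explained the expected output and what a "run" of results means. Is it unclear?
  • Take a look at my post and confirm what you are looking for...

Tags: sql oracle gaps-and-islands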


【Solution 1】:

You need several nested window functions:

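SELECT *
FROM
 (
   -- Step 3: count the rows in each run
   SELECT the_date, name, grp,
      COUNT(*) OVER (PARTITION BY grp) AS cnt
   FROM
    (
      -- Step 2: a running sum of the flags gives every run its own number
      SELECT the_date, name,
         SUM(flag) OVER (ORDER BY the_date) AS grp
      FROM
       (
         -- Step 1: flag = 1 where the name differs from the previous row (a new run starts)
         SELECT the_date, name,
            CASE WHEN LAG(name) OVER (ORDER BY the_date) = name THEN 0 ELSE 1 END AS flag
         FROM orders
         WHERE
             the_date BETWEEN
                 TO_DATE('2013-09-18',..) AND
                 TO_DATE('2013-09-22', ..)
       ) dt
    ) dt
 ) dt
-- Keep only runs of 3 or more rows
WHERE cnt >= 3
ORDER BY the_date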
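
A minimal sketch of what each nesting level in the query above produces, run against the question's sample rows. It is not the answerer's code: Python's built-in sqlite3 module stands in for Oracle so the example can be run anywhere (this assumes the bundled SQLite is 3.25 or newer, the release that added window functions), the date filter is left out because its format arguments are elided in the original, and the names ROWS, QUERY and label_runs are made up for illustration.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("2013-09-18 00:00:01", "John"),
    ("2013-09-19 00:00:01", "James"),
    ("2013-09-20 00:00:01", "John"),
    ("2013-09-20 00:00:02", "John"),
    ("2013-09-20 00:00:03", "John"),
    ("2013-09-20 00:00:04", "John"),
    ("2013-09-21 16:00:01", "Jennifer"),
]

# Same three steps as the answer, but every intermediate column is kept
# and the final "cnt >= 3" filter is omitted so all rows stay visible.
QUERY = """
SELECT the_date, name, flag, grp,
       COUNT(*) OVER (PARTITION BY grp) AS cnt
FROM (
    SELECT the_date, name, flag,
           SUM(flag) OVER (ORDER BY the_date) AS grp
    FROM (
        SELECT the_date, name,
               CASE WHEN LAG(name) OVER (ORDER BY the_date) = name
                    THEN 0 ELSE 1 END AS flag
        FROM orders
    ) AS flagged
) AS numbered
ORDER BY the_date
"""


def label_runs(rows):
    """Load rows into an in-memory table and return them with flag, grp, cnt."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (the_date TEXT, name TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


labelled = label_runs(ROWS)
for row in labelled:
    print(row)
```

The flag column marks the first row of each run, the running sum turns those marks into a run number, and the count per run number is the run length. Filtering on cnt >= 3 then leaves only the four John rows from 2013-09-20. The timestamps here are unique; if the_date could contain ties, the ORDER BY would probably need a tiebreaker column to make the runs deterministic.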

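【Comments】:

  • Very cool solution. I didn't know about the LAG function. I'll also spend some time over the next few days getting properly familiar with OVER.
【Solution 2】:

Try this: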

WITH ORDERS
    AS (SELECT
             TO_DATE ( '2013-09-18 00:00:01',
                     'YYYY-MM-DD HH24:MI:SS' )
                 AS THE_DATE,
             'John' AS NAME
        FROM
             DUAL
        UNION ALL
        SELECT
             TO_DATE ( '2013-09-19 00:00:01',
                     'YYYY-MM-DD HH24:MI:SS' )
                 AS THE_DATE,
             'James' AS NAME
        FROM
             DUAL
        UNION ALL
        SELECT
             TO_DATE ( '2013-09-20 00:00:01',
                     'YYYY-MM-DD HH24:MI:SS' )
                 AS THE_DATE,
             'John' AS NAME
        FROM
             DUAL
        UNION ALL
        SELECT
             TO_DATE ( '2013-09-20 00:00:02',
                     'YYYY-MM-DD HH24:MI:SS' )
                 AS THE_DATE,
             'John' AS NAME
        FROM
             DUAL
        UNION ALL
        SELECT
             TO_DATE ( '2013-09-20 00:00:03',
                     'YYYY-MM-DD HH24:MI:SS' )
                 AS THE_DATE,
             'John' AS NAME
        FROM
             DUAL
        UNION ALL
        SELECT
             TO_DATE ( '2013-09-20 00:00:04',
                     'YYYY-MM-DD HH24:MI:SS' )
                 AS THE_DATE,
             'John' AS NAME
        FROM
             DUAL
        UNION ALL
        SELECT
             TO_DATE ( '2013-09-21 16:00:01',
                     'YYYY-MM-DD HH24:MI:SS' )
                 AS THE_DATE,
             'Jennifer' AS NAME
        FROM
             DUAL)
SELECT
      B.*
FROM
      (SELECT
            TRUNC ( THE_DATE ) THE_DATE,
            NAME,
            COUNT ( * )
       FROM
            ORDERS
       WHERE
            THE_DATE BETWEEN TRUNC ( TO_DATE ( '2013-09-18',
                                        'YYYY-MM-DD' ) )
                       AND TRUNC ( TO_DATE ( '2013-09-22',
                                        'YYYY-MM-DD' ) )
       GROUP BY
            TRUNC ( THE_DATE ),
            NAME
       HAVING
            COUNT ( * ) >= 3) A,
      ORDERS B
WHERE
      A.NAME = B.NAME
      AND TRUNC ( A.THE_DATE ) = TRUNC ( B.THE_DATE );

Output

9/20/2013 12:00:01 AM   John
9/20/2013 12:00:02 AM   John
9/20/2013 12:00:03 AM   John
9/20/2013 12:00:04 AM   John

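【Comments】:

  • Sorry, I didn't explain myself properly when defining a "run", and I didn't provide a good enough data sample. What I'm looking for is 3 or more rows by the same person appearing in sequence. If Jane had 4 rows on the same day but they were not in sequence, her data should not be returned. I'll update the question, though.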
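
The distinction drawn in this comment can be illustrated with a small sketch. The data is hypothetical (Jane and Bob do not appear in the question's sample): Jane has four rows on one day but never three in a row, while Bob finishes with a run of three. Python's sqlite3 module stands in for Oracle so the sketch is runnable (assuming SQLite 3.25 or newer for window functions), substr plays the role of TRUNC on the date, and the names ROWS, BY_DAY, BY_RUN and run_query are made up for illustration.

```python
import sqlite3

# Hypothetical rows: Jane appears 4 times, Bob 5 times, all on the same day.
ROWS = [
    ("2013-09-20 00:00:01", "Jane"),
    ("2013-09-20 00:00:02", "Bob"),
    ("2013-09-20 00:00:03", "Jane"),
    ("2013-09-20 00:00:04", "Jane"),
    ("2013-09-20 00:00:05", "Bob"),
    ("2013-09-20 00:00:06", "Jane"),
    ("2013-09-20 00:00:07", "Bob"),
    ("2013-09-20 00:00:08", "Bob"),
    ("2013-09-20 00:00:09", "Bob"),
]

# Per-day counting: rows of anyone with >= 3 rows on a day, in any order.
BY_DAY = """
SELECT o.the_date, o.name
FROM orders AS o
JOIN (
    SELECT substr(the_date, 1, 10) AS the_day, name
    FROM orders
    GROUP BY substr(the_date, 1, 10), name
    HAVING COUNT(*) >= 3
) AS a
  ON a.name = o.name AND a.the_day = substr(o.the_date, 1, 10)
ORDER BY o.the_date
"""

# Run-based counting: rows that belong to >= 3 consecutive rows of one name.
BY_RUN = """
SELECT the_date, name
FROM (
    SELECT the_date, name, COUNT(*) OVER (PARTITION BY grp) AS cnt
    FROM (
        SELECT the_date, name,
               SUM(flag) OVER (ORDER BY the_date) AS grp
        FROM (
            SELECT the_date, name,
                   CASE WHEN LAG(name) OVER (ORDER BY the_date) = name
                        THEN 0 ELSE 1 END AS flag
            FROM orders
        ) AS flagged
    ) AS numbered
) AS counted
WHERE cnt >= 3
ORDER BY the_date
"""


def run_query(sql, rows):
    """Execute sql against a fresh in-memory orders table holding rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (the_date TEXT, name TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


by_day = run_query(BY_DAY, ROWS)
by_run = run_query(BY_RUN, ROWS)
print(len(by_day), "rows from per-day counting")
print(by_run)
```

On this data the per-day query returns all nine rows, including Jane's, while the run-based query returns only Bob's last three rows.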