【Title】: Group by on column that repeats
【Posted】: 2018-04-24 20:24:58
【Question】:

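I'm struggling to put this question into words, which is probably why I can't find an existing example, so here is what I'm trying to do.

I have a table like this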

    | counter|      timestamp      |
    |   1    | 2018-01-01T11:11:01 |
    |   1    | 2018-01-01T11:11:02 |
    |   1    | 2018-01-01T11:11:03 |
    |   2    | 2018-01-01T11:11:04 |
    |   2    | 2018-01-01T11:11:05 |
    |   3    | 2018-01-01T11:11:06 |
    |   3    | 2018-01-01T11:11:07 |
    |   1    | 2018-01-01T11:11:08 |
    |   1    | 2018-01-01T11:11:09 |
    |   1    | 2018-01-01T11:11:10 |

What I want is to group by each consecutive run of the counter, so that if I run a query like

SELECT counter, MIN(timestamp) as st, MAX(timestamp) as et 
FROM table 
GROUP BY counter;

the result is

    | counter |          st         |         et          |
    |   1     | 2018-01-01T11:11:01 | 2018-01-01T11:11:03 |
    |   2     | 2018-01-01T11:11:04 | 2018-01-01T11:11:05 |
    |   3     | 2018-01-01T11:11:06 | 2018-01-01T11:11:07 |
    |   1     | 2018-01-01T11:11:08 | 2018-01-01T11:11:10 |

rather than what actually happens, which is

    | counter |          st         |         et          |
    |   1     | 2018-01-01T11:11:01 | 2018-01-01T11:11:10 |
    |   2     | 2018-01-01T11:11:04 | 2018-01-01T11:11:05 |
    |   3     | 2018-01-01T11:11:06 | 2018-01-01T11:11:07 |

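So I want something that combines grouping and partitioning, ideally without nested queries.

【Comments】:

  • There is no question here.
  • Updated to clarify, @STLDeveloper.

Tags: sql postgresql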


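【Solution 1】:

You have to identify the groups of consecutive rows that share the same counter value. This can be done with two window functions, lag() and a cumulative sum(): lag() flags each row where the counter differs from the previous row, and the running sum of those flags numbers the runs.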

select counter, min(timestamp) as st, max(timestamp) as et
from (
    select counter, timestamp, sum(grp) over w as grp
    from (
        select *, (lag(counter, 1, 0) over w <> counter)::int as grp
        from my_table
        window w as (order by timestamp)
        ) s
    window w as (order by timestamp)
    ) s
group by counter, grp
order by st

DbFiddle.
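
To make the two steps easier to follow, here is a minimal Python sketch of the same idea outside the database. It is only an illustration under assumptions: the function name `islands_by_lag` and the in-memory `rows` list are hypothetical, and the timestamps are kept as ISO strings so that plain string comparison orders them correctly.

```python
from itertools import accumulate

# Sample rows from the question: (counter, timestamp).
rows = [
    (1, "2018-01-01T11:11:01"), (1, "2018-01-01T11:11:02"),
    (1, "2018-01-01T11:11:03"), (2, "2018-01-01T11:11:04"),
    (2, "2018-01-01T11:11:05"), (3, "2018-01-01T11:11:06"),
    (3, "2018-01-01T11:11:07"), (1, "2018-01-01T11:11:08"),
    (1, "2018-01-01T11:11:09"), (1, "2018-01-01T11:11:10"),
]


def islands_by_lag(rows):
    """Return (counter, st, et) per consecutive run of equal counters."""
    ordered = sorted(rows, key=lambda r: r[1])  # order by timestamp
    counters = [c for c, _ in ordered]
    # Equivalent of lag(counter, 1, 0): previous counter, 0 for the first row.
    previous = [0] + counters[:-1]
    # 1 where the counter changed (a new run starts), 0 otherwise.
    flags = [int(p != c) for p, c in zip(previous, counters)]
    # The running sum of the flags gives every run its own group number.
    groups = list(accumulate(flags))
    spans = {}
    for (counter, ts), grp in zip(ordered, groups):
        st, et = spans.get((counter, grp), (ts, ts))
        spans[(counter, grp)] = (min(st, ts), max(et, ts))
    # Equivalent of "order by st".
    return sorted(
        ((c, st, et) for (c, _), (st, et) in spans.items()),
        key=lambda r: r[1],
    )


for row in islands_by_lag(rows):
    print(row)
```

For the sample data the flags are 1,0,0,1,0,1,0,1,0,0 and the running sum is 1,1,1,2,2,3,3,4,4,4, so the second run of counter 1 lands in group 4 instead of being merged with the first.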


    【Solution 2】:

    You need to compute a new group number for each run:

    create table tbl(counter int, ts timestamp);
    insert into tbl values
        (1, '2018-01-01T11:11:01'),
        (1, '2018-01-01T11:11:02'),
        (1, '2018-01-01T11:11:03'),
        (2, '2018-01-01T11:11:04'),
        (2, '2018-01-01T11:11:05'),
        (3, '2018-01-01T11:11:06'),
        (3, '2018-01-01T11:11:07'),
        (1, '2018-01-01T11:11:08'),
        (1, '2018-01-01T11:11:09'),
        (1, '2018-01-01T11:11:10');
    
    ✓ 10 rows affected
    select min(counter) as counter, min(ts) as st, max(ts) as et
    from
    (
        select counter, ts, sum(rst) over (order by ts) as grp
        from 
             (
             select counter, ts,
                    case when coalesce(lag(counter) over (order by ts), -1) <> counter then 1 end rst
             from   tbl
             ) t1
    ) t2
    group by grp
    
    counter | st                  | et                 
    ------: | :------------------ | :------------------
          3 | 2018-01-01 11:11:06 | 2018-01-01 11:11:07
          1 | 2018-01-01 11:11:08 | 2018-01-01 11:11:10
          2 | 2018-01-01 11:11:04 | 2018-01-01 11:11:05
          1 | 2018-01-01 11:11:01 | 2018-01-01 11:11:03

    dbfiddle here


      【Solution 3】:

      You can use ranking functions:

      select counter, min(timestamp) st, max(timestamp) et
      from (select *, 
                     row_number() over (order by timestamp) Seq1,
                     row_number() over (partition by counter order by timestamp) Seq2 
            from my_table 
           ) t
      group by counter, (Seq1-Seq2);
      

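      This takes the difference of the two ranking functions (Seq1-Seq2) and uses it in the GROUP BY clause. Within a run of consecutive rows with the same counter, both row numbers increase by one per row, so their difference stays constant and identifies the run.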
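
The row-number difference can be traced by hand with a small Python sketch. This is an illustration only, not part of the answer: the names `seq_differences` and `islands_by_row_number` and the in-memory `rows` list are hypothetical, and timestamps are ISO strings so they sort correctly as text.

```python
from collections import defaultdict

# Sample rows from the question: (counter, timestamp).
rows = [
    (1, "2018-01-01T11:11:01"), (1, "2018-01-01T11:11:02"),
    (1, "2018-01-01T11:11:03"), (2, "2018-01-01T11:11:04"),
    (2, "2018-01-01T11:11:05"), (3, "2018-01-01T11:11:06"),
    (3, "2018-01-01T11:11:07"), (1, "2018-01-01T11:11:08"),
    (1, "2018-01-01T11:11:09"), (1, "2018-01-01T11:11:10"),
]


def seq_differences(rows):
    """Return Seq1 - Seq2 for every row, in timestamp order."""
    ordered = sorted(rows, key=lambda r: r[1])
    seen = defaultdict(int)  # rows seen so far per counter value
    diffs = []
    for seq1, (counter, _) in enumerate(ordered, start=1):
        seen[counter] += 1
        seq2 = seen[counter]  # row_number() partitioned by counter
        diffs.append(seq1 - seq2)
    return diffs


def islands_by_row_number(rows):
    """Return (counter, st, et) grouped by (counter, Seq1 - Seq2)."""
    ordered = sorted(rows, key=lambda r: r[1])
    spans = {}
    for (counter, ts), diff in zip(ordered, seq_differences(rows)):
        st, et = spans.get((counter, diff), (ts, ts))
        spans[(counter, diff)] = (min(st, ts), max(et, ts))
    return sorted(
        ((c, st, et) for (c, _), (st, et) in spans.items()),
        key=lambda r: r[1],
    )


print(seq_differences(rows))  # [0, 0, 0, 3, 3, 5, 5, 4, 4, 4]
for row in islands_by_row_number(rows):
    print(row)
```

The two runs of counter 1 get differences 0 and 4, which is why the grouping has to be on both counter and the difference: the difference alone is only guaranteed to separate runs of the same counter value.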
