【Question title】: Group consecutive rows based on one column
【Posted】: 2019-04-12 14:42:20
【Question】:

Suppose select * from journeys returns this table:

timestamp     | inJourney (1 = true and 0 = false)
--------------------------------------------------
time1         | 1
time2         | 1
time3         | 1
time4         | 0
time5         | 0
time6         | 1
time7         | 1
time8         | 1

Expected:

timestamp     | inJourney (1 = true and 0 = false)
--------------------------------------------------
time1         | 1
time4         | 0
time8         | 1

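Note: the timestamps are not important, because I only want to count the number of journeys.

Any idea what I should do?

【Question discussion】:

Tags: sql postgresql gaps-and-islands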


【Solution 1】:

This is a gaps-and-islands problem. Use the difference of two row_number() values:

select injourney, min(timestamp), max(timestamp)
from (select j.*,
             -- position of the row in the whole table
             row_number() over (order by timestamp) as seqnum,
             -- position of the row among rows with the same injourney value
             row_number() over (partition by injourney order by timestamp) as seqnum_i
      from journeys j
     ) t
group by injourney, (seqnum - seqnum_i)
order by min(timestamp);
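
To see why the difference works, it can help to look at the intermediate values. The following is a minimal sketch, not part of the original answer: it assumes the sample data from the question, with integer values 1 to 8 in a hypothetical column ts standing in for time1 to time8, and it names the difference grp for illustration.

```sql
-- Sample data mirroring the question; ts 1..8 stands in for time1..time8 (assumption).
with journeys(ts, injourney) as (
    values (1, 1), (2, 1), (3, 1), (4, 0), (5, 0), (6, 1), (7, 1), (8, 1)
)
select ts,
       injourney,
       row_number() over (order by ts) as seqnum,
       row_number() over (partition by injourney order by ts) as seqnum_i,
       -- constant within each run of consecutive rows that share injourney
       row_number() over (order by ts)
         - row_number() over (partition by injourney order by ts) as grp
from journeys
order by ts;

--  ts | injourney | seqnum | seqnum_i | grp
-- ----+-----------+--------+----------+-----
--   1 |         1 |      1 |        1 |   0
--   2 |         1 |      2 |        2 |   0
--   3 |         1 |      3 |        3 |   0
--   4 |         0 |      4 |        1 |   3
--   5 |         0 |      5 |        2 |   3
--   6 |         1 |      6 |        4 |   2
--   7 |         1 |      7 |        5 |   2
--   8 |         1 |      8 |        6 |   2
```

Within a run, both counters increase by one per row, so their difference stays fixed; when the value of injourney changes, only the global counter keeps advancing, so the difference shifts. The difference alone is not guaranteed to be unique across different injourney values, which is why the query groups by both injourney and the difference.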

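【Discussion】:

  • Thank you very much. I didn't know about this kind of problem (gaps and islands), but now I do. I had been at this for hours. You deserve a wonderful weekend. @GordonLinoff
【Solution 2】:

This is a gaps-and-islands problem. You can use the ROW_NUMBER window function to compute a group number for each run of consecutive rows, then take MIN(timestamp) within each group.

You can try this.

Query #1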

SELECT MIN(timestamp),inJourney 
FROM (
SELECT *,
    ROW_NUMBER() OVER(ORDER BY timestamp)  - ROW_NUMBER() OVER(PARTITION BY inJourney ORDER BY timestamp) grp
  FROM journeys
) t1
GROUP BY grp,inJourney 
ORDER BY MIN(timestamp);

| min   | injourney |
| ----- | --------- |
| time1 | 1         |
| time4 | 0         |
| time6 | 1         |
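
Since the question only needs the number of journeys, the count can also be taken directly. The sketch below swaps in a different technique from the answers above: it uses lag() to find the first row of each journey and counts those rows. The sample data, the column name ts, and the names marked, prev_injourney, and journey_count are assumptions made for illustration.

```sql
-- Sample data mirroring the question; ts 1..8 stands in for time1..time8 (assumption).
with journeys(ts, injourney) as (
    values (1, 1), (2, 1), (3, 1), (4, 0), (5, 0), (6, 1), (7, 1), (8, 1)
),
marked as (
    select ts,
           injourney,
           -- value of injourney on the previous row; null for the first row
           lag(injourney) over (order by ts) as prev_injourney
    from journeys
)
select count(*) as journey_count
from marked
where injourney = 1
  -- a journey starts where the previous row is not in a journey (or does not exist)
  and prev_injourney is distinct from 1;

-- journey_count = 2 for the sample data (journeys start at ts 1 and ts 6)
```

The same number can be obtained from Query #1 by counting the result rows where injourney is 1.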


【Discussion】:
