【Question title】:How to get adjacent value in an OVER() window
【Posted】:2020-11-23 00:13:26
【Question description】:

I have the following data and a query that computes MAX(wins) up through the current season:

WITH results as (
    SELECT 'DAL' as team, 2010 as season, 6 as wins union
    SELECT 'DET' as team, 2010 as season, 6 as wins union
    SELECT 'DET' as team, 2011 as season, 10 as wins union
    SELECT 'DET' as team, 2012 as season, 4 as wins union
    SELECT 'DET' as team, 2013 as season, 7 as wins union
    SELECT 'DET' as team, 2014 as season, 11 as wins
) SELECT team, season, wins
    ,MAX(wins) OVER (PARTITION BY team ORDER BY season ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) max_wins_thus_far
FROM results;

# team, season, wins, max_wins_thus_far
DAL, 2010, 6, 6
DET, 2010, 6, 6
DET, 2011, 10, 10
DET, 2012, 4, 10
DET, 2013, 7, 10
DET, 2014, 11, 11

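Here we can see, for example, that for DET the maximum number of wins as of 2011 is 10, so the max_wins_thus_far column stays at 10 from 2011 through 2013, and becomes 11 in 2014 when a larger value appears. What I would like, though, is the season that had the most wins up to that point. For example, the result would look like this: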

# team, season, wins, max_wins_thus_far, season_with_max_wins_thus_far
DAL, 2010, 6, 6, 2010
DET, 2010, 6, 6, 2010
DET, 2011, 10, 10, 2011 <-- 2011 has the most wins for DET
DET, 2012, 4, 10, 2011
DET, 2013, 7, 10, 2011
DET, 2014, 11, 11, 2014 <-- now 2014 is the season with the most wins...
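
To pin down the expected semantics, here is a minimal plain-Python sketch of the desired column. It is not part of the original question: the function name running_best is made up for illustration, and keeping the earliest season on ties is an assumption the question does not state.

```python
from itertools import groupby
from operator import itemgetter

# Sample rows from the question: (team, season, wins)
RESULTS = [
    ("DAL", 2010, 6),
    ("DET", 2010, 6),
    ("DET", 2011, 10),
    ("DET", 2012, 4),
    ("DET", 2013, 7),
    ("DET", 2014, 11),
]


def running_best(rows):
    """Return (team, season, wins, max_wins_thus_far, best_season) tuples."""
    out = []
    ordered = sorted(rows, key=itemgetter(0, 1))  # by team, then season
    for team, group in groupby(ordered, key=itemgetter(0)):
        best_wins, best_season = None, None
        for _, season, wins in group:
            # Strict ">" keeps the earliest season on ties (an assumption).
            if best_wins is None or wins > best_wins:
                best_wins, best_season = wins, season
            out.append((team, season, wins, best_wins, best_season))
    return out


for row in running_best(RESULTS):
    print(row)
```

Any SQL answer can be checked against this loop on the sample rows.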

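How can I do this with an analytic function? The best I have managed is to build an object with the data, but I don't know where to go from there: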

# team, season, wins, max_wins_thus_far
DAL, 2010, 6, {"2010": 6}
DET, 2010, 6, {"2010": 6}
DET, 2011, 10, {"2010": 6, "2011": 10}
DET, 2012, 4, {"2010": 6, "2011": 10, "2012": 4}
DET, 2013, 7, {"2010": 6, "2011": 10, "2012": 4, "2013": 7}
DET, 2014, 11, {"2010": 6, "2011": 10, "2012": 4, "2013": 7, "2014": 11}

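【Comments】:

  • I think we can do this with lag.
  • Edit your question and show the results you want.
  • @GordonLinoff It's right there in the second-to-last code section, isn't it?

Tags: mysql sql window-functions gaps-and-islands analytic-functions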


【Solution 1】:

You can use a second level of window functions. Just grab the most recent season whose wins equal the running maximum:

SELECT r.*,
       MAX(CASE WHEN wins = max_wins_thus_far THEN season END) OVER (PARTITION BY team ORDER BY season) as max_season
FROM (SELECT team, season, wins,
             MAX(wins) OVER (PARTITION BY team ORDER BY season) as max_wins_thus_far
      FROM results
     ) r;

Here is a dbfiddle.
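
For readers without a MySQL instance at hand, the query above can be tried locally. The sketch below is an assumption-laden stand-in, not the linked fiddle: it runs the same SQL through Python's built-in sqlite3 module (which needs SQLite 3.25 or later for window functions), reuses the CTE from the question, and adds an ORDER BY so the output order is deterministic.

```python
import sqlite3

# Same statement as in the answer, with the question's CTE prepended.
# Assumption: SQLite (>= 3.25) is used here instead of MySQL.
QUERY = """
WITH results AS (
    SELECT 'DAL' AS team, 2010 AS season, 6 AS wins UNION
    SELECT 'DET' AS team, 2010 AS season, 6 AS wins UNION
    SELECT 'DET' AS team, 2011 AS season, 10 AS wins UNION
    SELECT 'DET' AS team, 2012 AS season, 4 AS wins UNION
    SELECT 'DET' AS team, 2013 AS season, 7 AS wins UNION
    SELECT 'DET' AS team, 2014 AS season, 11 AS wins
)
SELECT r.*,
       MAX(CASE WHEN wins = max_wins_thus_far THEN season END)
           OVER (PARTITION BY team ORDER BY season) AS max_season
FROM (SELECT team, season, wins,
             MAX(wins) OVER (PARTITION BY team ORDER BY season) AS max_wins_thus_far
      FROM results
     ) r
ORDER BY team, season
"""


def run_query(query=QUERY):
    """Execute the query in an in-memory database and return all rows."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(query).fetchall()
    finally:
        conn.close()


for row in run_query():
    print(row)
```

The CASE expression yields the season only on rows that set or match the running maximum and NULL elsewhere, so the outer MAX carries the latest such season forward.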

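【Discussion】:

  • This is a very simple approach, thanks! Would you say this is a common pattern in window queries, or not so much?
  • @David542 . . . I would say it is a pattern. I'm not sure how common it is.
  • Another approach is a correlated subquery, which I find conceptually a bit simpler: (SELECT season FROM results AS r_inner WHERE r_inner.season <= results.season AND r_inner.team = results.team ORDER BY WINS DESC LIMIT 1) best_season -- which approach do you think performs better?
  • @David542 . . . In most databases, window functions outperform the alternatives. MySQL may be an exception. I suggest you try both approaches and see which runs faster on your data. Note: if the data is small (up to a few thousand rows), there should not be much difference.
【Solution 2】:

We can use a gaps-and-islands technique: the idea is to build groups of "adjacent" records using a window sum that increments every time a wins value greater than all previous values is encountered. We can then use a window min() to recover the corresponding season (basically, the start of each island).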

select team, season, wins, 
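    -- coalesce guards the first row per team: MySQL's greatest() returns NULL if any argument is NULL
    greatest(wins, coalesce(max_wins_1, wins)) max_wins_thus_far,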
    min(season) over(partition by team, grp order by season) as season_with_max_wins_thus_far
from (
    select r.*,
        sum(case when wins > max_wins_1 then 1 else 0 end) 
            over(partition by team order by season) as grp
    from (
        select r.*,
            max(wins) over (
                partition by team 
                order by season 
                rows between unbounded preceding and 1 preceding
            ) as max_wins_1
        from results r
    ) r
) r
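
The three nested levels of the query can be hard to follow, so here is a plain-Python sketch of the same steps on the question's sample rows. It is only an illustration under stated assumptions: the function name islands and the tuple layout are invented here, and prev_max, grp and island_start mirror max_wins_1, grp and the windowed min(season) respectively.

```python
from itertools import groupby
from operator import itemgetter

# Sample rows from the question: (team, season, wins)
RESULTS = [
    ("DAL", 2010, 6),
    ("DET", 2010, 6),
    ("DET", 2011, 10),
    ("DET", 2012, 4),
    ("DET", 2013, 7),
    ("DET", 2014, 11),
]


def islands(rows):
    """Return (team, season, wins, max_wins_1, grp, max_wins_thus_far, island_season)."""
    out = []
    ordered = sorted(rows, key=itemgetter(0, 1))  # by team, then season
    for team, group in groupby(ordered, key=itemgetter(0)):
        prev_max = None    # max_wins_1: max(wins) over strictly earlier seasons
        grp = 0            # running count of rows that set a new record
        island_start = {}  # grp -> first season of that island
        for _, season, wins in group:
            # A new island starts when wins beats every earlier value.
            if prev_max is not None and wins > prev_max:
                grp += 1
            island_start.setdefault(grp, season)
            max_so_far = wins if prev_max is None else max(wins, prev_max)
            out.append((team, season, wins, prev_max, grp, max_so_far, island_start[grp]))
            prev_max = max_so_far
    return out


for row in islands(RESULTS):
    print(row)
```

For DET the grp values come out as 0, 1, 1, 1, 2, and the first season of each island (2010, 2011, 2014) is the season holding the record at that point.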

Another approach is a correlated subquery:

select team, season, wins, 
    max(wins) over(partition by team order by season) as max_wins_thus_far,
    (
        select r1.season
        from results r1 
        where r1.team = r.team and r1.season <= r.season
        order by r1.wins desc, r1.season
        limit 1
    ) as season_with_max_wins_thus_far
from results r
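
The second sort key in the subquery (r1.season) decides what happens on ties. The sketch below is a hedged illustration rather than the linked demo: it runs the same statement through Python's sqlite3 module (SQLite 3.25 or later assumed, not MySQL), loads the question's rows into a table named results, and adds one hypothetical tie row (DET 2015 with 11 wins) that is not in the original data.

```python
import sqlite3

# Sample rows from the question plus one HYPOTHETICAL tie (DET 2015, 11 wins).
ROWS = [
    ("DAL", 2010, 6), ("DET", 2010, 6), ("DET", 2011, 10),
    ("DET", 2012, 4), ("DET", 2013, 7), ("DET", 2014, 11),
    ("DET", 2015, 11),
]

# The answer's query, with an ORDER BY added for deterministic output.
QUERY = """
select team, season, wins,
    max(wins) over(partition by team order by season) as max_wins_thus_far,
    (
        select r1.season
        from results r1
        where r1.team = r.team and r1.season <= r.season
        order by r1.wins desc, r1.season
        limit 1
    ) as season_with_max_wins_thus_far
from results r
order by team, season
"""


def best_seasons(rows=ROWS):
    """Load rows into an in-memory table and run the correlated-subquery version."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table results (team text, season integer, wins integer)")
        conn.executemany("insert into results values (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


for row in best_seasons():
    print(row)
```

With the tie row present, 2015 still reports 2014 as the best season, because equal wins are ordered by ascending season and the earliest one is kept.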

Demo on DB Fiddle - both queries yield:

team | season | wins | max_wins_thus_far | season_with_max_wins_thus_far
:--- | -----: | ---: | ----------------: | ----------------------------:
DAL  |   2010 |    6 |                 6 |                          2010
DET  |   2010 |    6 |                 6 |                          2010
DET  |   2011 |   10 |                10 |                          2011
DET  |   2012 |    4 |                10 |                          2011
DET  |   2013 |    7 |                10 |                          2011
DET  |   2014 |   11 |                11 |                          2014

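【Discussion】:

  • I see. Of the two approaches, which do you think performs better? And I suppose you could use correlated subqueries for both fields, right?
【Solution 3】:

This is arguably the simplest approach: since season and wins are both numbers, we can add them together and take the maximum (for example, 2010 + 14 gives 2024), then recover the season by subtracting max_wins at that point. Note that this depends on the data: it gives the right answer for the sample rows, but it can be wrong when a later season with fewer wins has a larger wins + season sum. Here is an example: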

WITH results as (
    SELECT 'DAL' as team, 2010 as season, 6 as wins union
    SELECT 'DET' as team, 2010 as season, 6 as wins union
    SELECT 'DET' as team, 2011 as season, 10 as wins union
    SELECT 'DET' as team, 2012 as season, 4 as wins union
    SELECT 'DET' as team, 2013 as season, 7 as wins union
    SELECT 'DET' as team, 2014 as season, 11 as wins
) 
SELECT team, season, wins,
    max(wins) OVER through_current AS max_wins_thus_far
   ,max(wins + season) OVER through_current - max(wins) OVER through_current AS season_with_max_wins_thus_far
FROM results
WINDOW through_current AS (PARTITION BY team ORDER BY season ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)

# team, season, wins, max_wins_thus_far, season_with_max_wins_thus_far
DAL, 2010, 6, 6, 2010
DET, 2010, 6, 6, 2010
DET, 2011, 10, 10, 2011
DET, 2012, 4, 10, 2011
DET, 2013, 7, 10, 2011
DET, 2014, 11, 11, 2014
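
Before relying on the sum trick, it is worth probing it on other data. The sketch below is a plain-Python illustration, not part of the original answer: the function names and the COUNTER rows are hypothetical, and the weighted variant (wins * 10000 + season, decoded with modulo 10000) is a swapped-in alternative that assumes four-digit seasons and non-negative wins.

```python
def sum_trick(rows):
    """Running max(wins + season) - max(wins), as in the query above."""
    out, max_sum, max_wins = [], None, None
    for season, wins in rows:
        max_sum = wins + season if max_sum is None else max(max_sum, wins + season)
        max_wins = wins if max_wins is None else max(max_wins, wins)
        out.append(max_sum - max_wins)
    return out


def weighted_trick(rows):
    """Running max(wins * 10000 + season) % 10000 (assumes seasons < 10000)."""
    out, best = [], None
    for season, wins in rows:
        key = wins * 10000 + season
        best = key if best is None else max(best, key)
        out.append(best % 10000)
    return out


def true_best(rows):
    """Reference loop: earliest season holding the running maximum."""
    out, best_wins, best_season = [], None, None
    for season, wins in rows:
        if best_wins is None or wins > best_wins:
            best_wins, best_season = wins, season
        out.append(best_season)
    return out


SAMPLE = [(2010, 6), (2011, 10), (2012, 4), (2013, 7), (2014, 11)]  # DET rows
COUNTER = [(2010, 10), (2011, 4), (2012, 9)]  # hypothetical rows

print(sum_trick(SAMPLE), true_best(SAMPLE))
print(sum_trick(COUNTER), true_best(COUNTER))
print(weighted_trick(COUNTER))
```

On the hypothetical rows the sum trick reports 2011 for the 2012 row, because 2012 + 9 exceeds 2010 + 10 while the maximum wins is still 10. The weighted encoding sorts by wins first, so it stays on 2010; on ties it returns the latest season rather than the earliest.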

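Another approach is a correlated subquery evaluated for each season, appended to the same results CTE as above (the leading parenthesis closes that CTE):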

) SELECT *,
    (SELECT season FROM results AS r_inner
     WHERE r_inner.season <= results.season AND r_inner.team = results.team
     ORDER BY WINS DESC LIMIT 1) best_season
 FROM results;

【Discussion】:

  • Wait a minute, you can answer your own post??