【Question Title】: Running sum excluding rows with duplicate column value
【Posted】: 2021-12-27 03:44:07
【Question Description】:

Sample table:

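video | encoding | video time spent | encoding bytes | encoding bytes running sum | video time spent running sum (expected) | video time spent running sum (actual)
------|----------|------------------|----------------|----------------------------|-----------------------------------------|--------------------------------------
A     | 1        | 1                | 500            | 500                        | 1                                       | 1
A     | 2        | 1                | 400            | 900                        | 1                                       | 2
B     | 3        | 2                | 300            | 1200                       | 3                                       | 5
B     | 4        | 2                | 200            | 1400                       | 3                                       | 8
B     | 5        | 2                | 100            | 1500                       | 3                                       | 11
B     | 6        | 2                | 100            | 1600                       | 3                                       | 14
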
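  • The video time spent column holds the time spent watching the video; which encoding was watched does not matter.
  • The video time spent running sum is what I am trying to get. It should count time spent only at the video level, ignoring encodings.

I want to select as many encoding bytes as possible while keeping the running sum of video time spent under a limit X.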

My query so far:

SELECT *
FROM (
   SELECT
      ...,
      SUM(encoding_bytes) OVER (ORDER BY encoding_bytes DESC) AS encoding_bytes_running_sum,
      SUM(video_time_spent) OVER (ORDER BY encoding_bytes DESC) AS video_time_spent_running_sum
      ...
) AS t  -- SQL Server requires an alias on a derived table
WHERE video_time_spent_running_sum < X

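But video_time_spent_running_sum isn't smart enough to skip the other encodings of the same video. What is the best way to do this?

The number of encodings per video is not constant.

Script to create the table: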

SELECT
    *,
    SUM(encoding_bytes) OVER(
        ORDER BY
            encoding_bytes DESC
    ) AS encoding_bytes_running_sum,
    SUM(video_time_spent) OVER (
        ORDER BY
            encoding_bytes DESC ROWS UNBOUNDED PRECEDING
    ) AS video_time_spent_running_sum
FROM (
    VALUES
        ('a', 1, 1, 500),
        ('a', 2, 1, 400),
        ('b', 3, 2, 300),
        ('b', 4, 2, 200),
        ('b', 5, 2, 100),
        ('b', 6, 2, 100)
) AS t (video, encoding, video_time_spent, encoding_bytes)

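【Comments】:

  • Do you ever have a case where a video (e.g. A) has a different time spent for different encodings?
  • @DaleK Thanks! Fixed the script. All encodings within the same video have the same time spent.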
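
The assumption confirmed in the comment above (one time value per video) can be checked before relying on it. This is only a rough sketch against the sample rows from the question; the derived-table alias t and the bracketed [encoding] column name are taken from the scripts in this thread, and the check itself is not part of the original post.

```sql
-- Sanity check (illustrative): list any video whose encodings
-- disagree on video_time_spent. An empty result means the
-- "same time for every encoding of a video" assumption holds.
SELECT video
FROM (
    VALUES
        ('a', 1, 1, 500),
        ('a', 2, 1, 400),
        ('b', 3, 2, 300),
        ('b', 4, 2, 200),
        ('b', 5, 2, 100),
        ('b', 6, 2, 100)
) AS t (video, [encoding], video_time_spent, encoding_bytes)
GROUP BY video
HAVING MIN(video_time_spent) <> MAX(video_time_spent);
```

For the sample data this returns no rows, since both encodings of 'a' have time 1 and all four encodings of 'b' have time 2.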

Tags: sql sql-server tsql window-functions


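【Solution 1】:

One approach is as follows (I'm sure it can be simplified): use the ROW_NUMBER function so that only the first row of each video is counted.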

WITH cte AS (
    SELECT
        *
        , SUM(encoding_bytes) OVER (ORDER BY encoding_bytes DESC) AS encoding_bytes_running_sum
        --, SUM(video_time_spent) OVER (ORDER BY encoding_bytes DESC ROWS UNBOUNDED PRECEDING) AS video_time_spent_running_sum
        -- rn = 1 marks the single row per video whose time spent gets counted
        , ROW_NUMBER() OVER (PARTITION BY video ORDER BY [encoding]) AS rn
    FROM (
        VALUES
            ('a', 1, 1, 500),
            ('a', 2, 1, 400),
            ('b', 3, 2, 300),
            ('b', 4, 2, 200),
            ('b', 5, 2, 100),
            ('b', 6, 2, 100)
    ) AS t (video, [encoding], video_time_spent, encoding_bytes)
)
SELECT video, [encoding], video_time_spent, encoding_bytes, encoding_bytes_running_sum
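    -- Add each video's time once (on its rn = 1 row); rn breaks ties so the sum is deterministic
    , SUM(CASE WHEN rn = 1 THEN video_time_spent ELSE 0 END) OVER (ORDER BY video, rn ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS video_time_spent_running_sum
FROM cte
ORDER BY video, [encoding];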

This returns:

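video | encoding | video_time_spent | encoding_bytes | encoding_bytes_running_sum | video_time_spent_running_sum
------|----------|------------------|----------------|----------------------------|-----------------------------
a     | 1        | 1                | 500            | 500                        | 1
a     | 2        | 1                | 400            | 900                        | 1
b     | 3        | 2                | 300            | 1200                       | 3
b     | 4        | 2                | 200            | 1400                       | 3
b     | 5        | 2                | 100            | 1600                       | 3
b     | 6        | 2                | 100            | 1600                       | 3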
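
To tie this back to the original goal (take rows in descending encoding_bytes order while the per-video time stays under a limit), the same ROW_NUMBER idea can be combined with the filter from the question. The following is a minimal sketch, not the answerer's code: the variable @X, its value 3, and the CTE names flagged and summed are assumptions made for illustration, and it relies on every encoding of a video sharing the same video_time_spent.

```sql
DECLARE @X int = 3;  -- hypothetical limit on total video time spent

WITH flagged AS (
    SELECT
        video, [encoding], video_time_spent, encoding_bytes,
        -- rn = 1 is the first row of each video in encoding_bytes DESC order
        ROW_NUMBER() OVER (
            PARTITION BY video
            ORDER BY encoding_bytes DESC, [encoding]
        ) AS rn
    FROM (
        VALUES
            ('a', 1, 1, 500),
            ('a', 2, 1, 400),
            ('b', 3, 2, 300),
            ('b', 4, 2, 200),
            ('b', 5, 2, 100),
            ('b', 6, 2, 100)
    ) AS t (video, [encoding], video_time_spent, encoding_bytes)
),
summed AS (
    SELECT
        video, [encoding], video_time_spent, encoding_bytes,
        -- ROWS frame plus tie-breakers gives a strict row-by-row running sum
        SUM(encoding_bytes) OVER (
            ORDER BY encoding_bytes DESC, video, [encoding]
            ROWS UNBOUNDED PRECEDING
        ) AS encoding_bytes_running_sum,
        -- each video's time is added once, when its first row is reached
        SUM(CASE WHEN rn = 1 THEN video_time_spent ELSE 0 END) OVER (
            ORDER BY encoding_bytes DESC, video, [encoding]
            ROWS UNBOUNDED PRECEDING
        ) AS video_time_spent_running_sum
    FROM flagged
)
SELECT *
FROM summed
WHERE video_time_spent_running_sum < @X
ORDER BY encoding_bytes DESC, video, [encoding];
```

The second CTE is needed because a window function cannot appear directly in a WHERE clause. With the sample data and @X = 3, only the two 'a' rows pass the filter, because the running time reaches 3 as soon as the first 'b' row is included.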
