【Question title】: Running Total of all Previous Rows BigQuery
【Posted】: 2023-04-10 15:50:02
【Question】:

I have a BigQuery table that looks like this:

ID  SessionNumber  CountOfAction   Category
 1       1              1            B
 1       2              3            A
 1       3              1            A
 1       4              4            B
 1       5              5            B

I am trying to get the sum of CountOfAction over all previous rows where Category = A. The final output should be:

 ID  SessionNumber  CountOfAction
 1       1              0   --no previous rows have countofAction for category = A
 1       2              0   --no previous rows have countofAction for category = A
 1       3              3   --previous row (Row 2) has countofAction = 3 for category = A
 1       4              4   --previous rows (Row 2 and 3) have countofAction = 3 and 1 for category = A
 1       5              4   --previous rows (Row 2 and 3) have countofAction = 3 and 1 for category = A

Below is the query I wrote, but it does not give me the desired output:

 select 
 ID,
 SessionNumber,
 SUM(CountOfAction) OVER(PARTITION BY ID ORDER BY SessionNumber
     ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) as CumulativeCountofAction
 From Table1 where category = 'A'

I would really appreciate any help with this! Thanks in advance.

【Question comments】:

    Tags: sql group-by google-bigquery sum window-functions


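    【Solution 1】:

    Filtering on category in the where clause evicts the (id, sessionNumber) tuples where category 'A' does not appear, which is not what you want.

    Instead, you can use aggregation and a conditional sum():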

    select
        id,
        sessionNumber,
        sum(sum(if(category = 'A', countOfAction, 0))) over(
            partition by id 
            order by sessionNumber
            rows between unbounded preceding and 1 preceding
        ) CumulativeCountofAction
    from mytable t
    group by id, sessionNumber
    order by id, sessionNumber
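
One detail about the query above: for the first session of each id the window frame is empty, so the outer sum() returns NULL rather than the 0 shown in the question's expected output. The following is a minimal sketch, not part of the original answer, that wraps the window sum in IFNULL. The CTE name `sample` is hypothetical and only holds the question's five rows.

```sql
-- Hypothetical CTE holding the sample rows from the question.
WITH sample AS (
  SELECT 1 AS id, 1 AS sessionNumber, 1 AS countOfAction, 'B' AS category UNION ALL
  SELECT 1, 2, 3, 'A' UNION ALL
  SELECT 1, 3, 1, 'A' UNION ALL
  SELECT 1, 4, 4, 'B' UNION ALL
  SELECT 1, 5, 5, 'B'
)
SELECT
  id,
  sessionNumber,
  -- Inner SUM: per-(id, sessionNumber) total for category 'A' only.
  -- Outer SUM ... OVER: running total of those values over all earlier sessions.
  -- IFNULL: an empty frame (first session of each id) yields NULL; map it to 0.
  IFNULL(
    SUM(SUM(IF(category = 'A', countOfAction, 0))) OVER (
      PARTITION BY id
      ORDER BY sessionNumber
      ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
    ),
    0
  ) AS CumulativeCountofAction
FROM sample
GROUP BY id, sessionNumber
ORDER BY id, sessionNumber
```

On the sample rows this should give 0, 0, 3, 4, 4 for sessions 1 through 5, matching the output requested in the question.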
    

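    【Comments】:

    • Thanks for your help, but I get the error "Function not found: sumif"
    • @TigSh: I'm surprised, I thought that function did exist in BQ... Fixed anyway.
    • Thanks, it works! Also, if instead of CountofAction I need count(Category), i.e. the number of rows where category = A, how would the query change? I'm not sure whether I should post this as a separate question. Thanks for your help.
    • @TigSh: You can do it like this: sum(countif(category = 'A')) over( partition by id order by sessionNumber rows between unbounded preceding and 1 preceding ) CumulativeCountofCategoryA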
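
For reference, the COUNTIF expression from the last comment can be placed into a complete query roughly as follows. This is a sketch under assumptions: the CTE name `sample` is hypothetical and holds the question's rows, and the IFNULL wrapper is an addition so that sessions with no earlier rows report 0 instead of NULL.

```sql
-- Hypothetical CTE holding the sample rows from the question.
WITH sample AS (
  SELECT 1 AS id, 1 AS sessionNumber, 1 AS countOfAction, 'B' AS category UNION ALL
  SELECT 1, 2, 3, 'A' UNION ALL
  SELECT 1, 3, 1, 'A' UNION ALL
  SELECT 1, 4, 4, 'B' UNION ALL
  SELECT 1, 5, 5, 'B'
)
SELECT
  id,
  sessionNumber,
  -- COUNTIF counts the rows of category 'A' within each (id, sessionNumber) group;
  -- the windowed SUM accumulates those counts over all earlier sessions.
  IFNULL(
    SUM(COUNTIF(category = 'A')) OVER (
      PARTITION BY id
      ORDER BY sessionNumber
      ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
    ),
    0
  ) AS CumulativeCountofCategoryA
FROM sample
GROUP BY id, sessionNumber
ORDER BY id, sessionNumber
```

On the sample rows this should give 0, 0, 1, 2, 2 for sessions 1 through 5, since only sessions 2 and 3 belong to category A.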
    【Solution 2】:

    Below is for BigQuery Standard SQL

    #standardSQL
    SELECT ID, SessionNumber,   
      IFNULL(SUM(IF(category = 'A', CountOfAction, 0)) OVER(win), 0) AS CountOfAction
    FROM `project.dataset.table` 
    WINDOW win AS (ORDER BY SessionNumber ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
    

    Applied to the sample data from your question, as in the example below:

    #standardSQL
    WITH `project.dataset.table` AS (
      SELECT 1 ID, 1 SessionNumber, 1 CountOfAction, 'B' Category UNION ALL
      SELECT 1, 2, 3, 'A' UNION ALL
      SELECT 1, 3, 1, 'A' UNION ALL
      SELECT 1, 4, 4, 'B' UNION ALL
      SELECT 1, 5, 5, 'B' 
    )
    SELECT ID, SessionNumber,   
      IFNULL(SUM(IF(category = 'A', CountOfAction, 0)) OVER(win), 0) AS CountOfAction
    FROM `project.dataset.table` 
    WINDOW win AS (ORDER BY SessionNumber ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)   
    

    the result is

    Row ID  SessionNumber   CountOfAction    
    1   1   1               0    
    2   1   2               0    
    3   1   3               3    
    4   1   4               4    
    5   1   5               4    
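
Note that the window in the query above orders by SessionNumber without a PARTITION BY, which is fine for the single-ID sample but would let the running total span several IDs in a larger table. A hedged variant that restarts the total per ID might look like the sketch below. The CTE name `sample` and the two rows for ID 2 are invented for illustration and do not come from the question.

```sql
-- Hypothetical sample: the question's rows for ID 1 plus two invented rows for ID 2.
WITH sample AS (
  SELECT 1 AS ID, 1 AS SessionNumber, 1 AS CountOfAction, 'B' AS Category UNION ALL
  SELECT 1, 2, 3, 'A' UNION ALL
  SELECT 1, 3, 1, 'A' UNION ALL
  SELECT 1, 4, 4, 'B' UNION ALL
  SELECT 1, 5, 5, 'B' UNION ALL
  SELECT 2, 1, 2, 'A' UNION ALL
  SELECT 2, 2, 7, 'B'
)
SELECT ID, SessionNumber,
  IFNULL(SUM(IF(Category = 'A', CountOfAction, 0)) OVER(win), 0) AS CountOfAction
FROM sample
WINDOW win AS (
  PARTITION BY ID              -- restart the running total for every ID
  ORDER BY SessionNumber
  ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
)
ORDER BY ID, SessionNumber
```

With these rows, ID 1 keeps the values 0, 0, 3, 4, 4, and ID 2 should come out as 0 for session 1 and 2 for session 2, unaffected by the rows of ID 1.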
    
