【Question title】: BigQuery: Querying with standard SQL
【Posted】: 2017-06-05 12:04:12
【Question】:

I have this table:

client_id   session_id  time    action      transaction_id
----------------------------------------------------------
1           1           15:01   view        NULL
1           1           15:02   basket      NULL
1           1           15:03   basket      NULL
1           1           15:04   purchase    1
1           2           15:05   basket      NULL
1           2           15:06   purchase    2
1           2           15:07   view        NULL

Within a session, I want all preceding actions to register the transaction_id at their first occurrence (so at 15:03, the second basket, transaction_id = NULL):

session_id  time    transaction_id
----------------------------------
1           15:01   1
1           15:02   1
1           15:03   NULL
1           15:04   1
2           15:05   2
2           15:06   2
2           15:07   NULL

【Question comments】:

    Tags: sql google-bigquery bigquery-standard-sql


    【Solution 1】:

    Hmmm . . . Assuming each session has only one transaction id, you can use window functions:

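    select t.*,
           -- only the first row of each action within a session gets a value
           (case when row_number() over (partition by client_id, session_id, action
                                         order by time) = 1
                 -- the session's transaction id (NULL if the session has none)
                 then max(transaction_id) over (partition by client_id, session_id)
            end) as new_transaction_id
    from t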
    

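    【Discussion】:

    • Thank you very much for the answer! How would the code change if session_id = 1 contained no transaction, and the first "view" (or another action) in that first session should show transaction_id = 2 instead?
    • @Zzema . . . If there is no transaction in the session, the value will be NULL, as specified in your question: "Within a session, I want all preceding actions to register the transaction_id at their first occurrence".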
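The reply above can be checked locally with a small sketch. It is not BigQuery itself: it assumes Python's built-in `sqlite3` module (SQLite 3.25 or newer, for window functions) as a stand-in engine, reuses the rows from the question, and adds a hypothetical session 3 that has no purchase. The table name `t` comes from Solution 1; the helper `run_solution_1` is made up for illustration.

```python
import sqlite3

# Rows from the question, plus a hypothetical session 3 without any purchase.
ROWS = [
    (1, 1, "15:01", "view", None),
    (1, 1, "15:02", "basket", None),
    (1, 1, "15:03", "basket", None),
    (1, 1, "15:04", "purchase", 1),
    (1, 2, "15:05", "basket", None),
    (1, 2, "15:06", "purchase", 2),
    (1, 2, "15:07", "view", None),
    (1, 3, "15:08", "view", None),    # hypothetical row
    (1, 3, "15:09", "basket", None),  # hypothetical row
]

# Solution 1's query, with an ORDER BY added so the output is deterministic.
QUERY = """
SELECT session_id, time,
       CASE WHEN ROW_NUMBER() OVER (PARTITION BY client_id, session_id, action
                                    ORDER BY time) = 1
            THEN MAX(transaction_id) OVER (PARTITION BY client_id, session_id)
       END AS new_transaction_id
FROM t
ORDER BY session_id, time
"""


def run_solution_1(rows):
    """Load rows into an in-memory SQLite table and run Solution 1's query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (client_id INTEGER, session_id INTEGER, "
        "time TEXT, action TEXT, transaction_id INTEGER)"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in run_solution_1(ROWS):
    print(row)
```

Every row of session 3 comes back as NULL (`None` in Python), as the reply states. On session 2 the same query returns 2 for the 15:07 view, where the question's expected output has NULL, because the row number is computed per action regardless of whether the row falls before or after the purchase.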
    【Solution 2】:

    Below is for BigQuery Standard SQL:

    #standardSQL
    SELECT 
      client_id, session_id, time, action,
      (CASE 
        WHEN ROW_NUMBER() 
             OVER (PARTITION BY client_id, session_id, grp, action ORDER BY time) = 1
        THEN MAX(transaction_id) OVER (PARTITION BY client_id, session_id, grp) END
      ) AS transaction_id
    FROM (
      SELECT *, 
        COUNTIF(transaction_id IS NOT NULL) 
          OVER(PARTITION BY client_id, session_id 
          ORDER BY time ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS grp
      FROM YourTable
    )
    -- ORDER BY client_id, session_id, time  
    

    You can test it with dummy data, as below:

    #standardSQL
    WITH YourTable AS (
      SELECT 1 AS client_id, 1 AS session_id, '15:01' AS time, 'view' AS action, NULL AS transaction_id UNION ALL
      SELECT 1, 1, '15:02', 'basket', NULL UNION ALL
      SELECT 1, 1, '15:03', 'basket', NULL UNION ALL
      SELECT 1, 1, '15:04', 'purchase', 1 UNION ALL
      SELECT 1, 1, '15:05', 'basket', NULL UNION ALL
      SELECT 1, 1, '15:06', 'basket', NULL UNION ALL
      SELECT 1, 1, '15:07', 'purchase', 3 UNION ALL
      SELECT 1, 2, '15:08', 'basket', NULL UNION ALL
      SELECT 1, 2, '15:09', 'purchase', 2 UNION ALL
      SELECT 1, 2, '15:10', 'view', NULL 
    )
    SELECT 
      client_id, session_id, time, action,
      (CASE 
        WHEN ROW_NUMBER() 
             OVER (PARTITION BY client_id, session_id, grp, action ORDER BY time) = 1
        THEN MAX(transaction_id) OVER (PARTITION BY client_id, session_id, grp) END
      ) AS transaction_id
    FROM (
      SELECT *, 
        COUNTIF(transaction_id IS NOT NULL) 
          OVER(PARTITION BY client_id, session_id 
          ORDER BY time ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS grp
      FROM YourTable
    )
    -- ORDER BY client_id, session_id, time  
    

    The output is as expected:

    client_id   session_id  time    action      transaction_id   
    1           1           15:01   view        1    
    1           1           15:02   basket      1    
    1           1           15:03   basket      null     
    1           1           15:04   purchase    1    
    1           1           15:05   basket      3    
    1           1           15:06   basket      null     
    1           1           15:07   purchase    3    
    1           2           15:08   basket      2    
    1           2           15:09   purchase    2    
    1           2           15:10   view        null     
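To see how the `grp` column from the inner subquery separates one purchase from the next, the sketch below computes it on its own for session 1 of the dummy data. It assumes Python's built-in `sqlite3` module (SQLite 3.25 or newer) as a local stand-in for BigQuery, and writes `COUNT(transaction_id)` where the answer uses `COUNTIF(transaction_id IS NOT NULL)`, since SQLite has no `COUNTIF`; both count the non-NULL values in the frame. The helper name `compute_grp` is hypothetical.

```python
import sqlite3

# Session 1 of the dummy data: two purchases within one session.
ROWS = [
    (1, 1, "15:01", "view", None),
    (1, 1, "15:02", "basket", None),
    (1, 1, "15:03", "basket", None),
    (1, 1, "15:04", "purchase", 1),
    (1, 1, "15:05", "basket", None),
    (1, 1, "15:06", "basket", None),
    (1, 1, "15:07", "purchase", 3),
]

# The inner subquery of Solution 2: grp is the number of purchases seen
# strictly before the current row (the frame stops at 1 PRECEDING).
GRP_QUERY = """
SELECT time,
       COUNT(transaction_id) OVER (
           PARTITION BY client_id, session_id
           ORDER BY time
           ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
       ) AS grp
FROM YourTable
ORDER BY time
"""


def compute_grp(rows):
    """Return (time, grp) pairs for the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE YourTable (client_id INTEGER, session_id INTEGER, "
        "time TEXT, action TEXT, transaction_id INTEGER)"
    )
    conn.executemany("INSERT INTO YourTable VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(GRP_QUERY).fetchall()
    conn.close()
    return result


for time, grp in compute_grp(ROWS):
    print(time, grp)
```

Because the frame ends at `1 PRECEDING`, a purchase row still belongs to the segment it closes (15:04 has `grp` 0), and only the rows after it move to the next segment. Partitioning the outer window functions by `grp` then applies the first-occurrence rule once per purchase.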
    

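    【Discussion】:

    • Thank you very much for the answer! How would the code change if session_id = 1 contained no transaction, and the first "view" (or another action) in that first session should show transaction_id = 2 instead?
    • @Zzema - I don't think the code needs to change - it will still produce the result you expect (per your question) - have you actually tried it?
    • Yes, I tried it, thanks :) My comment was about a changed condition that was not written in the question... However, after reading up on window functions, I worked out how to rework your answer. Thanks again.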