【Question Title】: Select the latest event for each distinct pair of 2 columns in BigQuery
【Posted】: 2025-12-29 23:30:06
【Question】:

I have a BigQuery table with the following schema:

[
  {"name": "timeCreated", "type": "datetime"},
  {"name": "userid", "type": "string"},
  {"name": "textid", "type": "string"},
  {"name": "textvalue", "type": "float"}
]

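I am trying to write a query that returns the row with the most recent timeCreated for each combination of userid and textid. I have tried GROUP BY and similar approaches, but I can't work out how to order by the timeCreated field and then drop every row that is not the top one for its userid and textid pair.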

【Comments】:

    Tags: google-bigquery


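    【Solution 1】:

    To get the latest (last) or earliest (first) element of a group in Google BigQuery, you can use ARRAY_AGG with [OFFSET(0)] and the appropriate ORDER BY (DESC for the latest, ASC for the earliest):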

    WITH test_table AS (
      SELECT DATETIME '2020-11-01 01:00:00' AS timeCreated, 'user1' AS userid, 'text1' AS textid, 1.1 AS textvalue UNION ALL
      SELECT DATETIME '2020-11-01 03:00:00' AS timeCreated, 'user1' AS userid, 'text1' AS textid, 1.2 AS textvalue UNION ALL
      SELECT DATETIME '2020-11-01 02:00:00' AS timeCreated, 'user1' AS userid, 'text1' AS textid, 1.3 AS textvalue UNION ALL
      SELECT DATETIME '2020-11-01 02:00:00' AS timeCreated, 'user1' AS userid, 'text2' AS textid, 1.4 AS textvalue UNION ALL
      SELECT DATETIME '2020-11-01 01:00:00' AS timeCreated, 'user1' AS userid, 'text2' AS textid, 1.5 AS textvalue UNION ALL
      SELECT DATETIME '2020-11-01 00:00:00' AS timeCreated, 'user2' AS userid, 'text1' AS textid, 1.6 AS textvalue
    )
    SELECT 
      userid,
      textid,
      ARRAY_AGG(timeCreated ORDER BY timeCreated DESC)[OFFSET(0)] AS latest
    FROM test_table
    GROUP BY userid, textid
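
The query above returns only the latest timeCreated per pair. As a minimal sketch of the same ARRAY_AGG technique, the variation below also picks up the textvalue from the latest row and the earliest timeCreated (using ASC). The column aliases latest_textvalue and earliest, the LIMIT 1 inside ARRAY_AGG, and the final ORDER BY are additions for illustration and are not part of the original answer.

```sql
WITH test_table AS (
  SELECT DATETIME '2020-11-01 01:00:00' AS timeCreated, 'user1' AS userid, 'text1' AS textid, 1.1 AS textvalue UNION ALL
  SELECT DATETIME '2020-11-01 03:00:00', 'user1', 'text1', 1.2 UNION ALL
  SELECT DATETIME '2020-11-01 02:00:00', 'user1', 'text1', 1.3 UNION ALL
  SELECT DATETIME '2020-11-01 02:00:00', 'user1', 'text2', 1.4 UNION ALL
  SELECT DATETIME '2020-11-01 01:00:00', 'user1', 'text2', 1.5 UNION ALL
  SELECT DATETIME '2020-11-01 00:00:00', 'user2', 'text1', 1.6
)
SELECT
  userid,
  textid,
  -- Latest timestamp in the group: sort descending, keep the first element.
  ARRAY_AGG(timeCreated ORDER BY timeCreated DESC LIMIT 1)[OFFSET(0)] AS latest,
  -- Value of textvalue taken from that same latest row (hypothetical alias).
  ARRAY_AGG(textvalue ORDER BY timeCreated DESC LIMIT 1)[OFFSET(0)] AS latest_textvalue,
  -- Earliest timestamp in the group: same pattern with ASC (hypothetical alias).
  ARRAY_AGG(timeCreated ORDER BY timeCreated ASC LIMIT 1)[OFFSET(0)] AS earliest
FROM test_table
GROUP BY userid, textid
ORDER BY userid, textid
-- With the sample data, latest_textvalue is 1.2 for (user1, text1),
-- 1.4 for (user1, text2), and 1.6 for (user2, text1).
```

The LIMIT 1 keeps each aggregated array to a single element, so only the value that OFFSET(0) reads is retained.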
    

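      【Comments】:

      【Solution 2】:

      The following is for BigQuery Standard SQL. It aggregates the whole row t, so the result is the complete latest row for each userid and textid pair: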

      #standardSQL
      select as value array_agg(t order by timeCreated desc limit 1)[offset(0)]
      from `project.dataset.table` t
      group by userid, textid
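
For comparison, the same "latest row per userid and textid pair" result can also be sketched with a window function instead of ARRAY_AGG. This is a different technique from the one in the answer above: it numbers the rows in each pair by timeCreated descending and keeps row number 1 through BigQuery's QUALIFY clause. The inline test_table sample data is reused from Solution 1 so the query is self-contained; the WHERE TRUE is included only as a conservative assumption in case QUALIFY needs an accompanying WHERE, GROUP BY, or HAVING clause.

```sql
WITH test_table AS (
  SELECT DATETIME '2020-11-01 01:00:00' AS timeCreated, 'user1' AS userid, 'text1' AS textid, 1.1 AS textvalue UNION ALL
  SELECT DATETIME '2020-11-01 03:00:00', 'user1', 'text1', 1.2 UNION ALL
  SELECT DATETIME '2020-11-01 02:00:00', 'user1', 'text1', 1.3 UNION ALL
  SELECT DATETIME '2020-11-01 02:00:00', 'user1', 'text2', 1.4 UNION ALL
  SELECT DATETIME '2020-11-01 01:00:00', 'user1', 'text2', 1.5 UNION ALL
  SELECT DATETIME '2020-11-01 00:00:00', 'user2', 'text1', 1.6
)
SELECT *
FROM test_table
WHERE TRUE  -- assumption: harmless filter kept alongside QUALIFY
-- Number the rows of each (userid, textid) pair from newest to oldest
-- and keep only the newest one.
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY userid, textid
  ORDER BY timeCreated DESC
) = 1
ORDER BY userid, textid
```

If two rows in a pair share the same latest timeCreated, ROW_NUMBER() picks one of them arbitrarily; adding a tie-breaking column to the ORDER BY inside the window would make the choice deterministic.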
      

        【Comments】:
