[Posted]: 2021-01-03 09:04:04
[Question]:
I have data that needs to be grouped by time, in 2-minute intervals. My data looks like this:
id time action_name url
111 2020-09-01-09:19:00 First www.stackoverflow/a12345
111 2020-09-01-09:19:04 Midpoint www.stackoverflow/a12345
111 2020-09-01-09:19:08 Third www.stackoverflow/a12345
112 2020-09-01-10:12:05 First www.someotherurl/a111111
111 2020-09-01-12:36:54 First www.stackoverflow/a12345
111 2020-09-01-12:36:58 Midpoint www.stackoverflow/a12345
111 2020-09-01-12:37:03 Third www.stackoverflow/a12345
111 2020-09-01-12:37:09 Complete www.stackoverflow/a12345
222 2020-09-01-15:17:44 First www.stackoverflow/a2222
222 2020-09-01-15:17:48 Midpoint www.stackoverflow/a2222
222 2020-09-01-15:18:05 Third www.stackoverflow/a2222
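I need to select rows by the following rule: for a given id and url, if the action_name column contains a Complete value, take that row. If there is no Complete, take Third, and so on down to First. The code I currently have returns only one row per id and url. So I need to group the data not only by id and url but also by time, in 2-minute intervals. Here is the code: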
SELECT AS VALUE
  ARRAY_AGG(current_query_result
    ORDER BY CASE action_name
      WHEN 'Complete' THEN 1
      WHEN 'Third' THEN 2
      WHEN 'Midpoint' THEN 3
      WHEN 'First' THEN 4
    END
    LIMIT 1
  )[OFFSET(0)]
FROM (
  SELECT
    c.time,
    c.id,
    c.action_name,
    c.url
  FROM `bq_table` c
  WHERE c.action_name IN ('First', 'Midpoint', 'Third', 'Complete')
) current_query_result
GROUP BY id, url
The desired output is:
id time action_name url
111 2020-09-01-09:19:08 Third www.stackoverflow/a12345
112 2020-09-01-10:12:05 First www.someotherurl/a111111
111 2020-09-01-12:37:09 Complete www.stackoverflow/a12345
222 2020-09-01-15:18:05 Third www.stackoverflow/a2222
I tried this: TIMESTAMP_SECONDS(2*60 * DIV(UNIX_SECONDS(c.time), 2*60)) timekey, but it fails with the error: No matching signature for function UNIX_SECONDS for argument types: STRING. Supported signature: UNIX_SECONDS(TIMESTAMP)
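The error message suggests that the time column is stored as a STRING, so it would have to be parsed into a TIMESTAMP before UNIX_SECONDS can accept it. Assuming the strings always look like 2020-09-01-09:19:00, PARSE_TIMESTAMP('%Y-%m-%d-%H:%M:%S', c.time) should do that. Note that fixed 2-minute buckets are aligned to even minutes, so in the sample data the rows for id 222 at 15:17:48 and 15:18:05 would land in different buckets and produce two output rows instead of one. The sketch below is one possible alternative rather than a definitive answer: it starts a new group whenever the gap to the previous row for the same id and url exceeds 120 seconds. The 120-second gap rule, the CTE names, and the is_new_group and session_id columns are assumptions introduced for illustration.

```sql
-- Sketch only: assumes `time` is a STRING formatted like 2020-09-01-09:19:00.
WITH parsed AS (
  SELECT
    c.id,
    c.url,
    c.action_name,
    c.time,
    -- UNIX_SECONDS and TIMESTAMP_DIFF need a TIMESTAMP, so parse the string first.
    PARSE_TIMESTAMP('%Y-%m-%d-%H:%M:%S', c.time) AS ts
  FROM `bq_table` c
  WHERE c.action_name IN ('First', 'Midpoint', 'Third', 'Complete')
),
flagged AS (
  SELECT
    *,
    -- 1 when the row starts a new group: it is the first row for the id/url
    -- pair (LAG returns NULL) or it comes more than 120 seconds after the
    -- previous row. Otherwise 0.
    IF(
      TIMESTAMP_DIFF(ts, LAG(ts) OVER (PARTITION BY id, url ORDER BY ts), SECOND) <= 120,
      0,
      1
    ) AS is_new_group
  FROM parsed
),
grouped AS (
  SELECT
    *,
    -- Running count of group starts numbers the groups within each id/url pair.
    -- session_id is a hypothetical name.
    SUM(is_new_group) OVER (PARTITION BY id, url ORDER BY ts) AS session_id
  FROM flagged
)
SELECT AS VALUE
  -- Same priority pick as the original query, now once per group.
  ARRAY_AGG(
    STRUCT(id, time, action_name, url)
    ORDER BY CASE action_name
      WHEN 'Complete' THEN 1
      WHEN 'Third' THEN 2
      WHEN 'Midpoint' THEN 3
      WHEN 'First' THEN 4
    END
    LIMIT 1
  )[OFFSET(0)]
FROM grouped
GROUP BY id, url, session_id
```

On the sample rows this should return one row per burst of activity, matching the four desired rows, although the row order is not guaranteed without an outer ORDER BY. One trade-off of the gap rule: a group can span more than 2 minutes in total as long as consecutive rows stay within 120 seconds of each other. If strict fixed windows are what is wanted, the original timekey expression with c.time wrapped in PARSE_TIMESTAMP could be added to the GROUP BY instead.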
[Discussion]:
Tags: sql time group-by google-bigquery timestamp