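【Posted】: 2021-08-09 09:13:02
【Question】:
I have a column in BigQuery that contains a variety of messages in a simple, single-depth JSON format, and I would like to extract it into a STRUCT. The input table looks like
and should be transformed into
I am aware of BigQuery's JSON functions such as JSON_EXTRACT, e.g. here. However, that approach is out of the question, because there are 100 different senders in production. I therefore need to be able to extract these JSONs dynamically, without specifying their keys by hand.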
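To make the limitation concrete, here is a minimal sketch of what the manual approach would look like for a single sender, using the Sender2 message from the sample data further down. Every key has to be written out explicitly, and JSON_EXTRACT_SCALAR returns each value as a STRING; the inline subquery is only a stand-in for the real table.

```sql
-- Manual approach for ONE sender: every key is hard-coded.
-- With ~100 senders this would need ~100 hand-written variants.
SELECT
  STRUCT(
    JSON_EXTRACT_SCALAR(Message, '$.value1') AS value1,
    JSON_EXTRACT_SCALAR(Message, '$.label1') AS label1,
    JSON_EXTRACT_SCALAR(Message, '$.label2') AS label2
  ) AS Message
FROM (
  -- stand-in for the real input table
  SELECT '{"value1": 4, "label1": "myLabel", "label2": "yourLabel"}' AS Message
);
```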
I have been playing around with regular expressions, as shown here
WITH input_table AS (
  SELECT
    1 AS Row,
    20210101 AS Date,
    'Sender1' AS Sender,
    '{"param1": 123, "param2": 456, "param3": 78, "value1": 42, "label1": "hello", "timestamp": 1234567890}' AS Message
  UNION ALL SELECT
    2 AS Row,
    20210101 AS Date,
    'Sender2' AS Sender,
    '{"value1": 4, "label1": "myLabel", "label2": "yourLabel"}' AS Message
  UNION ALL SELECT
    3 AS Row,
    20210102 AS Date,
    'Sender1' AS Sender,
    '{"param1": 12, "param2": 90, "param3": 55, "value1": 11, "label1": "there", "timestamp": 1235555555}' AS Message
)
SELECT
  CONCAT("SELECT ", key, " AS key, JSON_EXTRACT_SCALAR(Message, '$.", key, "') AS ", key, " FROM input_table")
FROM input_table, unnest(regexp_extract_all(regexp_replace(JSON_EXTRACT(Message, '$'), r':{.*?}+', ''), r'"(.*?)":')) key
But apart from my regular expression still being slightly off, I am struggling to turn these statements into a STRUCT.
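For the key-discovery step alone, a simpler pattern may be enough, because the messages are single-depth JSON. The sketch below is an illustration under stated assumptions, not a full solution: it assumes that no string value contains a quoted fragment followed by a colon (such as `"a":`), the two-row `sample` table and the `message_keys` column name are hypothetical, and it only lists the distinct keys per sender without building the STRUCT.

```sql
-- Sketch: list the distinct top-level keys seen for each sender.
-- Assumes flat (single-depth) JSON whose string values never
-- contain a quoted fragment followed by a colon.
WITH sample AS (  -- hypothetical stand-in for input_table
  SELECT 'Sender1' AS Sender, '{"param1": 123, "label1": "hello"}' AS Message
  UNION ALL
  SELECT 'Sender2', '{"value1": 4, "label1": "myLabel"}'
)
SELECT
  Sender,
  ARRAY_AGG(DISTINCT key ORDER BY key) AS message_keys
FROM sample,
  -- one capture group: REGEXP_EXTRACT_ALL returns just the key names
  UNNEST(REGEXP_EXTRACT_ALL(Message, r'"([^"]+)"\s*:')) AS key
GROUP BY Sender
ORDER BY Sender;
```

Since the columns of a STRUCT have to be known when a query is compiled, a key list like this would still need to be fed into a generated statement, which is the part the question is about.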
【Comments】:
Tags: json google-bigquery