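Redshift is not good with JSON, and especially not with arbitrary JSON keys (as @GMB mentioned). It also handles nested data structures poorly.
So you actually have two problems: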
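- Extract the JSON keys. I see two options here: a Python UDF or regular expressions.
- Unnest the set of keys into table rows. There is a trick for unnesting data into rows (see the CROSS JOIN and the seq table in the queries below); it is described in this SO answer.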
1. Solution with a Python UDF
JSON parsing can be implemented in Python and registered as a user-defined function: https://docs.aws.amazon.com/redshift/latest/dg/udf-python-language-support.html
Function:
create or replace function f_py_json_keys (a varchar(65535))
returns varchar(65535)
stable
as $$
import json
return ",".join(json.loads(a).keys())
$$ language plpythonu;
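The body of the function can be checked locally before registering it. This is a minimal sketch in plain Python; `json_keys` is a hypothetical local stand-in for `f_py_json_keys`, and the nested values in the sample are shortened for brevity:

```python
import json


def json_keys(payload):
    """Local stand-in (hypothetical name) for the body of f_py_json_keys."""
    # Parse the document and join its top-level keys with commas.
    return ",".join(json.loads(payload).keys())


sample = (
    '{"0fc8a2a1-e334-43b8-9311-ce46da9cd32c": {"name": "Variant 1"},'
    ' "4344d89b-7f0d-4453-b2c5-d0d4a39d7d25": {"name": "Control Group"}}'
)

# Sort before printing: key order should not be relied on (see note below).
print(sorted(json_keys(sample).split(",")))
```

Redshift Python UDFs run on Python 2.7, where dict key order is arbitrary, so the order of the returned keys should not be relied on. A key that itself contains a comma would also be split apart by the comma-based split_part step in the query.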
Query:
with input(json) as (
select '{
"0fc8a2a1-e334-43b8-9311-ce46da9cd32c": {
"alert": "345",
"channel": "ios_push",
"name": "Variant 1"
},
"4344d89b-7f0d-4453-b2c5-d0d4a39d7d25": {
"channel": "ios_push",
"name": "Control Group",
"type": "control"
}
}'::varchar
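), seq(idx) as (
    -- helper sequence; with 5 rows, at most 5 keys are returned
    select 1 UNION ALL
    select 2 UNION ALL
    select 3 UNION ALL
    select 4 UNION ALL
    select 5
), input_with_occurences as (
    -- comma-separated key list plus the number of keys in it
    select f_py_json_keys(json) as keys,
           regexp_count(keys, ',') + 1 as number_of_occurrences
    from input
)
-- one row per key: idx picks the idx-th element of the list
select
    split_part(keys, ',', idx) as id
from input_with_occurences cross join seq
where idx <= number_of_occurrences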
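The unnesting trick (CROSS JOIN against seq, filtered by idx) may be easier to follow as a loop. The sketch below only emulates the query logic in Python; `split_part` here is a hypothetical re-implementation of the Redshift function for illustration, not a call into Redshift:

```python
def split_part(text, delimiter, part):
    """Emulate Redshift SPLIT_PART: 1-based index, empty string when out of range."""
    pieces = text.split(delimiter)
    return pieces[part - 1] if 1 <= part <= len(pieces) else ""


# What f_py_json_keys would return for the sample payload (order may differ).
keys = "0fc8a2a1-e334-43b8-9311-ce46da9cd32c,4344d89b-7f0d-4453-b2c5-d0d4a39d7d25"

seq = [1, 2, 3, 4, 5]  # the seq(idx) CTE
number_of_occurrences = keys.count(",") + 1  # regexp_count(keys, ',') + 1

# CROSS JOIN pairs the single input row with every idx;
# the WHERE clause keeps only the idx values that point at a real key.
rows = [split_part(keys, ",", idx) for idx in seq if idx <= number_of_occurrences]

for row in rows:
    print(row)
```

Because seq holds only five rows, a payload with more than five top-level keys would be truncated; the sequence has to be at least as long as the largest expected key count.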
2. Solution with regex magic
Redshift has several regular expression functions. Here is a working example that does the job for the payload you specified:
with input(json) as (
select '{
"0fc8a2a1-e334-43b8-9311-ce46da9cd32c": {
"alert": "345",
"channel": "ios_push",
"name": "Variant 1"
},
"4344d89b-7f0d-4453-b2c5-d0d4a39d7d25": {
"channel": "ios_push",
"name": "Control Group",
"type": "control"
}
}'::varchar
), seq(idx) as (
    -- helper sequence; with 5 rows, at most 5 keys are returned
    select 1 UNION ALL
    select 2 UNION ALL
    select 3 UNION ALL
    select 4 UNION ALL
    select 5
), input_with_occurences as (
    -- count the "<uuid>": {...} entries in the payload
    select *,
           regexp_count(json,
               '\\{?\\"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})\\":\\s\\{[\\w\\s\\":,]+\\}') as number_of_occurrences
    from input
)
-- 'e' returns the captured subexpression (the UUID) of the idx-th match
select
    REGEXP_SUBSTR(json, '\\{?\\"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})\\":\\s\\{[\\w\\s\\":,]+\\}', 1, idx, 'e') as id
from input_with_occurences cross join seq
where idx <= number_of_occurrences
The result is:
+------------------------------------+
|id                                  |
+------------------------------------+
|0fc8a2a1-e334-43b8-9311-ce46da9cd32c|
|4344d89b-7f0d-4453-b2c5-d0d4a39d7d25|
+------------------------------------+
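The pattern can also be tried outside Redshift. The sketch below uses Python's `re` module, which is not the same regex engine as Redshift's, so it only checks the pattern logic; the doubled backslashes from the SQL string literal are written as single ones here:

```python
import re

# Same pattern as in the query, with SQL string escaping removed.
PATTERN = re.compile(
    r'\{?"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})":\s\{[\w\s":,]+\}'
)

payload = """{
"0fc8a2a1-e334-43b8-9311-ce46da9cd32c": {
"alert": "345",
"channel": "ios_push",
"name": "Variant 1"
},
"4344d89b-7f0d-4453-b2c5-d0d4a39d7d25": {
"channel": "ios_push",
"name": "Control Group",
"type": "control"
}
}"""

# With one capture group, findall returns the captured UUID of every match.
ids = PATTERN.findall(payload)
for key in ids:
    print(key)
```

The character class `[\w\s":,]` only allows word characters, whitespace, quotes, colons and commas inside each nested object, so a value containing, say, a hyphen or a period would stop that entry from matching. For payloads like that, the Python UDF is the safer option.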