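【Posted】: 2015-10-26 15:50:29
【Question】:
I am trying to convert a JSON object into a map object in Hive using the Brickhouse UDF "brickhouse.udf.json.FromJsonUDF".
The problem is that my JSON object contains values of different types: strings and an array of arrays.
My JSON looks like this: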
'{"key1":"value1","key2":"value2","key3":"value3","key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}'
I can read the array-of-arrays element (key4) correctly using:
select from_json('{"key1":"value1","key2":"value2","key3":"value3","key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}', 'map<string,array<array<string>>>') from my_table limit 1;
Which gives me:
{"key1":[],"key3":[],"key2":[],"key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}
As you can see, every element except key4 comes back empty.
Alternatively, I can read the other elements, but not key4, using:
select from_json('{"key1":"value1","key2":"value2","key3":"value3","key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}', 'map<string,string>') from my_table limit 1;
Which gives me:
{"key1":"value1","key3":"value3","key2":"value2","key4":null}
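But how can I convert all of the elements correctly into key-value pairs in the resulting map object?
Edited:
My actual data is an array of two components, each of which is a JSON object: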
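For comparison, here is a minimal sketch of one way to get every value with its own type, rather than forcing a single map value type. It assumes that from_json is already registered for brickhouse.udf.json.FromJsonUDF and that the JSON sits in a string column called json_col (a hypothetical name, not taken from my real table). It uses Hive's built-in get_json_object for the string values and hands only key4 to from_json. The result is a set of typed columns, not a single map.

```sql
-- Sketch only. Assumptions: from_json is registered for
-- brickhouse.udf.json.FromJsonUDF, and json_col is a hypothetical
-- string column of my_table holding the JSON shown above.
SELECT
  -- get_json_object is a built-in Hive UDF; it returns each value as a string
  get_json_object(json_col, '$.key1') AS key1,
  get_json_object(json_col, '$.key2') AS key2,
  get_json_object(json_col, '$.key3') AS key3,
  -- for key4, get_json_object returns the nested array as a JSON string,
  -- which is then parsed on its own with a matching type
  from_json(get_json_object(json_col, '$.key4'), 'array<array<string>>') AS key4
FROM my_table
LIMIT 1;
```

This sidesteps the mixed-type problem because each column gets its own type, at the cost of listing the keys explicitly.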
[{"key1":"value1", "key2":"value2"}{"key3":"value3","key4":"value4","key5":"value5","key6":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}]
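Is it possible to create a struct object that holds the two JSON objects as two map objects, so that I can access the first or second struct element and then use a key to select the value from the corresponding map?
For example, suppose the final result I want is called struct_result. I would access value1 from the first component like this: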
struct_result.t1["key1"]
which would give me "value1".
Is this possible with this library?
【Discussion】:
Tags: arrays json hive brickhouse