【Question title】: Convert JSON with different types of objects (strings, array) to Map
【Posted】: 2015-10-26 15:50:29
【Question】:

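I am trying to convert a JSON object to a map object in Hive using the Brickhouse library UDF "brickhouse.udf.json.FromJsonUDF".

The problem is that my JSON object contains values of different types: strings and an array of arrays.

My JSON looks like this: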

'{"key1":"value1","key2":"value2","key3":"value3","key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}'

I can correctly read the array-of-arrays element (key4) using:

select from_json('{"key1":"value1","key2":"value2","key3":"value3","key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}', 'map<string,array<array<string>>>') from my_table limit 1;

Which gives me:

{"key1":[],"key3":[],"key2":[],"key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}

As you can see, all elements except key4 come back empty.

Alternatively, I can read the other elements, but not key4, using:

select from_json('{"key1":"value1","key2":"value2","key3":"value3","key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}', 'map<string,string>') from my_table limit 1;

Which gives me:

{"key1":"value1","key3":"value3","key2":"value2","key4":null}

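But how can I convert all elements correctly into key-value pairs in the resulting map object?

EDITED

My actual data is an array of two components, each of which is a JSON object: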

[{"key1":"value1", "key2":"value2"},{"key3":"value3","key4":"value4","key5":"value5","key6":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}]

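Is it possible to create a struct object that contains the two JSON objects as two map objects, so that I can access the first or second struct element and then select a value from the corresponding map by key?

For example, supposing my desired final result is called struct_result, I would access value1 from the first component like this: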

struct_result.t1["key1"]

which would give me "value1".

Can this be done with this library?

【Comments】:

    Tags: arrays json hive brickhouse


    【Solution 1】:

    This can be done with a named struct. You need to create a named_struct and specify the type of each key independently.

    For example:

    select from_json('{"key1":"value1","key2":"value2","key3":"value3","key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}',
        named_struct("key1", "", "key2", "", "key3", "",
            "key4", array(array(""))))
    from my_table;
    

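    This uses the "named_struct" UDF to create a template object whose field types tell from_json how to parse each key. Alternatively, you can pass the equivalent type definition as a string.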
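
    The string form mentioned above might look like the following sketch. It assumes that from_json accepts a 'struct<...>' type string in the same way it accepts the 'map<...>' strings used in the question; the alias `parsed` and the subquery alias `t` are only illustrative names.

    ```sql
    -- Sketch only: same JSON as in the question, described by a struct
    -- type string instead of a named_struct template.
    -- Assumption: from_json parses 'struct<...>' like the 'map<...>' strings above.
    select
        t.parsed.key1,        -- string field
        t.parsed.key4[0][2]   -- third entry of the first inner array
    from (
        select from_json('{"key1":"value1","key2":"value2","key3":"value3","key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}',
            'struct<key1:string,key2:string,key3:string,key4:array<array<string>>>') as parsed
        from my_table
        limit 1
    ) t;
    ```

    Either way the result is a struct rather than a map, so fields are read with dot notation (parsed.key1) instead of a key lookup (parsed["key1"]). That is what allows each key to carry its own type.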

    【Comments】:
