【Question Title】: JSON Data Read in Hive Table
【Posted】: 2016-11-24 12:47:28
【Question Description】:

I can create a Hive table with the JSON SerDe org.openx.data.jsonserde.JsonSerDe, but when I query the table I cannot read the data back.

hive> create table emp (EmpId int , EmpFirstName string , EmpLastName string) ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe';
OK
Time taken: 2.148 seconds

hive> LOAD DATA INPATH '/user/cloudera/EmpData/emp.json' INTO table emp;
Loading data to table employee.emp
chgrp: changing ownership of 'hdfs://quickstart.cloudera:8020/user/hive/warehouse/employee.db/emp/emp.json': User does not belong to supergroup
Table employee.emp stats: [numFiles=1, totalSize=4163]
OK
Time taken: 1.141 seconds

hive> select * from emp;
OK
Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: Row is not a valid JSON Object - JSONException: A JSONObject text must end with '}' at 2 [character 3 line 1]
Time taken: 0.504 seconds

【Question Discussion】:

    Tags: json hadoop hive bigdata


    【Solution 1】:

    Error: Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: Row is not a valid JSON Object - JSONException: A JSONObject text must end with '}' at 2 [character 3 line 1]

    Check whether the JSON in /user/cloudera/EmpData/emp.json is valid.
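    One way to run that check is sketched below. It assumes the SerDe parses each line of the file as a separate JSON object, which is consistent with the error above pointing at character 3 of line 1. The function name find_invalid_lines is hypothetical and is not part of Hive or the SerDe.

```python
import json


def find_invalid_lines(lines):
    """Return (line_number, reason) for each non-blank line that is not a JSON object."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # this sketch ignores blank lines
        try:
            value = json.loads(text)
        except json.JSONDecodeError as exc:
            # The line on its own is not parseable JSON (e.g. a lone "{").
            problems.append((number, exc.msg))
            continue
        if not isinstance(value, dict):
            # Valid JSON, but an array or scalar rather than an object.
            problems.append((number, "not a JSON object"))
    return problems
```

    To use it, copy emp.json out of HDFS to a local file, open it, and pass the file object to find_invalid_lines. A pretty-printed file that spreads one object over several lines is reported as invalid on every line, even though the file as a whole parses.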

    You can skip malformed rows (they come back as NULL instead of failing the query) with

    ALTER TABLE emp SET SERDEPROPERTIES ( "ignore.malformed.json" = "true");
    

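    See this link -> https://github.com/rcongiu/Hive-JSON-Serde

    Edit: this JSON is not valid input for this table (the SerDe expects one JSON object per line, with keys matching the column names):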

    { "cols": [ "EmpId", "EmpFirstName", "EmpLastName" ], "data": [ [ 1, "Hannah", "Walton" ], [ 2, "Barrett", "Mendoza" ], [ 3, "Camden", "Kidd" ], [ 4, "Illiana", "Collier" ] ] }

    The JSON you provided has

    key: cols and value: [ "EmpId", "EmpFirstName", "EmpLastName" ]

    key: data and value: [ [ 1, "Hannah", "Walton" ], [ 2, "Barrett", "Mendoza" ], [ 3, "Camden", "Kidd" ], [ 4, "Illiana", "Collier" ] ]

    The JSON should look like this instead:

    {"EmpId":1,"EmpFirstName":"Hannah","EmpLastName":"Walton"}
    {"EmpId":2,"EmpFirstName":"Barrett","EmpLastName":"Mendoza"}
    {"EmpId":3,"EmpFirstName":"Camden","EmpLastName":"Kidd"}
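    If the source file has to stay in the cols/data layout, it can be rewritten into the one-object-per-line form before loading. The following is a minimal sketch, assuming every row in "data" has the same length and order as "cols"; the function name to_json_lines is hypothetical, and the sample input is a shortened copy of the JSON from the question.

```python
import json


def to_json_lines(document):
    """Convert {"cols": [...], "data": [[...], ...]} into one JSON object per row."""
    cols = document["cols"]
    lines = []
    for row in document["data"]:
        # Pair each column name with the value at the same position in the row.
        record = dict(zip(cols, row))
        lines.append(json.dumps(record, separators=(",", ":")))
    return "\n".join(lines)


source = '''{ "cols": [ "EmpId", "EmpFirstName", "EmpLastName" ],
  "data": [ [ 1, "Hannah", "Walton" ], [ 2, "Barrett", "Mendoza" ] ] }'''
print(to_json_lines(json.loads(source)))
```

    Each printed line is then a standalone object whose keys match the column names of the emp table, which is the shape shown above.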
    

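    【Discussion】:

    • First of all, thanks for the reply. I tried the option you suggested, but now all fields come back as "null". My JSON file is correct and I am able to parse it properly. Please see the sample data that causes the problem in the Hive table: { "cols": [ "EmpId", "EmpFirstName", "EmpLastName" ], "data": [ [ 1, "Hannah", "Walton" ], [ 2, "Barrett", "Mendoza" ], [ 3, "Camden", "Kidd" ], [ 4, "Illiana", "Collier" ] ] }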