【Question Title】:Load data to Hive array of struct
【Posted】:2019-03-01 19:54:37
【Question】:

My CSV data looks like this:

David,"""SMARTPHONE,6""|""COMPUTER,3""|""LAPTOP,1"""

I tried to load it into my Hive table, defined as:

create table user_device(name string, devices array<struct<devicename: string, number : int>>) 
FIELDS TERMINATED BY ','
collection items terminated by '|'
STORED AS TEXTFILE
LOCATION 'maprfs:///user/david/';

I expected to see:

[{"devicename":"SMARTPHONE","number":6},{"devicename":"COMPUTER","number":3},{"devicename":"LAPTOP","number":1}]

But when I query the table, the array of structs comes back as:

[{"devicename":"\"\"\"SMARTPHONE","number":null}]

The rest of the array, and the rest of the struct, is gone.

Does anyone know how I can make this work?

Thanks, David

【Comments】:

    Tags: hive hiveql


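    【Answer 1】:

    Here is the code I used. I clean the data with Python before running the HQL queries. After some wrangling, I end up with a file like the one below on my local file system (written without indices or headers), since it is a small file: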

    import pandas as pd
    import numpy as np 
    
        Name  devicename number
    0  David  SMARTPHONE      6
    1           COMPUTER      3
    2             LAPTOP      1
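
The cleaning script itself is not shown in the answer. A minimal sketch of that step, assuming only the standard-library `csv` module rather than pandas, might look like the following. The function names `flatten_devices` and `to_csv` are hypothetical, and repeating the name on every row (instead of leaving it blank as in the listing above) is my own choice so that rows can later be grouped per user.

```python
import csv
import io

# One raw line exactly as shown in the question.
RAW_LINE = 'David,"""SMARTPHONE,6""|""COMPUTER,3""|""LAPTOP,1"""\n'


def flatten_devices(raw_text):
    """Turn each raw CSV line into flat (name, devicename, number) rows."""
    rows = []
    # csv.reader undoes the outer quoting, so the second field becomes
    # "SMARTPHONE,6"|"COMPUTER,3"|"LAPTOP,1"
    for name, devices in csv.reader(io.StringIO(raw_text)):
        for item in devices.split("|"):
            # Drop the inner quotes, then split on the last comma.
            devicename, number = item.strip('"').rsplit(",", 1)
            rows.append((name, devicename, int(number)))
    return rows


def to_csv(rows):
    """Serialize rows as comma-separated text without index or header."""
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


flat_rows = flatten_devices(RAW_LINE)
print(to_csv(flat_rows), end="")
```

The printed text is what would be saved to the local file and then loaded into the staging table.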
    

    Then create a temporary table tempt and populate it with the data from the local file system (LFS) or HDFS:

    create table tempt
    (
    name       string,
    devicename string,
    number     int
    )
    row format delimited 
    FIELDS TERMINATED BY ',';
    load data local inpath '/path_to_file' overwrite into table tempt;
    
    select * from tempt;
    +--------------------+--------------------------+----------------------+--+
    | tempt.name         | tempt.devicename         | tempt.number         |
    +--------------------+--------------------------+----------------------+--+
    | David              | SMARTPHONE               | 6                    |
    |                    | COMPUTER                 | 3                    |
    |                    | LAPTOP                   | 1                    |
    +--------------------+--------------------------+----------------------+--+
    

    Now fill user_device from tempt:

    Insert overwrite table user_device
    select name,
    array(named_struct("devicename",devicename,"number",number)) from tempt;
    
    select * from user_device;
    

    The structs now come out with the expected field values, one single-element array per row:

    +-----------------+-------------------------------------------+--+
    |user_device.name |            user_device.devices            |
    +-----------------+-------------------------------------------+--+
    | David           | [{"devicename":"SMARTPHONE","number":6}]  |
    |                 | [{"devicename":"COMPUTER","number":3}]    |
    |                 | [{"devicename":"LAPTOP","number":1}]      |
    +-----------------+-------------------------------------------+--+
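
Note that the table above holds three rows with one struct each, whereas the question asked for a single row whose array holds all three structs. One possible alternative, sketched below, is to rewrite the data into a layout that Hive's delimited text format can read directly. This rests on two assumptions that are not in the original DDL: that user_device is declared with ROW FORMAT DELIMITED, and that it adds a MAP KEYS TERMINATED BY ':' clause, which, as far as I know, is the delimiter Hive uses for struct fields nested inside an array. The helper name `to_hive_lines` is hypothetical.

```python
from collections import defaultdict


def to_hive_lines(flat_rows, field_sep=",", item_sep="|", struct_sep=":"):
    """Group (name, devicename, number) rows into one delimited line per name."""
    devices_by_name = defaultdict(list)
    for name, devicename, number in flat_rows:
        # Each struct becomes "devicename:number" (assumed MAP KEYS delimiter).
        devices_by_name[name].append(f"{devicename}{struct_sep}{number}")
    # Array items are joined with '|', top-level fields with ','.
    return [
        field_sep.join([name, item_sep.join(structs)])
        for name, structs in devices_by_name.items()
    ]


hive_lines = to_hive_lines([
    ("David", "SMARTPHONE", 6),
    ("David", "COMPUTER", 3),
    ("David", "LAPTOP", 1),
])
print("\n".join(hive_lines))
```

If the assumptions hold, a file containing these lines could be placed in the table's location without going through the staging table; this has not been verified against the asker's Hive version.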
    

    Cheers!

    【Comments】:
