【Question title】: Importing a CSV file into a Hive table gives NULL column values
【Posted】: 2018-08-02 12:15:50
【Question】:

I am new to Hadoop and Hive. I am using open-source Hadoop 2.7.1 and Hive 1.2.2, installed on a single-node Ubuntu cluster. My CSV file has 106 rows and 30 columns of data. I used the following statement to create the Hive table for it:

    CREATE TABLE clinicaldatabc (
      comp_tcga_id String, gender String, age_inti_diag int,
      ER_status String, PR_status String, HER2_final_status String,
      Tumor String, Tumor_T1_code String, Node String, Node_coded String,
      Metastasis String, Metastasis_coded String, AJCC_Stage String,
      Converted_stage String, Survival_dt_from String, Vital_Status String,
      d_to_date_of_last_contact int, d_to_Day_of_Death int,
      OS_event int, OS_time int, PAM50_mRNA String,
      SigClust_unsupervised_mRNA int, SigClust_intrinsic_mRNA int,
      miRNA_clusters int, methylation_clusters int, RPPA_clusters int,
      CN_clusters int, integrated_clusters_with_PAM50 int,
      integrated_cluster_no_exp int, integrated_clusters_unsup_exp int)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

Then I get NULL column values: [screenshot: first half of returns] [screenshot: second half of returns]

Please help me solve this. Thanks in advance!

【Comments】:

Tags: hive


【Answer 1】:

Possible duplicate of "NULL column names in Hive query result".

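The first thing to note here is that the NULL values appear in the non-string columns. Hive returns NULL when a field cannot be parsed as the column's declared type, so a header row or a non-numeric value in an INT column would produce this result.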
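
One way to check which raw values fail to parse is to load the file into a staging table that keeps every line as plain text. The following is only a sketch: the table name `clinicaldatabc_lines` and the file path are hypothetical, and it assumes the fields contain no quoted commas.

```sql
-- Hypothetical staging table: one STRING column holding each raw CSV line.
-- With Hive's default field delimiter (Ctrl-A), the whole line lands in `line`.
CREATE TABLE clinicaldatabc_lines (line STRING);

-- Placeholder path; substitute the real location of the CSV file.
LOAD DATA LOCAL INPATH '/path/to/clinical_data.csv' INTO TABLE clinicaldatabc_lines;

-- Look at a few raw lines, e.g. to see whether a header row is present.
SELECT line FROM clinicaldatabc_lines LIMIT 5;

-- Field 3 (array index 2) corresponds to age_inti_diag INT in the question's table.
-- Any value listed here cannot be cast to INT and would show up as NULL.
SELECT split(line, ',')[2] AS raw_age
FROM clinicaldatabc_lines
WHERE CAST(split(line, ',')[2] AS INT) IS NULL;
```

Changing the index in `split(line, ',')[...]` repeats the check for any of the other INT columns.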

【Comments】:

    【Answer 2】:

    Please refer to the following example, which skips the CSV header line when reading the table:

    CREATE EXTERNAL TABLE IF NOT EXISTS ejREGandTEST(
    DBN STRING,
    School_name STRING,
    Year_of_SHST INT,
    Grade_level INT,
    Enrollment INT,
    Number_of_registered INT,
    Number_students_SHSAT INT)
    row format delimited fields terminated by ','
    location "/user/ebin/kaggleData/csv"
    TBLPROPERTIES("skip.header.line.count"="1");
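
Applied to the table from the question, the same property could probably be set on the existing table instead of recreating it. This is a sketch that assumes the first line of the CSV is a header row, which the question does not confirm.

```sql
-- Assumes the first line of the CSV is a header row (not confirmed in the question).
ALTER TABLE clinicaldatabc SET TBLPROPERTIES ("skip.header.line.count"="1");

-- Re-check the result: the header line should no longer appear as a data row.
SELECT COUNT(*) FROM clinicaldatabc;
SELECT comp_tcga_id, age_inti_diag FROM clinicaldatabc LIMIT 5;
```

If NULLs remain in INT columns after this, the cause is more likely non-numeric values in the data itself than the header line.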
    

    【Comments】:
