【Question title】: Import from SQL Server, data types not converted properly
【Posted】: 2016-05-12 08:01:55
【Question】:

Stack: HDP-2.3.2.0-2950 installed using Ambari 2.1

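Goals:

  • Import the tables from SQL Server into HDFS in Avro format
  • Create an external Hive Avro (SerDe) table containing all of the data
  • Create an EXTERNAL Hive ORC table and populate it with insert into ORC select * from the Avro table
  • Drop the Avro table and run the tests on the ORC table

One of the tables: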

ECU_DTC_ID          int
DTC_CDE             nchar(20)
ECU_NAME            nvarchar(15)
ECU_FAMILY_NAME     nvarchar(15)
DTC_DESC            nvarchar(MAX)
INSERTED_BY         nvarchar(64)
INSERTION_DATE      datetime
DTC_CDE_DECIMAL     int

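When I run a plain Sqoop import, datetime is converted to long, and nchar and nvarchar are converted to string. The generated avsc file is shown below. When I create the Hive Avro table, it does not pick up the generated Avro files, so I am left with an empty table: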

{
  "type" : "record",
  "name" : "DimECUDTCCode",
  "doc" : "Sqoop import of DimECUDTCCode",
  "fields" : [ {
    "name" : "ECU_DTC_ID",
    "type" : [ "null", "int" ],
    "default" : null,
    "columnName" : "ECU_DTC_ID",
    "sqlType" : "4"
  }, {
    "name" : "DTC_CDE",
    "type" : [ "null", "string" ],
    "default" : null,
    "columnName" : "DTC_CDE",
    "sqlType" : "-15"
  }, {
    "name" : "ECU_NAME",
    "type" : [ "null", "string" ],
    "default" : null,
    "columnName" : "ECU_NAME",
    "sqlType" : "-9"
  }, {
    "name" : "ECU_FAMILY_NAME",
    "type" : [ "null", "string" ],
    "default" : null,
    "columnName" : "ECU_FAMILY_NAME",
    "sqlType" : "-9"
  }, {
    "name" : "DTC_DESC",
    "type" : [ "null", "string" ],
    "default" : null,
    "columnName" : "DTC_DESC",
    "sqlType" : "-9"
  }, {
    "name" : "INSERTED_BY",
    "type" : [ "null", "string" ],
    "default" : null,
    "columnName" : "INSERTED_BY",
    "sqlType" : "-9"
  }, {
    "name" : "INSERTION_DATE",
    "type" : [ "null", "long" ],
    "default" : null,
    "columnName" : "INSERTION_DATE",
    "sqlType" : "93"
  }, {
    "name" : "DTC_CDE_DECIMAL",
    "type" : [ "null", "int" ],
    "default" : null,
    "columnName" : "DTC_CDE_DECIMAL",
    "sqlType" : "4"
  } ],
  "tableName" : "DimECUDTCCode"
}

I decided to include --map-column-java:

sqoop import --connect 'jdbc:sqlserver://somedbserver;database=somedb' --username someusername --password somepassword --as-avrodatafile --num-mappers 8 --table DimECUDTCCode --map-column-java DTC_CDE=string,ECU_NAME=string,ECU_FAMILY_NAME=string,DTC_DESC=string,INSERTED_BY=string,INSERTION_DATE=timestamp --warehouse-dir /dataload/tohdfs/reio/odpdw/may2016 --verbose

But I get the following error:

16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 ERROR orm.ClassWriter: No ResultSet method for Java type string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 ERROR orm.ClassWriter: No ResultSet method for Java type string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 ERROR orm.ClassWriter: No ResultSet method for Java type string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 ERROR orm.ClassWriter: No ResultSet method for Java type string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 ERROR orm.ClassWriter: No ResultSet method for Java type string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 ERROR orm.ClassWriter: No ResultSet method for Java type timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 ERROR tool.ImportTool: Imported Failed: No ResultSet method for Java type string
[sqoop@l1038lab root]$

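What am I missing?

【Comments】:

  • You can try --map-column-hive to map the SQL Server columns directly to Hive columns.
  • But why Hive? I want to use Java, and that is what is failing.
  • Yes, you should keep looking for the problem with --map-column-java. I only offered an alternative in case you get stuck, because I tried --map-column-hive and it worked.

Tags: hadoop hive sqoop avro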


【Solution 1】:

It turns out that Sqoop treats STRING, String and string differently. The correct spelling is String.
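
As a rough illustration of the fix described above, the sketch below rebuilds the --map-column-java value from the question with the type spelled String. The connection string, credentials and paths are the placeholders from the question, and the names string_columns, map_column_java and run_import are hypothetical helpers introduced only for this example. The answer does not say how to handle INSERTION_DATE=timestamp, which fails in the log with the same error, so the sketch leaves that column out of the mapping and lets Sqoop apply its default; that is an assumption, not something confirmed here.

```shell
#!/bin/sh
# Columns from the question that should be mapped to a Java String.
string_columns="DTC_CDE ECU_NAME ECU_FAMILY_NAME DTC_DESC INSERTED_BY"

# Build "COL1=String,COL2=String,..." using the capitalised type name.
# INSERTION_DATE is deliberately omitted (assumption: keep Sqoop's default).
map_column_java=""
for col in $string_columns; do
  map_column_java="${map_column_java:+$map_column_java,}${col}=String"
done

# Hypothetical wrapper around the command from the question. It is only
# defined here, not executed; connection details are placeholders.
run_import() {
  sqoop import \
    --connect 'jdbc:sqlserver://somedbserver;database=somedb' \
    --username someusername --password somepassword \
    --as-avrodatafile --num-mappers 8 \
    --table DimECUDTCCode \
    --map-column-java "$map_column_java" \
    --warehouse-dir /dataload/tohdfs/reio/odpdw/may2016 \
    --verbose
}

# Print the mapping so it can be checked before running the import.
echo "$map_column_java"
```

Calling run_import on a host where Sqoop is installed would then start the import with the corrected mapping.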
