【发布时间】:2016-05-12 08:01:55
【问题描述】:
从 SQL Server 导入,数据类型转换不正确 堆栈:使用 Ambari 2.1 安装 HDP-2.3.2.0-2950
目标:
- 将表从 SQL Server 以 Avro 格式导入 HDFS
- 创建包含所有数据的外部 Hive Avro(SerDe) 表
- 创建 EXTERNAL Hive ORC 表并插入 ORC select * from Avro 表
- 删除 Avro 表并在 ORC 表上执行测试
其中一张桌子:
ECU_DTC_ID int
DTC_CDE nchar(20)
ECU_NAME nvarchar(15)
ECU_FAMILY_NAME nvarchar(15)
DTC_DESC nvarchar(MAX)
INSERTED_BY nvarchar(64)
INSERTION_DATE datetime
DTC_CDE_DECIMAL int
当我执行正常的 sqoop 导入时,日期时间转换为 long,nchar 和 nvarchar 转换为字符串。生成的 avsc 文件如图所示,当我创建 Hive Avro 表时,它不包含生成的 avro 文件,因此留下了一个空表:
{
"type" : "record",
"name" : "DimECUDTCCode",
"doc" : "Sqoop import of DimECUDTCCode",
"fields" : [ {
"name" : "ECU_DTC_ID",
"type" : [ "null", "int" ],
"default" : null,
"columnName" : "ECU_DTC_ID",
"sqlType" : "4"
}, {
"name" : "DTC_CDE",
"type" : [ "null", "string" ],
"default" : null,
"columnName" : "DTC_CDE",
"sqlType" : "-15"
}, {
"name" : "ECU_NAME",
"type" : [ "null", "string" ],
"default" : null,
"columnName" : "ECU_NAME",
"sqlType" : "-9"
}, {
"name" : "ECU_FAMILY_NAME",
"type" : [ "null", "string" ],
"default" : null,
"columnName" : "ECU_FAMILY_NAME",
"sqlType" : "-9"
}, {
"name" : "DTC_DESC",
"type" : [ "null", "string" ],
"default" : null,
"columnName" : "DTC_DESC",
"sqlType" : "-9"
}, {
"name" : "INSERTED_BY",
"type" : [ "null", "string" ],
"default" : null,
"columnName" : "INSERTED_BY",
"sqlType" : "-9"
}, {
"name" : "INSERTION_DATE",
"type" : [ "null", "long" ],
"default" : null,
"columnName" : "INSERTION_DATE",
"sqlType" : "93"
}, {
"name" : "DTC_CDE_DECIMAL",
"type" : [ "null", "int" ],
"default" : null,
"columnName" : "DTC_CDE_DECIMAL",
"sqlType" : "4"
} ],
"tableName" : "DimECUDTCCode"
我决定包含 --map-column-java :
sqoop import --connect 'jdbc:sqlserver://somedbserver;database=somedb' --username someusername--password somepassword --as-avrodatafile --num-mappers 8 --table DimECUDTCCode --map-column-java DTC_CDE=string,ECU_NAME=string,ECU_FAMILY_NAME=string,DTC_DESC=string,INSERTED_BY=string,INSERTION_DATE=timestamp --warehouse-dir /dataload/tohdfs/reio/odpdw/may2016 --verbose
但我收到以下错误:
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 ERROR orm.ClassWriter: No ResultSet method for Java type string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 ERROR orm.ClassWriter: No ResultSet method for Java type string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 ERROR orm.ClassWriter: No ResultSet method for Java type string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 ERROR orm.ClassWriter: No ResultSet method for Java type string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 ERROR orm.ClassWriter: No ResultSet method for Java type string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 ERROR orm.ClassWriter: No ResultSet method for Java type timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 ERROR tool.ImportTool: Imported Failed: No ResultSet method for Java type string
[sqoop@l1038lab root]$
我错过了什么?
【问题讨论】:
-
你可以试试
--map-column-hive,直接将SQL Server列映射到hive列。 -
但是为什么hive,我希望使用没有成功的java
-
是的,您应该尝试使用
--map-column-java查找问题。如果您遇到困难,我只是提供了一个替代方案,因为我尝试了--map-column-hive并且成功了。