【Question Title】:Sqoop export error - cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist
【Posted】:2013-02-15 14:23:49
【Question】:

I am developing a Java program.

The program exports data from Hive to MySQL.

First, I wrote this code:

ProcessBuilder pb = new ProcessBuilder("sqoop-export", "export", 
         "--connect",               "jdbc:mysql://localhost/mydb", 
         "--hadoop-home",    "/home/yoonhok/development/hadoop-1.1.1", 
         "--table",                    "mytable", 
         "--export-dir",            "/user/hive/warehouse/tbl_2", 
         "--username",            "yoonhok", 
         "--password",            "1234");

try {
    Process p = pb.start();
    if (p.waitFor() != 0) {
        System.out.println("Error: sqoop-export failed.");
        return false;
    }
} catch (IOException e) {
    e.printStackTrace();
} catch (InterruptedException e) {
    e.printStackTrace();
}

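This works well.

But I learned a new way to use Sqoop from Java.

Sqoop does not support a client API yet.

So I added the Sqoop library and simply call Sqoop.runTool().

Second, I rewrote the code the new way: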

String[] str = {"export", 
     "--connect",               "jdbc:mysql://localhost/mydb", 
     "--hadoop-home",    "/home/yoonhok/development/hadoop-1.1.1", 
     "--table",                    "mytable", 
     "--export-dir",            "/user/hive/warehouse/tbl_2", 
     "--username",            "yoonhok", 
     "--password",            "1234"
};

if (Sqoop.runTool(str) == 1) {
     System.out.println("Error: sqoop-export failed.");
     return false;
}

But it does not run.

I get this error:

13/02/14 16:17:09 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 
13/02/14 17:43:12 WARN sqoop.ConnFactory: $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
13/02/14 16:17:09 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. 
13/02/14 16:17:09 INFO tool.CodeGenTool: Beginning code generation 
13/02/14 16:17:09 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tbl_2` AS t LIMIT 1 
13/02/14 16:17:09 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tbl_2` AS t LIMIT 1 
13/02/14 16:17:09 INFO orm.CompilationManager: HADOOP_HOME is /home/yoonhok/development/hadoop-1.1.1 
Note: /tmp/sqoop-yoonhok/compile/45dd1a113123726796a4ed4ce10c9110/tbl_2.java uses or overrides a deprecated API. 
Note: Recompile with -Xlint:deprecation for details. 
13/02/14 16:17:10 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-yoonhok/compile/45dd1a113123726796a4ed4ce10c9110/tbl_2.jar 
13/02/14 16:17:10 INFO mapreduce.ExportJobBase: Beginning export of tbl_2 
13/02/14 16:17:10 WARN mapreduce.ExportJobBase: Input path file:/user/hive/warehouse/tbl_2 does not exist 
13/02/14 16:17:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
13/02/14 16:17:11 INFO mapred.JobClient: Cleaning up the staging area file:/tmp/hadoop-yoonhok/mapred/staging/yoonhok314601126/.staging/job_local_0001 
13/02/14 16:17:11 ERROR security.UserGroupInformation: PriviledgedActionException as:yoonhok cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/user/hive/warehouse/tbl_2 
13/02/14 16:17:11 ERROR tool.ExportTool: Encountered IOException running export job: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/user/hive/warehouse/tbl_2

I saw the warning "$SQOOP_CONF_DIR has not been set in the environment."

So I added

SQOOP_CONF_DIR=/home/yoonhok/development/sqoop-1.4.2.bin__hadoop-1.0.0/conf

to /etc/environment

and tried again, but I got the same error:

13/02/14 16:17:09 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 
13/02/14 16:17:09 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. 
13/02/14 16:17:09 INFO tool.CodeGenTool: Beginning code generation 
13/02/14 16:17:09 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tbl_2` AS t LIMIT 1 
13/02/14 16:17:09 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tbl_2` AS t LIMIT 1 
13/02/14 16:17:09 INFO orm.CompilationManager: HADOOP_HOME is /home/yoonhok/development/hadoop-1.1.1 
Note: /tmp/sqoop-yoonhok/compile/45dd1a113123726796a4ed4ce10c9110/tbl_2.java uses or overrides a deprecated API. 
Note: Recompile with -Xlint:deprecation for details. 
13/02/14 16:17:10 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-yoonhok/compile/45dd1a113123726796a4ed4ce10c9110/tbl_2.jar 
13/02/14 16:17:10 INFO mapreduce.ExportJobBase: Beginning export of tbl_2 
13/02/14 16:17:10 WARN mapreduce.ExportJobBase: Input path file:/user/hive/warehouse/tbl_2 does not exist 
13/02/14 16:17:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
13/02/14 16:17:11 INFO mapred.JobClient: Cleaning up the staging area file:/tmp/hadoop-yoonhok/mapred/staging/yoonhok314601126/.staging/job_local_0001 
13/02/14 16:17:11 ERROR security.UserGroupInformation: PriviledgedActionException as:yoonhok cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/user/hive/warehouse/tbl_2 
13/02/14 16:17:11 ERROR tool.ExportTool: Encountered IOException running export job: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/user/hive/warehouse/tbl_2

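I think the export-dir is the problem.

I use "/user/hive/warehouse/tbl_2".

When I run "hadoop fs -ls /user/hive/warehouse/", the table "tbl_2" exists.

I think

"Input path does not exist: file:/user/hive/warehouse/tbl_2" is wrong, because the path is being looked up on the local filesystem.

"Input path does not exist: hdfs:/user/hive/warehouse/tbl_2" is what I would expect.

But I don't know how to fix it.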
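The `file:` prefix in the log suggests that the scheme-less `--export-dir` was qualified against the local filesystem rather than HDFS. Hadoop qualifies such paths against its configured default filesystem (`fs.default.name` in Hadoop 1.x), which falls back to `file:///` when no site configuration is loaded. The sketch below imitates that step with plain `java.net.URI`. It is not Hadoop's actual code, and the names `DefaultFsDemo` and `qualify` are made up for illustration; the NameNode address `hdfs://localhost:9000` is taken from this question.

```java
import java.net.URI;

class DefaultFsDemo {

    // Qualifies a path against a default filesystem URI, roughly the way
    // a scheme-less Hadoop path picks up its scheme and authority.
    // Hypothetical helper for illustration; not part of Hadoop or Sqoop.
    static URI qualify(String defaultFs, String path) {
        URI p = URI.create(path);
        if (p.getScheme() != null) {
            return p; // already fully qualified, leave it untouched
        }
        return URI.create(defaultFs).resolve(p);
    }

    public static void main(String[] args) {
        String dir = "/user/hive/warehouse/tbl_2";
        // No site configuration loaded: the default filesystem is local.
        System.out.println(qualify("file:///", dir));
        // Default filesystem set to the NameNode address (assumed value).
        System.out.println(qualify("hdfs://localhost:9000/", dir));
        // An explicit scheme wins regardless of the default.
        System.out.println(qualify("file:///", "hdfs://localhost:9000" + dir));
    }
}
```

The same directory string ends up on two different filesystems depending only on the default, which would explain why `hadoop fs -ls` (which reads the site configuration) finds the table while the embedded job does not.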


OK, I just got a hint.

I edited 'export-dir':

--export-dir   hdfs://localhost:9000/user/hive/warehouse/tbl_2

But... now I get this error... T.T

13/02/15 15:17:20 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
13/02/15 15:17:20 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
13/02/15 15:17:20 INFO tool.CodeGenTool: Beginning code generation
13/02/15 15:17:20 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tbl_2` AS t LIMIT 1
13/02/15 15:17:20 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tbl_2` AS t LIMIT 1
13/02/15 15:17:20 INFO orm.CompilationManager: HADOOP_HOME is /home/yoonhok/development/hadoop-1.1.1/libexec/..
Note: /tmp/sqoop-yoonhok/compile/697590ee9b90c022fb8518b8a6f1d86b/tbl_2.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
13/02/15 15:17:22 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-yoonhok/compile/697590ee9b90c022fb8518b8a6f1d86b/tbl_2.jar
13/02/15 15:17:22 INFO mapreduce.ExportJobBase: Beginning export of tbl_2
13/02/15 15:17:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/02/15 15:17:23 INFO input.FileInputFormat: Total input paths to process : 1
13/02/15 15:17:23 INFO input.FileInputFormat: Total input paths to process : 1
13/02/15 15:17:23 INFO mapred.JobClient: Cleaning up the staging area file:/tmp/hadoop-yoonhok/mapred/staging/yoonhok922915382/.staging/job_local_0001
13/02/15 15:17:23 ERROR security.UserGroupInformation: PriviledgedActionException as:yoonhok cause:java.io.FileNotFoundException: File /user/hive/warehouse/tbl_2/000000_0 does not exist.
13/02/15 15:17:23 ERROR tool.ExportTool: Encountered IOException running export job: java.io.FileNotFoundException: File /user/hive/warehouse/tbl_2/000000_0 does not exist.

When I check HDFS with

hadoop fs -ls /user/hive/warehouse/tbl_2

hadoop fs -ls hdfs://localhost:9000/user/hive/warehouse/tbl_2

the file exists:

-rw-r--r-- 1 yoonhok supergroup 14029022 2013-02-15 12:16 /user/hive/warehouse/tbl_2/000000_0

I also tried the command from a terminal shell:

sqoop-export --connect jdbc:mysql://localhost/detector --table tbl_2 --export-dir hdfs://localhost:9000/user/hive/warehouse/tbl_2 --username yoonhok --password 1234

It succeeded.

What is the problem?

I don't know.

Can you help me?

【Question Comments】:

    Tags: hadoop export hive sqoop


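    【Solution 1】:

    You need to load and supply your Hadoop configuration files. By default they are read from the classpath, but you may be able to override that with Configuration.addDefaultResource (no guarantee).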
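
Because Hadoop's `Configuration` looks up `core-site.xml` on the classpath, one quick check is whether the JVM that calls `Sqoop.runTool` can see the site files at all. The sketch below uses only the standard library. The class name `SiteConfigCheck` is hypothetical, and the file list is an assumption based on the usual Hadoop 1.x site files; the code only reports what is visible and does not fix anything.

```java
import java.net.URL;
import java.util.LinkedHashMap;
import java.util.Map;

class SiteConfigCheck {

    // Usual Hadoop 1.x site files (assumed); adjust for your installation.
    static final String[] SITE_FILES = {
        "core-site.xml", "hdfs-site.xml", "mapred-site.xml"
    };

    // Maps each resource name to its classpath location, or null if absent.
    static Map<String, URL> locate(String... names) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader();
        }
        Map<String, URL> found = new LinkedHashMap<>();
        for (String name : names) {
            found.put(name, cl.getResource(name));
        }
        return found;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, URL> e : locate(SITE_FILES).entrySet()) {
            String where = (e.getValue() == null)
                    ? "NOT on classpath"
                    : e.getValue().toString();
            System.out.println(e.getKey() + " -> " + where);
        }
    }
}
```

If the files are reported missing, adding the Hadoop `conf` directory to the program's classpath is one way to make them visible. Another option, untested here, is to load them into a `Configuration` with `addResource` and pass it to `Sqoop.runTool(String[], Configuration)`.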

    【Discussion】:

    • You mean HADOOP_HOME/conf/hdfs-site.xml, mapred-site.xml, and so on... right?