【Posted】: 2015-10-22 07:17:48
【Problem Description】:
I am running a Pig script in a Google Cloud Hadoop environment with pig -useHCatalog -x mapreduce -f profile.pig. I have two tables of 50,000 records each that are crossed and then joined with a table of 1,000,000 records. The same script runs fine with fewer records, but when I increase the record count it throws the error below.
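For context, here is a minimal Pig Latin sketch of the kind of cross-then-join described above; the aliases, field names, and the name of the large table are hypothetical, since the actual profile.pig is not shown in the question. Note that a CROSS of two 50,000-row inputs already produces 2.5 billion intermediate rows, so the data volume grows very quickly as the inputs grow.

-- Hypothetical sketch only; table and field names are made up for illustration.
male   = LOAD 'matrimony.profile_gce_limit' USING org.apache.hive.hcatalog.pig.HCatLoader();
female = LOAD 'matrimony.profile_gce_limit' USING org.apache.hive.hcatalog.pig.HCatLoader();
big    = LOAD 'matrimony.some_large_table'  USING org.apache.hive.hcatalog.pig.HCatLoader();

pairs  = CROSS male, female;                  -- 50,000 x 50,000 = 2.5 billion candidate pairs
joined = JOIN pairs BY male::id, big BY id;   -- then joined with the ~1,000,000-row table
DUMP joined;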
2015-10-22 05:38:56,261 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hadoop-w-0.c.horton-cluster-3.internal/10.240.0.3:8050
2015-10-22 05:38:56,266 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
2015-10-22 05:38:56,377 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 0: org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to recreate exception from backed error: Container killed on request. Exit code is 137
Container exited with a non-zero exit code 137
Killed by external signal
2015-10-22 05:38:56,377 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2015-10-22 05:38:56,380 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.6.0.2.2.8.0-3150 0.14.0.2.2.8.0-3150 hdfs 2015-10-22 05:34:17 2015-10-22 05:38:56 HASH_JOIN,GROUP_BY,FILTER,CROSS,UNION
Some jobs have failed! Stop running all dependent jobs
Job Stats (time in seconds):
JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs
job_1444401341866_0534 1 0 11 11 11 11 0 0 0 0 t_female,t_male,t_profile MULTI_QUERY,MAP_ONLY
job_1444401341866_0535 2 1 6 6 6 6 91 91 91 91 t_female,t_male,t_raid_female,t_raid_female1
job_1444401341866_0536 2 1 25 25 25 25 89 89 89 89 t_female,t_male,t_raid_male,t_raid_male1
job_1444401341866_0537 2 0 5 5 5 5 0 0 0 0 t_female,t_male,t_mf_union MAP_ONLY
Failed Jobs:
JobId Alias Feature Message Outputs
job_1444401341866_0538 j_ci1,j_mf_education,j_mf_height,j_mf_occupation,j_mf_religion,j_mf_weight,t_ci1,t_mf_transpose,t_mf_union,t_raid HASH_JOIN Message: Job failed!
Input(s):
Successfully read 10001 records (1509451 bytes) from: "matrimony.profile_gce_limit"
Successfully read 10001 records (1509451 bytes) from: "matrimony.profile_gce_limit"
Output(s):
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_1444401341866_0534 -> job_1444401341866_0535,job_1444401341866_0536,job_1444401341866_0537,
job_1444401341866_0535 -> job_1444401341866_0538,
job_1444401341866_0536 -> job_1444401341866_0538,
job_1444401341866_0537 -> job_1444401341866_0538,
job_1444401341866_0538 -> null,
null -> null,
null -> null,
null
2015-10-22 05:38:56,455 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://hadoop-w-0.c.horton-cluster-3.internal:8188/ws/v1/timeline/
2015-10-22 05:38:56,456 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hadoop-w-0.c.horton-cluster-3.internal/10.240.0.3:8050
2015-10-22 05:38:56,459 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2015-10-22 05:38:56,572 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://hadoop-w-0.c.horton-cluster-3.internal:8188/ws/v1/timeline/
2015-10-22 05:38:56,572 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hadoop-w-0.c.horton-cluster-3.internal/10.240.0.3:8050
2015-10-22 05:38:56,576 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2015-10-22 05:38:56,675 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://hadoop-w-0.c.horton-cluster-3.internal:8188/ws/v1/timeline/
2015-10-22 05:38:56,676 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hadoop-w-0.c.horton-cluster-3.internal/10.240.0.3:8050
2015-10-22 05:38:56,679 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2015-10-22 05:38:56,780 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://hadoop-w-0.c.horton-cluster-3.internal:8188/ws/v1/timeline/
2015-10-22 05:38:56,780 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hadoop-w-0.c.horton-cluster-3.internal/10.240.0.3:8050
2015-10-22 05:38:56,783 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2015-10-22 05:38:56,883 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://hadoop-w-0.c.horton-cluster-3.internal:8188/ws/v1/timeline/
2015-10-22 05:38:56,883 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hadoop-w-0.c.horton-cluster-3.internal/10.240.0.3:8050
2015-10-22 05:38:56,886 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2015-10-22 05:38:56,981 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://hadoop-w-0.c.horton-cluster-3.internal:8188/ws/v1/timeline/
Backend error message
2015-10-22 05:38:56,982 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hadoop-w-0.c.horton-cluster-3.internal/10.240.0.3:8050
2015-10-22 05:38:56,985 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2015-10-22 05:38:57,083 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://hadoop-w-0.c.horton-cluster-3.internal:8188/ws/v1/timeline/
2015-10-22 05:38:57,083 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hadoop-w-0.c.horton-cluster-3.internal/10.240.0.3:8050
2015-10-22 05:38:57,086 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2015-10-22 05:38:57,182 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://hadoop-w-0.c.horton-cluster-3.internal:8188/ws/v1/timeline/
2015-10-22 05:38:57,182 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hadoop-w-0.c.horton-cluster-3.internal/10.240.0.3:8050
2015-10-22 05:38:57,185 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2015-10-22 05:38:57,275 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://hadoop-w-0.c.horton-cluster-3.internal:8188/ws/v1/timeline/
2015-10-22 05:38:57,275 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hadoop-w-0.c.horton-cluster-3.internal/10.240.0.3:8050
2015-10-22 05:38:57,278 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2015-10-22 05:38:57,370 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://hadoop-w-0.c.horton-cluster-3.internal:8188/ws/v1/timeline/
2015-10-22 05:38:57,370 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hadoop-w-0.c.horton-cluster-3.internal/10.240.0.3:8050
2015-10-22 05:38:57,373 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2015-10-22 05:38:57,475 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://hadoop-w-0.c.horton-cluster-3.internal:8188/ws/v1/timeline/
2015-10-22 05:38:57,475 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hadoop-w-0.c.horton-cluster-3.internal/10.240.0.3:8050
2015-10-22 05:38:57,478 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2015-10-22 05:38:57,570 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://hadoop-w-0.c.horton-cluster-3.internal:8188/ws/v1/timeline/
2015-10-22 05:38:57,570 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hadoop-w-0.c.horton-cluster-3.internal/10.240.0.3:8050
2015-10-22 05:38:57,574 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2015-10-22 05:38:57,601 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Some jobs have failed! Stop running all dependent jobs
2015-10-22 05:38:57,602 [main] ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2997: Unable to recreate exception from backed error: Container killed on request. Exit code is 137
Container exited with a non-zero exit code 137
Killed by external signal
Details at logfile: /home/hdfs/workfile/pig_1445492051329.log
2015-10-22 05:38:57,603 [main] ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2244: Job failed, hadoop does not return any error message
Details at logfile: /home/hdfs/workfile/pig_1445492051329.log
2015-10-22 05:38:57,603 [main] ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2244: Job failed, hadoop does not return any error message
Details at logfile: /home/hdfs/workfile/pig_1445492051329.log
2015-10-22 05:38:57,603 [main] ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2244: Job failed, hadoop does not return any error message
Details at logfile: /home/hdfs/workfile/pig_1445492051329.log
2015-10-22 05:38:57,623 [main] INFO org.apache.pig.Main - Pig script completed in 4 minutes, 46 seconds and 401 milliseconds (286401 ms)
And this is what is in the log file:
================================================================================
Pig Stack Trace
---------------
ERROR 2244: Job failed, hadoop does not return any error message
org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job failed, hadoop does not return any error message
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:179)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:234)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:495)
at org.apache.pig.Main.main(Main.java:170)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
================================================================================
Pig Stack Trace
---------------
ERROR 2244: Job failed, hadoop does not return any error message
org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job failed, hadoop does not return any error message
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:179)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:234)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:495)
at org.apache.pig.Main.main(Main.java:170)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
================================================================================
【Discussion】:
-
This could be a memory allocation problem with your job. Are you able to set the MapReduce Java options to give it more memory? If you can get at the job logs in the MR web UI, you may see a more specific message about what is going on. A sketch of one way to do this follows.
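(Exit code 137 usually means the process was killed with SIGKILL (128 + 9); on YARN this is most often the NodeManager killing a container that exceeded its memory limit. One way to try the suggestion above is to raise the container and JVM heap sizes from inside the Pig script. The property names below are the standard Hadoop 2.x ones; the values are only illustrative and would need tuning for the cluster.)

-- Illustrative values only: container size, with a heap set to roughly 80% of it.
SET mapreduce.map.memory.mb 4096;
SET mapreduce.map.java.opts '-Xmx3276m';
SET mapreduce.reduce.memory.mb 8192;
SET mapreduce.reduce.java.opts '-Xmx6553m';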
-
Hey @JasonS, I'm new to Hadoop and have only gotten this far with the help of Stack Overflow. Could you be more specific about what to do?
-
I ran it as pig -useHCatalog -x local script.pig and it gave: java.lang.Exception: org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529) Caused by: org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:248) at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
-
Maharaj: I'm just wondering why we see so many jobs in Job Stats: job_1444401341866_0534, job_1444401341866_0535, job_1444401341866_0536, job_1444401341866_0537. Can you provide more information? And the error you posted above, "No space left on device", only happens when your local server's disk is nearly full.
-
So the issue is that I'm loading two tables: table1 has 100,000 records and table2 has 1,000,000 records. I unpivot table1, do some basic Hive stuff, then join it with table2 and aggregate them. When I run the same routine on a smaller dataset it works fine, but not on the larger dataset.
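(A rough Pig Latin sketch of the pipeline described in this comment; the aliases, column names, and HCatalog table names are hypothetical, since the real script is not posted.)

-- Hypothetical sketch: unpivot table1, join with the large table2, then aggregate.
t1 = LOAD 'matrimony.table1' USING org.apache.hive.hcatalog.pig.HCatLoader();
t2 = LOAD 'matrimony.table2' USING org.apache.hive.hcatalog.pig.HCatLoader();

-- "Unpivot": turn the wide attribute columns of table1 into (id, attr, value) rows.
unpivoted = FOREACH t1 GENERATE id,
            FLATTEN(TOBAG(TOTUPLE('height', height),
                          TOTUPLE('weight', weight))) AS (attr, value);

-- Join against the 1,000,000-row table and aggregate per attribute.
joined  = JOIN unpivoted BY id, t2 BY id;
grouped = GROUP joined BY unpivoted::attr;
result  = FOREACH grouped GENERATE group AS attr, COUNT(joined) AS matches;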
Tags: hadoop mapreduce apache-pig