【Question Title】: Oozie Jobs NOT Running - Getting SUSPENDED
【Posted】: 2025-12-15 05:00:02
【Question】:

I am running Hadoop with Oozie in pseudo-distributed mode (I am not using any Hadoop distribution such as CDH or Hortonworks). My runtime setup is: a Fedora 22 VM running on VirtualBox with 4 GB of RAM allocated, Hadoop 2.7, Oozie 4.2.

After I submit Oozie's sample MapReduce job, it goes into SUSPENDED state with the following job error:

2015-10-29 15:44:59,048  WARN ActionStartXCommand:523 - SERVER[hadoop] USER[hadoop] GROUP[-] TOKEN[] APP[map-reduce-wf] JOB[0000000-151029154441128-OOZIE-VB-W] ACTION[0000000-151029154441128-OOZIE-VB-W@mr-node] Error starting action [mr-node]. ErrorType [TRANSIENT], ErrorCode [JA009], Message [JA009: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, requested memory < 0, or requested memory > max configured, requestedMemory=2048, maxMemory=1024
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:204)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:385)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:328)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:281)
at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:580)
at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:218)
at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:419)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)]
org.apache.oozie.action.ActionExecutorException: JA009: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, requested memory < 0, or requested memory > max configured, requestedMemory=2048, maxMemory=1024
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:204)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:385)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:328)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:281)
at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:580)
at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:218)
at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:419)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)

at org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:456)
at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:440)
at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1132)
at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1286)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:250)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:64)
at org.apache.oozie.command.XCommand.call(XCommand.java:286)
at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:321)
at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:250)
at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

I believe this is related to memory allocation for the MapReduce job, but I cannot work out the exact math behind it. Any help is much appreciated.
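For what it's worth, the arithmetic may already be in the error message: the rejected container request is 2048 MB (`requestedMemory=2048`), while YARN's maximum allocation on this node is 1024 MB (`maxMemory=1024`). Since mapred-site.xml below only asks for 512 MB tasks, the 2048 MB request plausibly comes from the Oozie launcher job. A sketch of one way to shrink it, using Oozie's standard `oozie.launcher.*` prefix inside the workflow action's `<configuration>`; the specific values here are assumptions, not a verified fix:

```xml
<!-- Hypothetical snippet for the <map-reduce> action in workflow.xml -->
<configuration>
    <!-- Keep the Oozie launcher's own container under YARN's 1024 MB cap -->
    <property>
        <name>oozie.launcher.mapreduce.map.memory.mb</name>
        <value>512</value>
    </property>
    <property>
        <name>oozie.launcher.yarn.app.mapreduce.am.resource.mb</name>
        <value>512</value>
    </property>
</configuration>
```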

mapred-site.xml

  <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
  </property>

  <property>
      <name>mapreduce.map.memory.mb</name>
      <value>512</value>
  </property>

  <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>512</value>
  </property>

  <property>
      <name>mapreduce.jobtracker.address</name>
      <value>http://localhost:50031</value>
  </property>

  <property>
      <name>mapreduce.jobtracker.http.address</name>
      <value>http://localhost:50030</value>
  </property>

  <property>
      <name>mapreduce.jobtracker.jobhistory.location</name>
      <value>/home/osboxes/hadoop/logs/jobhistory</value>
  </property>

  <property>
      <name>mapreduce.jobhistory.address</name>
      <value>http://localhost:10020</value>
  </property>

  <property>
     <name>mapreduce.jobhistory.intermediate-done-dir</name>
     <value>/home/osboxes/hadoop/mr-history/temp</value>
  </property>

  <property>
     <name>mapreduce.jobhistory.done-dir</name>
     <value>/home/osboxes/hadoop/mr-history/done</value>
  </property>

  <property>
     <name>mapreduce.cluster.local.dir</name>
     <value>/home/osboxes/hadoop/dfs/local</value>
  </property>

  <property>
     <name>mapreduce.jobtracker.system.dir</name>
     <value>/home/osboxes/hadoop/dfs/system</value>
  </property>

yarn-site.xml

  <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
      <description> Execution Framework </description>
  </property>

  <property>
      <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
      <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>

  <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>1024</value>
  </property>  
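Given the error (`requestedMemory=2048` vs `maxMemory=1024`) and a 4 GB VM, the other direction is to raise YARN's ceilings so that a 2048 MB request fits. A minimal yarn-site.xml sketch, assuming roughly 3 GB can be handed to the NodeManager; the property names are standard YARN 2.x, the values are assumptions to be tuned:

```xml
<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>3072</value>  <!-- total RAM the NodeManager may allocate -->
</property>
<property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2048</value>  <!-- largest single container; must cover the 2048 MB request -->
</property>
<property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>256</value>   <!-- smallest container granularity -->
</property>
```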

Edit, October 30, 2015

core-site.xml

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:9000</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/tmp/hadoop-${user.name}</value>
</property>
<property>
  <name>hadoop.proxyuser.hadoop.hosts</name>
  <value>*</value>
</property>

<property>
  <name>hadoop.proxyuser.hadoop.groups</name>
  <value>*</value>
</property>

【Comments】:

  • AFAIK the job properties involving RAM are mapreduce.map.memory.mb plus mapreduce.map.java.opts (which may contain -Xmx requirements), mapreduce.reduce.memory.mb plus mapreduce.reduce.java.opts, and yarn.app.mapreduce.am.resource.mb plus yarn.app.mapreduce.am.command-opts -- plus, optionally, yarn.scheduler.minimum-allocation-mb / yarn.scheduler.maximum-allocation-mb if you have the Fair or Capacity scheduler enabled.
  • Then, when using Hive and/or Tez: hive.tez.container.size plus hive.tez.java.opts, and tez.am.resource.memory.mb plus tez.am.java.opts.
  • @SamsonScharfrichter - I am aware of the properties around memory allocation, but I cannot arrive at a working set of numbers for them so that the job runs without any errors. It would be helpful if you could point me to a solution rather than telling me to research it, which is not the purpose of this platform.
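To make the pairing described in the first comment concrete: each *.memory.mb value is the YARN container size, and the matching *.java.opts heap (-Xmx) should sit comfortably below it (a common rule of thumb is around 80%). A mapred-site.xml sketch with hypothetical values sized to fit under a 1024 MB maximum allocation:

```xml
<property>
    <name>mapreduce.map.memory.mb</name>
    <value>512</value>            <!-- container size for map tasks -->
</property>
<property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx410m</value>       <!-- JVM heap, ~80% of the container -->
</property>
<property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>512</value>            <!-- container size for reduce tasks -->
</property>
<property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx410m</value>
</property>
<property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>1024</value>           <!-- MR ApplicationMaster container -->
</property>
<property>
    <name>yarn.app.mapreduce.am.command-opts</name>
    <value>-Xmx820m</value>
</property>
```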

Tags: hadoop oozie


【Solution 1】:

This looks like a user/group permissions issue. Try running the job as the Oozie user.

【Discussion】:

  • I don't think so; otherwise the logs would say something about an "authentication failure", which I cannot find anywhere. I am adding my core-site.xml as an edit above, just in case.