【Posted】:2015-11-30 14:31:57
【Problem description】:
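We are running RedHat 6.4 on 2 nodes. We installed a fresh Cloudera Manager 5.5.0 and have been trying to create a cluster and add the first node to it (the node initially had no Cloudera components). Unfortunately, during cluster installation, Cloudera Manager gets stuck every time at: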
Installation failed. Failed to receive heartbeat from agent.
Ensure that the host's hostname is configured properly.
Ensure that port 7182 is accessible on the Cloudera Manager Server (check firewall rules).
Ensure that ports 9000 and 9001 are not in use on the host being added.
Check agent logs in /var/log/cloudera-scm-agent/ on the host being added. (Some of the logs can be found in the installation details).
If Use TLS Encryption for Agents is enabled in Cloudera Manager (Administration -> Settings -> Security), ensure that /etc/cloudera-scm-agent/config.ini has use_tls=1 on the host being added. Restart the corresponding agent and click the Retry link here.
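The port-related items in this message can be checked by hand from the host being added. The following is only a minimal sketch: `port_open` is a hypothetical helper name, it relies on bash's `/dev/tcp` redirection and the coreutils `timeout` command, and it assumes the Cloudera Manager Server is the host named `domain.node1.fr.net` in the /etc/hosts file shown further down.

```shell
# Hypothetical helper: succeed (exit 0) if a TCP connection to host:port
# can be opened within 3 seconds, fail otherwise.
port_open() {
    host="$1"
    port="$2"
    # bash treats /dev/tcp/HOST/PORT as a TCP connection when used in a redirection.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}
```

For example, `port_open domain.node1.fr.net 7182 && echo reachable` run on the new node would indicate that the server port is reachable from there, while `port_open 127.0.0.1 9000` succeeding on the new node would suggest that something is already listening on port 9000 (at least on the loopback interface).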
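We looked around and found that this is usually caused by a misconfigured /etc/hosts file. So we edited ours on both the Cloudera Manager host and the new node, then ran service network restart and service cloudera-scm-server restart, but that did not work either. The /etc/hosts file looks like this: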
127.0.0.1 localhost
10.186.80.86 domain.node2.fr.net host
10.186.80.105 domain.node1.fr.net mgrnode
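One way to sanity-check a file like this is to look up which IP address a given name maps to and compare it with what the host reports about itself. The sketch below assumes the usual `IP name [alias...]` layout; `lookup_ip` is a hypothetical helper name, not part of any Cloudera tooling.

```shell
# Hypothetical helper: print the IP of the first entry in a hosts-format
# file ($1) that lists the given name ($2) as its hostname or an alias.
lookup_ip() {
    awk -v name="$2" '
        /^[[:space:]]*#/ { next }            # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break         # stop at an inline comment
                if ($i == name) { print $1; exit }
            }
        }
    ' "$1"
}
```

On each node, the output of `lookup_ip /etc/hosts "$(hostname -f)"` could then be compared with the address actually configured on the network interface; an empty result would mean the fully qualified name reported by the host does not appear in the file at all.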
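We also tried some cleanup before restarting the cluster creation, by deleting scm_prepare_node.* and .scm_prepare_node.lock.
After each failed installation we also checked service cloudera-scm-agent status on the new node and noticed that the service was not running (restarting the service gave the same result):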
service cloudera-scm-agent start
Starting cloudera-scm-agent: [ OK ]
service cloudera-scm-agent status
cloudera-scm-agent dead but pid file exists
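The "dead but pid file exists" status generally means the init script found a pid file whose process is no longer alive. A rough way to confirm that by hand is sketched below; `pid_file_state` is a hypothetical helper, it relies on the Linux /proc filesystem, and the pid file path is deliberately left as an argument because its exact location is not shown in this post.

```shell
# Hypothetical helper: report whether the process recorded in a pid file
# ($1) is still alive. Prints one of: missing, running, stale.
pid_file_state() {
    file="$1"
    if [ ! -f "$file" ]; then
        echo missing
        return
    fi
    # Keep only the digits, in case the file has trailing whitespace.
    pid="$(tr -dc '0-9' < "$file")"
    if [ -n "$pid" ] && [ -d "/proc/$pid" ]; then
        echo running
    else
        echo stale
    fi
}
```

A result of `stale` right after `service cloudera-scm-agent start` reported `[ OK ]` would be consistent with the agent starting and then exiting shortly afterwards.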
Here is the agent log on the new node:
tail -f /var/log/cloudera-scm-agent/cloudera-scm-agent.log
[30/Nov/2015 15:07:27 +0000] 24529 MainThread agent INFO Agent Logging Level: INFO
[30/Nov/2015 15:07:27 +0000] 24529 MainThread agent INFO No command line vars
[30/Nov/2015 15:07:27 +0000] 24529 MainThread agent INFO Missing database jar: /usr/share/java/mysql-connector-java.jar (normal, if you're not using this database type)
[30/Nov/2015 15:07:27 +0000] 24529 MainThread agent INFO Missing database jar: /usr/share/java/oracle-connector-java.jar (normal, if you're not using this database type)
[30/Nov/2015 15:07:27 +0000] 24529 MainThread agent INFO Found database jar: /usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar
[30/Nov/2015 15:07:27 +0000] 24529 MainThread agent INFO Agent starting as pid 24529 user cloudera-scm(420) group cloudera-scm(207).
[30/Nov/2015 15:07:27 +0000] 24529 MainThread agent INFO Because agent not running as root, all processes will run with current user.
[30/Nov/2015 15:07:27 +0000] 24529 MainThread agent WARNING Expected mode 0751 for /var/run/cloudera-scm-agent but was 0755
[30/Nov/2015 15:07:27 +0000] 24529 MainThread agent INFO Re-using pre-existing directory: /var/run/cloudera-scm-agent
[30/Nov/2015 15:07:29 +0000] 24529 MainThread agent INFO Re-using pre-existing directory: /var/run/cloudera-scm-agent/cgroups
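To separate the routine INFO lines from anything more serious in a log like this, the entries can be filtered by level. This is a small sketch that assumes every line follows the layout seen above (timestamp in three whitespace-separated fields, then pid, thread, logger name, and level as the seventh field); `log_problems` is a hypothetical helper name.

```shell
# Hypothetical helper: print only the WARNING, ERROR, and CRITICAL entries
# from an agent log file ($1), based on the level in the seventh field.
log_problems() {
    awk '$7 ~ /^(WARNING|ERROR|CRITICAL)$/' "$1"
}
```

Running `log_problems /var/log/cloudera-scm-agent/cloudera-scm-agent.log` against the excerpt above would leave only the line about the mode of /var/run/cloudera-scm-agent.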
Are we doing something wrong? Thanks in advance for your help!
【Discussion】: