【Question title】: Cloudera Manager installation failed to receive heartbeat from agent - to add new hosts to cluster
【Posted】: 2013-10-09 15:38:34
【Question】:

I tried to install Cloudera Manager (Standard edition) on Ubuntu 12.04.1 LTS. When I try to add a new host, I get the following error:

Installation failed.Failed to receive heartbeat from agent.
Ensure that the host's hostname is configured properly.
Ensure that port 7182 is accesible on the Cloudera Manager server (check firewall rules).
Ensure that ports 9000 an 9001 are free on the host being added.
Check agent logs in /var/log/cloudera-scm-agent/ on the host being added (some of the logs can be found in the installation details).

My /etc/hosts file is configured as follows:

127.0.0.1 localhost
127.0.0.1 hadoop-ubuntu
192.168.5.xyz hadoop-ubuntu.dana.local hadoop-ubuntu
192.168.3.xyz ro-m81.dana.local ro-m81
192.168.3.abc ro-m41.dana.local ro-m41

# The following lines are desirable for IPv6 capable hosts

::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters     
The **/var/log/cloudera-scm-agent/cloudera-scm-agent.log** file shows the following error:
[09/Oct/2013 16:04:23 +0000] 4532 MainThread agent ERROR Heartbeating to 192.168.5.xyz:7182 failed.
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/src/cmf/agent.py", line 747, in send_heartbeat
    response = self.requestor.request('heartbeat', dict(request=heartbeat))
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/avro-1.6.3-py2.6.egg/avro/ipc.py", line 145, in request
    return self.issue_request(call_request, message_name, request_datum)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/avro-1.6.3-py2.6.egg/avro/ipc.py", line 256, in issue_request
    call_response = self.transceiver.transceive(call_request)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/avro-1.6.3-py2.6.egg/avro/ipc.py", line 485, in transceive
    result = self.read_framed_message()
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/avro-1.6.3-py2.6.egg/avro/ipc.py", line 489, in read_framed_message
    response = self.conn.getresponse()
  File "/usr/lib64/python2.6/httplib.py", line 990, in getresponse
    response.begin()
  File "/usr/lib64/python2.6/httplib.py", line 391, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.6/httplib.py", line 349, in _read_status
    line = self.fp.readline()
  File "/usr/lib64/python2.6/socket.py", line 433, in readline
    data = recv(1)
error: [Errno 104] Connection reset by peer

Please help me figure out why I am getting this error, or what I am missing.

【Comments】:

  • Has anyone found a better solution than editing the configuration files?

Tags: hadoop cloudera cloudera-manager


【Solution 1】:

I had the same problem. Here is what did the trick for me.

Run ifconfig and find your IP address (the real one, not 127.0.0.1).

Run hostname to find your hostname.

Edit the /etc/hosts file.

Add an entry there for your IP address, like this:

192.168.8.xxx   hostname.test.com   hostname

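Restart the Cloudera services. Go to sonic.test.com:7180 and try again. It should work. Even if it does not, you can go to http://hostname.test.com:7180/cmf/home to check the host status.

It turned out that even though I got the heartbeat error, the host was actually up and running.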
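
The ifconfig and hostname steps above can also be scripted. The following is a minimal Python sketch, using only the standard library, that reports what the machine's hostname resolves to and warns if that is a loopback address. The function names `is_loopback` and `resolve` are made up for illustration and are not part of Cloudera Manager.

```python
import ipaddress
import socket


def is_loopback(address):
    """Return True if the address string is a loopback address such as 127.0.0.1."""
    return ipaddress.ip_address(address).is_loopback


def resolve(name):
    """Resolve a hostname to an IPv4 address string, or None if resolution fails."""
    try:
        return socket.gethostbyname(name)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


if __name__ == "__main__":
    name = socket.gethostname()
    address = resolve(name)
    print("hostname:   ", name)
    print("fqdn:       ", socket.getfqdn())
    print("resolves to:", address)
    if address is None:
        print("The hostname does not resolve; add an entry for it to /etc/hosts.")
    elif is_loopback(address):
        print("The hostname resolves to a loopback address; "
              "map it to the LAN IP in /etc/hosts instead.")
```

Run it on the host being added. If it prints the loopback warning, the /etc/hosts entry described above is probably what is missing.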

【Discussion】:

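【Solution 2】:

I ran into the same problem and then found a solution.

I used two machines, one as the master and the other as the slave.

The master is the machine that runs cloudera-scm-server.

I configured /etc/hosts on both machines, and the error finally went away.

The master's IP is 192.168.1.10.

/etc/hosts on the master machine: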
    
    127.0.0.1       localhost
    
    192.168.1.10     <hostname>
    

The slave's IP is 192.168.1.8.

/etc/hosts on the slave machine:
    
    127.0.0.1       localhost
    
    192.168.1.8     <hostname>
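
To check that a hosts file follows the layout above, i.e. that the machine's own name is not mapped to a loopback address, something like the following Python sketch could be used. It is an illustration only: the function name `loopback_names` is hypothetical, and skipping the `localhost` and `ip6-` names is an assumption based on the stock Ubuntu hosts file.

```python
import ipaddress


def loopback_names(hosts_text):
    """Return host names (other than the usual localhost/ip6-* aliases)
    that a hosts file maps to a loopback address."""
    names = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        address, hostnames = fields[0], fields[1:]
        try:
            loopback = ipaddress.ip_address(address).is_loopback
        except ValueError:
            continue  # not a valid IP address; ignore the line
        if loopback:
            names.extend(
                name for name in hostnames
                if not name.startswith(("localhost", "ip6-"))
            )
    return names


if __name__ == "__main__":
    with open("/etc/hosts") as handle:
        suspicious = loopback_names(handle.read())
    if suspicious:
        print("Names mapped to a loopback address:", ", ".join(suspicious))
    else:
        print("No host names are mapped to a loopback address.")
```

Any name it prints is a candidate for the kind of entry that the two files above avoid.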
    

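【Discussion】:

【Solution 3】:

After checking the hosts file on every node in the cluster, make sure that ports 7180 and 7182 are open on the installer (server) machine, and that port 9000 is open on the cluster nodes (all machines other than the installer).

I kept getting an "Inspector failed. IO exception thrown" error from the Cloudera installation until I looked at the installer (server) logs and saw that the client could not communicate on port 9000.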
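
Whether those ports are reachable can be tested with a plain TCP connection attempt. Below is a rough Python sketch using only the standard library. The names `can_connect` and `check_ports` are made up for this example, and a successful connection only shows that something is listening and that no firewall blocks the port, not that the service behind it is healthy.

```python
import socket
import sys


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, name not resolved
        return False


def check_ports(host, ports):
    """Map each port to True/False depending on whether it accepts connections."""
    return {port: can_connect(host, port) for port in ports}


if __name__ == "__main__":
    if len(sys.argv) < 3:
        print("usage: check_ports.py HOST PORT [PORT ...]")
        sys.exit(2)
    target = sys.argv[1]
    wanted = [int(arg) for arg in sys.argv[2:]]
    for port, reachable in check_ports(target, wanted).items():
        print("{}:{} {}".format(target, port, "open" if reachable else "NOT reachable"))
```

For example, run it from a cluster node against the server with ports 7180 and 7182, and from the server against each node with port 9000.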

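【Discussion】:

【Solution 4】:

I had the same problem as you, and I finally solved it.

In my case, the version of cloudera-scm-agent on the agent host did not match the version of cloudera-scm-server on the server. You can check this yourself with dpkg or yum.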
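
As an illustration of that check, here is a small Python sketch that asks the package manager for the installed version of each package. It assumes `dpkg-query` (Debian/Ubuntu) or `rpm` (the database behind yum) is on the PATH; the function name `installed_version` is hypothetical. Run it on both the server and the agent host and compare the output by hand.

```python
import shutil
import subprocess

# Query commands to try, in order; the package name is appended to each.
QUERY_COMMANDS = [
    ["dpkg-query", "-W", "-f=${Version}"],
    ["rpm", "-q", "--queryformat", "%{VERSION}-%{RELEASE}"],
]


def installed_version(package):
    """Return the installed version string of a package, or None if it is
    not installed or no supported package tool is available."""
    for command in QUERY_COMMANDS:
        if shutil.which(command[0]) is None:
            continue  # this tool is not present on the machine
        result = subprocess.run(
            command + [package],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
        )
        version = result.stdout.strip()
        if result.returncode == 0 and version:
            return version
    return None


if __name__ == "__main__":
    for package in ("cloudera-scm-server", "cloudera-scm-agent"):
        print(package, "->", installed_version(package) or "not installed")
```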

【Discussion】:

【Solution 5】:

1. First, check whether the Cloudera SCM agent is running, using "sudo service cloudera-scm-agent status".

2. Check the agent log files in the /var/log/cloudera-scm-agent/ directory.

Reference: http://commandstech.com/what-is-heartbeat-in-hadoop-how-to-resolve-heartbeat-lost-in-cloudera-and-hortonworks/
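
For step 2, the ERROR entries can be pulled out of the agent log with a short script. The Python sketch below is only a rough aid: the line format in the regular expression is an assumption inferred from the single log line quoted in the question, and the function name `error_entries` is made up for this example.

```python
import re
import sys

# Assumed layout: [timestamp] pid thread module LEVEL message
LOG_LINE = re.compile(
    r"^\[(?P<time>[^\]]+)\] +(?P<pid>\d+) +(?P<thread>\S+) +"
    r"(?P<module>\S+) +(?P<level>[A-Z]+) +(?P<message>.*)$"
)


def error_entries(log_text):
    """Return (timestamp, message) pairs for every ERROR line in the log text."""
    entries = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match and match.group("level") == "ERROR":
            entries.append((match.group("time"), match.group("message")))
    return entries


if __name__ == "__main__":
    default_path = "/var/log/cloudera-scm-agent/cloudera-scm-agent.log"
    path = sys.argv[1] if len(sys.argv) > 1 else default_path
    with open(path, errors="replace") as handle:
        for timestamp, message in error_entries(handle.read()):
            print(timestamp, "|", message)
```

Traceback lines do not match the pattern and are skipped, so the output is a compact list of when each error occurred and what it said.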

【Discussion】:
