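【Posted】: 2017-05-26 07:39:53
【Question】:
I am new to Spark and am trying to set up a Spark cluster. I did the following to set up the cluster and check its status, but I am not sure whether it is actually working.
I tried opening master-ip:8081 (and 8080, 4040, 4041) in a browser but saw nothing. First, I set up and started the Hadoop cluster.
jps gives: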
2436 SecondaryNameNode
2708 NodeManager
2151 NameNode
5495 Master
2252 DataNode
2606 ResourceManager
5710 Jps
Side question: is it necessary to start Hadoop at all?
On the master, /usr/local/spark/conf/slaves contains:
localhost
slave-node-1
slave-node-2
Now, starting Spark. The master is started with
$SPARK_HOME/sbin/start-master.sh
and checked with
ps -ef|grep spark
hduser 5495 1 0 18:12 pts/0 00:00:04 /usr/local/java/bin/java -cp /usr/local/spark/conf/:/usr/local/spark/jars/*:/usr/local/hadoop/etc/hadoop/ -Xmx1g org.apache.spark.deploy.master.Master --host master-hostname --port 7077 --webui-port 8080
On slave node 1:
$SPARK_HOME/sbin/start-slave.sh spark://205.147.102.19:7077
Checked with:
ps -ef|grep spark
hduser 1847 1 20 18:24 pts/0 00:00:04 /usr/local/java/bin/java -cp /usr/local/spark/conf/:/usr/local/spark/jars/* -Xmx1g org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://master-ip:7077
The same on slave node 2:
$SPARK_HOME/sbin/start-slave.sh spark://master-ip:7077
ps -ef|grep spark
hduser 1948 1 3 18:18 pts/0 00:00:03 /usr/local/java/bin/java -cp /usr/local/spark/conf/:/usr/local/spark/jars/* -Xmx1g org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://master-ip:7077
I cannot see anything in Spark's web console, so I suspect the firewall may be the problem. Here is my iptables configuration:
iptables -L -nv
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target        prot opt in    out   source           destination
 6136  587K fail2ban-ssh  tcp  --  *     *     0.0.0.0/0        0.0.0.0/0      multiport dports 22
 151K   25M ACCEPT        all  --  *     *     0.0.0.0/0        0.0.0.0/0      state RELATED,ESTABLISHED
    6   280 ACCEPT        icmp --  *     *     0.0.0.0/0        0.0.0.0/0
  579 34740 ACCEPT        all  --  lo    *     0.0.0.0/0        0.0.0.0/0
34860 2856K ACCEPT        all  --  eth1  *     0.0.0.0/0        0.0.0.0/0
  145  7608 ACCEPT        tcp  --  *     *     0.0.0.0/0        0.0.0.0/0      state NEW tcp dpt:22
56156 5994K REJECT        all  --  *     *     0.0.0.0/0        0.0.0.0/0      reject-with icmp-host-prohibited
    0     0 ACCEPT        tcp  --  *     *     0.0.0.0/0        0.0.0.0/0      tcp dpt:8080
    0     0 ACCEPT        tcp  --  *     *     0.0.0.0/0        0.0.0.0/0      tcp dpt:8081

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target        prot opt in    out   source           destination
    0     0 REJECT        all  --  *     *     0.0.0.0/0        0.0.0.0/0      reject-with icmp-host-prohibited

Chain OUTPUT (policy ACCEPT 3531 packets, 464K bytes)
 pkts bytes target        prot opt in    out   source           destination

Chain fail2ban-ssh (1 references)
 pkts bytes target        prot opt in    out   source           destination
    2   120 REJECT        all  --  *     *     218.87.109.153   0.0.0.0/0      reject-with icmp-port-unreachable
 5794  554K RETURN        all  --  *     *     0.0.0.0/0        0.0.0.0/0
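One thing worth noting about the listing above: iptables walks a chain from top to bottom and stops at the first terminating rule that matches, and the two ACCEPT rules for ports 8080 and 8081 sit below the catch-all REJECT with packet counters of 0. The Python sketch below models only that ordering, under simplifying assumptions. The `Rule` class, the `first_match` helper, and the interface name `eth0` are hypothetical and not taken from the real host; the fail2ban-ssh and RELATED,ESTABLISHED rules are left out because they would not apply to a new connection to port 8080.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Rule:
    target: str                   # "ACCEPT" or "REJECT"
    proto: str = "all"            # "all", "tcp" or "icmp"
    in_iface: str = "*"           # "*" matches any inbound interface
    dport: Optional[int] = None   # None matches any destination port


# Simplified copy of the INPUT chain, in the order shown in the listing.
# Only new inbound packets are modeled.
INPUT_CHAIN: List[Rule] = [
    Rule("ACCEPT", proto="icmp"),
    Rule("ACCEPT", in_iface="lo"),
    Rule("ACCEPT", in_iface="eth1"),
    Rule("ACCEPT", proto="tcp", dport=22),
    Rule("REJECT"),                          # catch-all
    Rule("ACCEPT", proto="tcp", dport=8080),
    Rule("ACCEPT", proto="tcp", dport=8081),
]


def first_match(chain: List[Rule], proto: str, in_iface: str, dport: int,
                policy: str = "ACCEPT") -> Tuple[Optional[int], str]:
    """Return (rule index, target) of the first matching rule.

    Falls back to (None, policy) when no rule matches.
    """
    for index, rule in enumerate(chain):
        if rule.proto not in ("all", proto):
            continue
        if rule.in_iface not in ("*", in_iface):
            continue
        if rule.dport is not None and rule.dport != dport:
            continue
        return index, rule.target
    return None, policy


# A new TCP connection to 8080 on a hypothetical interface "eth0":
print(first_match(INPUT_CHAIN, "tcp", "eth0", 8080))  # (4, 'REJECT')
# The same connection arriving on eth1 is accepted earlier in the chain:
print(first_match(INPUT_CHAIN, "tcp", "eth1", 8080))  # (2, 'ACCEPT')
```

Under this model, a connection to 8080 that arrives on any interface other than lo or eth1 reaches the REJECT rule before the ACCEPT rule for that port. Whether this applies to the real host depends on which interface the browser traffic actually arrives on.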
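I am doing everything I can to find out whether the Spark cluster is set up and how to check it properly. If the cluster is set up, why can't I inspect it in the web console? What could be wrong? Any pointers would help...
Edit: adding logs taken after running the spark-shell --master local command (on the master)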
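As one possible way to check the cluster without a browser, the master log further down shows a `/json` handler registered on the web UI port. The Python sketch below fetches that endpoint and counts workers. It is a sketch under assumptions: the field names `status`, `workers`, and `state` are what the standalone master is expected to return and should be verified against the actual response, and `fetch_master_state` and `summarize_master_state` are hypothetical helper names.

```python
import json
from urllib.request import urlopen


def fetch_master_state(base_url: str, timeout: float = 5.0) -> dict:
    """Fetch the standalone master's JSON status from <base_url>/json."""
    with urlopen(base_url.rstrip("/") + "/json", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize_master_state(state: dict) -> dict:
    """Count registered and alive workers from the parsed status.

    Assumes each worker entry carries a "state" field such as "ALIVE".
    """
    workers = state.get("workers", [])
    alive = [w for w in workers if w.get("state") == "ALIVE"]
    return {
        "status": state.get("status"),
        "workers": len(workers),
        "alive_workers": len(alive),
    }


# Example usage, run on the master itself so that no firewall is involved:
#   state = fetch_master_state("http://localhost:8080")
#   print(summarize_master_state(state))
```

If the request succeeds locally on the master but not from another machine, that would point toward the network or firewall rather than Spark itself.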
17/01/11 18:12:46 INFO util.Utils: Successfully started service 'sparkMaster' on port 7077.
17/01/11 18:12:47 INFO master.Master: Starting Spark master at spark://master:7077
17/01/11 18:12:47 INFO master.Master: Running Spark version 2.1.0
17/01/11 18:12:47 INFO util.log: Logging initialized @3326ms
17/01/11 18:12:47 INFO server.Server: jetty-9.2.z-SNAPSHOT
17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@20f0b5ff{/app,null,AVAILABLE}
17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@734e74b2{/app/json,null,AVAILABLE}
17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1bc45d76{/,null,AVAILABLE}
17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6a274a23{/json,null,AVAILABLE}
17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4f5d45d5{/static,null,AVAILABLE}
17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4fb65368{/app/kill,null,AVAILABLE}
17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@76208805{/driver/kill,null,AVAILABLE}
17/01/11 18:12:47 INFO server.ServerConnector: Started ServerConnector@258dbadd{HTTP/1.1}{0.0.0.0:8080}
17/01/11 18:12:47 INFO server.Server: Started @3580ms
17/01/11 18:12:47 INFO util.Utils: Successfully started service 'MasterUI' on port 8080.
17/01/11 18:12:47 INFO ui.MasterWebUI: Bound MasterWebUI to 0.0.0.0, and started at http://master:8080
17/01/11 18:12:47 INFO server.Server: jetty-9.2.z-SNAPSHOT
17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1cfbb7e9{/,null,AVAILABLE}
17/01/11 18:12:47 INFO server.ServerConnector: Started ServerConnector@2f7af4e{HTTP/1.1}{master:6066}
17/01/11 18:12:47 INFO server.Server: Started @3628ms
17/01/11 18:12:47 INFO util.Utils: Successfully started service on port 6066.
17/01/11 18:12:47 INFO rest.StandaloneRestServer: Started REST server for submitting applications on port 6066
17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@799d5f4f{/metrics/master/json,null,AVAILABLE}
17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@647c46e3{/metrics/applications/json,null,AVAILABLE}
17/01/11 18:12:47 INFO master.Master: I have been elected leader! New state: ALIVE
On the slave node:
17/01/11 18:22:46 INFO Worker: Connecting to master master:7077...
17/01/11 18:22:46 WARN Worker: Failed to connect to master master:7077
(lots of Java errors...)
17/01/11 18:31:18 ERROR Worker: All masters are unresponsive! Giving up.
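The worker log above reports a failure to connect to master:7077, so one check that may help is whether the name `master` resolves on the slave node and whether a TCP connection to port 7077 can be opened at all. The Python sketch below separates those two cases; `probe` is a hypothetical helper and the result strings are arbitrary labels chosen for this example.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Return "ok", "dns-failure" or "connect-failure" for host:port."""
    try:
        candidates = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # The name could not be resolved (check /etc/hosts or DNS).
        return "dns-failure"
    for family, socktype, proto, _canonname, addr in candidates:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(addr)
                return "ok"
        except OSError:
            # Refused, timed out or unreachable; try the next address.
            continue
    return "connect-failure"


# Example usage on a slave node:
#   print(probe("master", 7077))
#   print(probe("master", 8080))
```

A "dns-failure" would suggest a hostname mapping problem on the slave, while a "connect-failure" would be consistent with the port being blocked or the master listening on a different address.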
【Discussion】:
Tags: hadoop apache-spark cluster-computing iptables