Environment: Flink 1.9.0 + Hadoop 2.8.5 + CentOS 7
Flink HA cluster plan
| Host | IP | Description |
| centoshadoop1 | 192.168.227.140 | StandaloneSessionClusterEntrypoint (master node process name) |
| centoshadoop2 | 192.168.227.141 | StandaloneSessionClusterEntrypoint (master node process name) |
| centoshadoop3 | 192.168.227.142 | TaskManagerRunner (worker node process name) |
| centoshadoop4 | 192.168.227.143 | TaskManagerRunner (worker node process name) |
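(1) To build a highly available Flink cluster (in YARN mode, which is what most production environments use), you first need a working Flink on YARN cluster. See https://blog.csdn.net/u014635374/article/details/105704524 for how to set one up.
(2) The steps for setting up Flink HA are as follows
Setting up the Flink on YARN cluster
Flink on YARN HA relies on YARN's application recovery mechanism.
ZooKeeper is also required. Although Flink on YARN depends on YARN's recovery mechanism, a Flink job needs the snapshots produced by checkpoints when it recovers. The snapshots themselves are stored on HDFS, but their metadata is kept in ZooKeeper, so ZooKeeper must be configured as well.
Edit yarn-site.xml in the Hadoop configuration and set the maximum number of application attempts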
cd /home/hadoop/hadoop-ha/hadoop/hadoop-2.8.5/etc/hadoop
<!-- Maximum number of application attempts -->
<property>
<name>yarn.resourcemanager.am.max-attempts</name>
<value>4</value>
</property>
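To confirm the edit took effect, the value can be read back from the file. The following is a minimal sketch, not part of the original setup: `get_yarn_prop` is a hypothetical helper name, and it assumes `<name>` and `<value>` sit on consecutive lines, as they do in the snippet above.

```shell
# get_yarn_prop FILE NAME (hypothetical helper)
# Print the <value> on the line that follows <name>NAME</name> in a Hadoop XML
# config file. Assumes <name> and <value> are on consecutive lines.
get_yarn_prop() {
  local file="$1" name="$2"
  sed -n "/<name>${name}<\/name>/{n;s|.*<value>\(.*\)</value>.*|\1|p;}" "$file"
}
```

For example, `get_yarn_prop yarn-site.xml yarn.resourcemanager.am.max-attempts` should print the configured number of attempts.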
Distribute it to each Hadoop node
scp -r yarn-site.xml [email protected]:/home/hadoop/hadoop-ha/hadoop/hadoop-2.8.5/etc/hadoop
scp -r yarn-site.xml [email protected]3:/home/hadoop/hadoop-ha/hadoop/hadoop-2.8.5/etc/hadoop
scp -r yarn-site.xml [email protected]4:/home/hadoop/hadoop-ha/hadoop/hadoop-2.8.5/etc/hadoop
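The same copy step recurs for every configuration file in this post, so it can be wrapped in a loop. This is only a sketch under assumptions: `distribute` and `DRY_RUN` are hypothetical names, and it relies on the current local user also existing on every target host with SSH access (scp falls back to the local user name when none is given).

```shell
# distribute FILE DEST_DIR HOST... (hypothetical helper)
# Copy FILE into DEST_DIR on every HOST. With DRY_RUN=1 it only prints the
# scp commands instead of running them.
distribute() {
  local file="$1" dest="$2" host
  shift 2
  for host in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "scp $file $host:$dest"
    else
      scp "$file" "$host:$dest" || return 1
    fi
  done
}
```

A dry run such as `DRY_RUN=1 distribute yarn-site.xml /home/hadoop/hadoop-ha/hadoop/hadoop-2.8.5/etc/hadoop centoshadoop2 centoshadoop3 centoshadoop4` lists the commands before anything is copied.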
Note: after copying, remember to update the following property in yarn-site.xml on centoshadoop2, which runs the standby ResourceManager
cd /home/hadoop/hadoop-ha/hadoop/hadoop-2.8.5/etc/hadoop
<property>
<name>yarn.resourcemanager.ha.id</name>
<value>rm1</value>
<description>If we want to launch more than one RM in single node,we need this configuration</description>
</property>
Change rm1 to rm2
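The change can also be scripted rather than made by hand in an editor. The sketch below is an assumption about how one might do it, not the method used in this post: `set_rm_id` is a hypothetical helper, and it expects the `<value>` line to directly follow the `<name>` line, as in the property shown above.

```shell
# set_rm_id FILE NEW_ID (hypothetical helper)
# Rewrite the <value> that directly follows
# <name>yarn.resourcemanager.ha.id</name>; all other properties are untouched.
set_rm_id() {
  local file="$1" new_id="$2" tmp
  tmp="$(mktemp)" || return 1
  sed "/<name>yarn.resourcemanager.ha.id<\/name>/{n;s|<value>[^<]*</value>|<value>${new_id}</value>|;}" \
    "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
  # Write back through cat so the original file keeps its permissions.
  cat "$tmp" > "$file" && rm -f "$tmp"
}
```

On centoshadoop2 this would be run as `set_rm_id yarn-site.xml rm2` from the Hadoop configuration directory.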
Edit the masters configuration file (vi masters)
centoshadoop1:8085
centoshadoop2:8085
Distribute this configuration file to each Flink node
cd /home/hadoop/flink/flink-1.9.0/conf
scp -r masters [email protected]:/home/hadoop/flink/flink-1.9.0/conf
scp -r masters [email protected]3:/home/hadoop/flink/flink-1.9.0/conf
scp -r masters [email protected]:/home/hadoop/flink/flink-1.9.0/conf
Edit the flink-conf.yaml configuration file
high-availability: zookeeper
high-availability.zookeeper.quorum: node-1:2181,node-2:2181,node-3:2181
high-availability.storageDir: hdfs://mycluster/flink/cluster_yarn
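A missing or commented-out HA key is an easy mistake to make here, so a quick check before distributing the file may help. This is a sketch only: `check_ha_conf` is a hypothetical helper that looks for the three keys listed above and nothing else.

```shell
# check_ha_conf FILE (hypothetical helper)
# Succeed only if the three HA keys used in this post are present,
# uncommented, and have a non-empty value. Prints each missing key.
check_ha_conf() {
  local file="$1" key missing=0
  for key in high-availability \
             high-availability.zookeeper.quorum \
             high-availability.storageDir; do
    if ! grep -Eq "^[[:space:]]*${key}:[[:space:]]*[^[:space:]#]" "$file"; then
      echo "missing: $key"
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_ha_conf flink-conf.yaml` in the conf directory prints nothing and returns 0 when all three keys are set.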
cd /home/hadoop/flink/flink-1.9.0/conf
scp -r flink-conf.yaml [email protected]:/home/hadoop/flink/flink-1.9.0/conf
scp -r flink-conf.yaml [email protected]3:/home/hadoop/flink/flink-1.9.0/conf
scp -r flink-conf.yaml [email protected]2:/home/hadoop/flink/flink-1.9.0/conf
cd /home/hadoop/flink/flink-1.9.0/lib
scp -r flink-shaded-hadoop-2-uber-2.8.3-10.0.jar [email protected]:/home/hadoop/flink/flink-1.9.0/lib
scp -r flink-shaded-hadoop-2-uber-2.8.3-10.0.jar [email protected]:/home/hadoop/flink/flink-1.9.0/lib
scp -r flink-shaded-hadoop-2-uber-2.8.3-10.0.jar [email protected]4:/home/hadoop/flink/flink-1.9.0/lib
Start the Flink cluster
bin/start-cluster.sh
Run the following command to test:
./bin/flink run -m yarn-cluster -yn 4 ./examples/batch/WordCount.jar
Part of the log output:
2020-04-23 21:21:59,403 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster to be allocated
2020-04-23 21:21:59,405 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, current state ACCEPTED
2020-04-23 21:22:05,737 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has been deployed successfully.
Starting execution of program
Executing WordCount example with default input data set.
Use --input to specify file input.
Printing result to stdout. Use --output to specify output path.
(action,1)
(after,1)
(against,1)
(and,12)
High-availability test
On node centoshadoop1, stop the JobManager with the following command
bin/jobmanager.sh stop
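Before re-running the job, it can be worth confirming that the JobManager on centoshadoop1 has really stopped. A minimal sketch, assuming the JDK's `jps` tool is on the PATH. `count_jobmanagers` is a hypothetical helper that counts the master process name from the planning table in whatever `jps` output it is given.

```shell
# count_jobmanagers (hypothetical helper)
# Read `jps` output on stdin and print how many lines name the
# StandaloneSessionClusterEntrypoint process. Prints 0 when none is found.
count_jobmanagers() {
  # grep -c exits non-zero on zero matches; `|| true` keeps the status clean.
  grep -c 'StandaloneSessionClusterEntrypoint' || true
}
```

Usage: `jps | count_jobmanagers` on centoshadoop1 should print 0 after the stop command, while the same check on centoshadoop2 should still report a running master.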
Run the following command again:
./bin/flink run -m yarn-cluster -yn 4 ./examples/batch/WordCount.jar
Part of the log output:
2020-04-23 21:21:59,403 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster to be allocated
2020-04-23 21:21:59,405 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, current state ACCEPTED
2020-04-23 21:22:05,737 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has been deployed successfully.
Starting execution of program
Executing WordCount example with default input data set.
Use --input to specify file input.
Printing result to stdout. Use --output to specify output path.
(action,1)
(after,1)
(against,1)
(and,12)
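The job produces the expected output, which shows that the YARN-based high-availability cluster was set up successfully.
Have a nice day! Over!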