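I found the following approach online: monitor the zk process, and restart zk if the process is gone.
There is one case it cannot handle: when zk hangs, the process is still alive, but many TCP connections pile up in the CLOSE_WAIT state, so clients can no longer connect to zk!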
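#!/bin/sh
# Watchdog: every 60 seconds, restart ZooKeeper if no matching process exists.
while true; do
    date
    # grep -v grep drops the grep command itself from the process list
    if ! ps -ef | grep zookeeper | grep -v grep > /dev/null; then
        echo ">>>>zookeeper has shutdown"
        echo ">>>>restart zookeeper now !"
        # assumes zkServer.sh is in the current working directory
        sh zkServer.sh start
    else
        echo ">>>>zookeeper is running..."
    fi
    sleep 60
done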
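
For the hung case described above, one possible direction is to probe the server itself instead of only looking for the process. The sketch below is not part of the scheme I found; it is a rough idea under several assumptions: the client port is 2181, the "ruok" four-letter command is enabled on the server (newer ZooKeeper versions require it to be whitelisted), and nc and ss are installed. The names count_close_wait, zk_answers, zk_healthy, watch_loop, and the MAX_CLOSE_WAIT threshold of 100 are made up for illustration.

```shell
#!/bin/sh
# Sketch: treat ZooKeeper as unhealthy when it stops answering or when
# CLOSE-WAIT sockets pile up on its client port, even if the process exists.
# Assumed values, not from the original script: port 2181, threshold 100.
ZK_HOST="${ZK_HOST:-127.0.0.1}"
ZK_PORT="${ZK_PORT:-2181}"
MAX_CLOSE_WAIT="${MAX_CLOSE_WAIT:-100}"

# Reads `ss -tan` style output on stdin and prints the number of sockets
# in CLOSE-WAIT whose local address ends in :PORT (PORT is argument 1).
count_close_wait() {
    awk -v port="$1" '$1 == "CLOSE-WAIT" && $4 ~ (":" port "$") { n++ } END { print n + 0 }'
}

# Succeeds only if the server replies "imok" to "ruok" within 5 seconds.
zk_answers() {
    reply=$(echo ruok | nc -w 5 "$ZK_HOST" "$ZK_PORT" 2>/dev/null)
    [ "$reply" = "imok" ]
}

# Healthy means: the server answers AND CLOSE-WAIT count is under the limit.
zk_healthy() {
    zk_answers || return 1
    n=$(ss -tan | count_close_wait "$ZK_PORT")
    [ "$n" -lt "$MAX_CLOSE_WAIT" ]
}

# Check once a minute and restart when the health check fails.
watch_loop() {
    while true; do
        if ! zk_healthy; then
            echo ">>>>zookeeper is not healthy, restarting"
            # assumes zkServer.sh is in the current working directory
            sh zkServer.sh restart
        fi
        sleep 60
    done
}

# Loop only when explicitly requested, so the functions can be reused alone.
if [ "${ZK_WATCHDOG_RUN:-}" = "1" ]; then
    watch_loop
fi
```

Saved as, say, zk_watchdog.sh (a hypothetical name), it could be started with ZK_WATCHDOG_RUN=1 sh zk_watchdog.sh. A truly hung zk may not exit on a normal stop, so the restart step might need a forced kill before starting again; I have not verified that part.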