【Question title】: Zookeeper - Exception when following the leader java.lang.IllegalArgumentException
【Posted】: 2020-07-26 13:51:24
【Question】:

I am running Zookeeper 3.6.0 in replicated mode (3 zookeeper nodes in total) in an AWS EKS cluster, version 1.15. I am pulling the zookeeper:latest image from Docker Hub.

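Here is the zoo.cfg file for node 1 (prd-zoo1). The other nodes have a similar configuration, except for the last three entries, which specify the other zoo servers.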

zoo.cfg Node 1:
dataDir=/data
dataLogDir=/datalog
tickTime=2000
initLimit=5
syncLimit=2
autopurge.snapRetainCount=3
autopurge.purgeInterval=0
maxClientCnxns=60
standaloneEnabled=true
admin.enableServer=true
server.1=0.0.0.0:2888:3888;2181
server.2=prd-zoo2:2888:3888;prd-zoo2:2181
server.3=prd-zoo3:2888:3888;prd-zoo3:2181

zoo.cfg Node 2:
<same as node1>
server.1=prd-zoo1:2888:3888;prd-zoo1:2181
server.2=0.0.0.0:2888:3888;2181
server.3=prd-zoo3:2888:3888;prd-zoo3:2181

zoo.cfg Node 3:
<same as node1>
server.1=prd-zoo1:2888:3888;prd-zoo1:2181
server.2=prd-zoo2:2888:3888;prd-zoo2:2181
server.3=0.0.0.0:2888:3888;2181

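It appears the zoo nodes can communicate with each other and complete leader election. However, when I check the logs, I see a frequently recurring java.lang.IllegalArgumentException error. I have verified that each service has its endpoint IPs and ports: leader election, 3888, TCP; client, 2181, TCP; server, 2888, TCP.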

2020-04-13 17:20:17,793 [myid:1] - INFO  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):QuorumPeer@857] - Peer state changed: following
2020-04-13 17:20:17,793 [myid:1] - INFO  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):QuorumPeer@1453] - FOLLOWING
2020-04-13 17:20:17,793 [myid:1] - INFO  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):ZooKeeperServer@1246] - minSessionTimeout set to 4000
2020-04-13 17:20:17,793 [myid:1] - INFO  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):ZooKeeperServer@1255] - maxSessionTimeout set to 40000
2020-04-13 17:20:17,793 [myid:1] - INFO  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):ResponseCache@45] - Response cache size is initialized with value 400.
2020-04-13 17:20:17,793 [myid:1] - INFO  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):ResponseCache@45] - Response cache size is initialized with value 400.
2020-04-13 17:20:17,793 [myid:1] - INFO  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):RequestPathMetricsCollector@111] - zookeeper.pathStats.slotCapacity = 60
2020-04-13 17:20:17,793 [myid:1] - INFO  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):RequestPathMetricsCollector@112] - zookeeper.pathStats.slotDuration = 15
2020-04-13 17:20:17,793 [myid:1] - INFO  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):RequestPathMetricsCollector@113] - zookeeper.pathStats.maxDepth = 6
2020-04-13 17:20:17,793 [myid:1] - INFO  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):RequestPathMetricsCollector@114] - zookeeper.pathStats.initialDelay = 5
2020-04-13 17:20:17,793 [myid:1] - INFO  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):RequestPathMetricsCollector@115] - zookeeper.pathStats.delay = 5
2020-04-13 17:20:17,794 [myid:1] - INFO  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):RequestPathMetricsCollector@116] - zookeeper.pathStats.enabled = false
2020-04-13 17:20:17,794 [myid:1] - INFO  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):ZooKeeperServer@1470] - The max bytes for all large requests are set to 104857600
2020-04-13 17:20:17,794 [myid:1] - INFO  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):ZooKeeperServer@1484] - The large request threshold is set to -1
2020-04-13 17:20:17,794 [myid:1] - INFO  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):ZooKeeperServer@329] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 clientPortListenBacklog -1 datadir /datalog/version-2 snapdir /data/version-2
2020-04-13 17:20:17,794 [myid:1] - INFO  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):Follower@75] - FOLLOWING - LEADER ELECTION TOOK - 1381 MS
2020-04-13 17:20:17,794 [myid:1] - INFO  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):QuorumPeer@863] - Peer state changed: following - discovery
2020-04-13 17:20:18,595 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection$Messenger$WorkerReceiver@376] - Notification: my state:FOLLOWING; n.sid:3, n.state:LOOKING, n.leader:3, n.round:0x254, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2, n.config version:0x0
2020-04-13 17:20:18,795 [myid:1] - WARN  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):Follower@129] - Exception when following the leader
java.lang.IllegalArgumentException
    at java.base/java.util.concurrent.ThreadPoolExecutor.<init>(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.<init>(Unknown Source)
    at java.base/java.util.concurrent.Executors.newFixedThreadPool(Unknown Source)
    at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:275)
    at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:87)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1455)

2020-04-13 17:20:18,795 [myid:1] - INFO  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):Follower@292] - shutdown Follower
2020-04-13 17:20:18,795 [myid:1] - INFO  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):QuorumPeer@863] - Peer state changed: looking
2020-04-13 17:20:18,795 [myid:1] - WARN  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):QuorumPeer@1501] - PeerState set to LOOKING
2020-04-13 17:20:18,795 [myid:1] - INFO  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):QuorumPeer@1371] - LOOKING
2020-04-13 17:20:18,795 [myid:1] - INFO  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):FastLeaderElection@931] - New election. My id = 1, proposed zxid=0x0
2020-04-13 17:20:18,795 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection$Messenger$WorkerReceiver@376] - Notification: my state:LOOKING; n.sid:1, n.state:LOOKING, n.leader:1, n.round:0x254, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2, n.config version:0x0
2020-04-13 17:20:18,796 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection$Messenger$WorkerReceiver@376] - Notification: my state:LOOKING; n.sid:3, n.state:LOOKING, n.leader:3, n.round:0x254, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2, n.config version:0x0
2020-04-13 17:20:18,796 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection$Messenger$WorkerReceiver@376] - Notification: my state:LOOKING; n.sid:1, n.state:LOOKING, n.leader:3, n.round:0x254, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2, n.config version:0x0
2020-04-13 17:20:18,797 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection$Messenger$WorkerReceiver@376] - Notification: my state:LOOKING; n.sid:2, n.state:LEADING, n.leader:2, n.round:0x253, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2, n.config version:0x0
2020-04-13 17:20:18,797 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection$Messenger$WorkerReceiver@376] - Notification: my state:LOOKING; n.sid:2, n.state:LEADING, n.leader:2, n.round:0x253, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2, n.config version:0x0
2020-04-13 17:20:18,997 [myid:1] - INFO  [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):QuorumPeer@857] - Peer state changed: following

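It keeps looping: it completes leader election, fails with the exception when following the leader, shuts down, starts looking again, follows again...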

【Comments】:

    Tags: java kubernetes apache-zookeeper amazon-eks


    【Answer 1】:

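    0.0.0.0 is not a hostname. You evidently know the hostnames, since they are coded in all the other config files as prd-zoo1, prd-zoo2, and prd-zoo3, so put in the node's actual name rather than a meaningless IP address.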
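
To make the point about 0.0.0.0 concrete, here is a minimal sketch that flags server.N values whose host part is a wildcard ("any local") address. The class name WildcardCheck and the helper isWildcardHost are hypothetical, and the parsing assumes the host:port:port[;client] layout used in the question's zoo.cfg. Whether replacing the wildcard is enough to stop the exception is not confirmed by the thread.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class WildcardCheck {
    /** Returns true if the host part of a server.N value is a wildcard address such as 0.0.0.0. */
    static boolean isWildcardHost(String serverValue) {
        // Assumed layout: host:quorumPort:electionPort[;clientPort]
        String host = serverValue.split(":", 2)[0];
        // Only inspect IPv4-looking literals, so names like prd-zoo2 never trigger a DNS lookup.
        if (!host.matches("[0-9.]+")) {
            return false;
        }
        try {
            return InetAddress.getByName(host).isAnyLocalAddress();
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] entries = {
            "0.0.0.0:2888:3888;2181",
            "prd-zoo2:2888:3888;prd-zoo2:2181",
            "prd-zoo3:2888:3888;prd-zoo3:2181"
        };
        for (String entry : entries) {
            System.out.println(entry + " -> wildcard host: " + isWildcardHost(entry));
        }
    }
}
```

Run against node 1's three entries, only the first one is reported as a wildcard host.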

    【Comments】:

      【Answer 2】:

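      The EKS scripts were configured to pull the latest Zookeeper image from Docker Hub. A new 3.6.0 image had become available, and that is the image the pods pulled and used. This was the root cause of the problem. When I downgraded to version 3.5.7, it ran successfully, just like my other environments, with no configuration changes.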
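
The stack trace in the question ends in Executors.newFixedThreadPool, called from Learner.connectToLeader. The JDK documents that method as throwing IllegalArgumentException when the requested thread count is zero or negative. The logs do not show which value 3.6.0 passes, so a size of zero is an assumption here. The sketch below only reproduces the JDK behaviour; EmptyPoolDemo and poolCreationFails are hypothetical names.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class EmptyPoolDemo {
    /** Returns true if creating a fixed pool of the given size throws IllegalArgumentException. */
    static boolean poolCreationFails(int nThreads) {
        try {
            ExecutorService pool = Executors.newFixedThreadPool(nThreads);
            pool.shutdown(); // release the pool immediately; no tasks are submitted
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Assumption: the follower ends up asking for a pool of size 0.
        System.out.println("size 0 fails: " + poolCreationFails(0)); // prints "size 0 fails: true"
        System.out.println("size 1 fails: " + poolCreationFails(1)); // prints "size 1 fails: false"
    }
}
```

The exception carries no message in this case, which matches the bare java.lang.IllegalArgumentException line in the log.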

      【Comments】:
