【Posted】: 2019-06-07 11:02:11
【Question】:
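I have a Kafka cluster (3 machines, each running 1 ZooKeeper node and 1 broker). I use kafka_exporter to monitor consumer lag metrics, and it works fine under normal conditions. However, when I kill 1 broker, Prometheus can no longer scrape metrics from http://machine1:9308/metric (the kafka_exporter metrics endpoint): fetching the data takes about 1.5 minutes, so the scrape times out. If I then restart kafka_exporter, I see errors like this: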
Cannot get leader of topic __consumer_offsets partition 20: kafka server: In the middle of a leadership election, there is currently no leader for this partition and hence it is unavailable for writes
When I run the command kafka-topics.bat --describe --zookeeper machine1:2181,machine2:2181,machine3:2181 --topic __consumer_offsets, the result is:
Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:1 Configs:compression.type=producer,cleanup.policy=compact,segment.bytes=104857600
Topic: __consumer_offsets Partition: 0 Leader: -1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 1 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 49 Leader: 2 Replicas: 2 Isr: 2
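Is this a configuration error? How can I get the consumer lag in this situation? Is "Leader: -1" an error? If I shut down machine 1 permanently, will the cluster still work correctly?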
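For anyone trying to reproduce this, the affected partitions can be picked out of the describe output with a small script. The output above reports ReplicationFactor:1, and partition 0, whose only replica is on broker 1, is the one showing Leader: -1. Below is a minimal Python sketch, assuming the output has the same line layout as shown in the question; the function name `leaderless_partitions` and the regular expression are illustrative and not part of any Kafka tool.

```python
import re

# Sample lines copied from the describe output in the question.
DESCRIBE_OUTPUT = """\
Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:1 Configs:compression.type=producer,cleanup.policy=compact,segment.bytes=104857600
Topic: __consumer_offsets Partition: 0 Leader: -1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 1 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 49 Leader: 2 Replicas: 2 Isr: 2
"""

# Matches per-partition lines only; the summary line has "PartitionCount:",
# which does not match "Partition:" followed by a number.
PARTITION_RE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]+)"
)


def leaderless_partitions(describe_text):
    """Return (topic, partition, replicas) for each partition whose leader is -1."""
    found = []
    for line in describe_text.splitlines():
        match = PARTITION_RE.search(line)
        if match and int(match.group("leader")) == -1:
            replicas = [int(r) for r in match.group("replicas").split(",")]
            found.append((match.group("topic"), int(match.group("partition")), replicas))
    return found


if __name__ == "__main__":
    for topic, partition, replicas in leaderless_partitions(DESCRIBE_OUTPUT):
        print(f"{topic} partition {partition} has no leader (replicas: {replicas})")
```

In practice the text would come from the real kafka-topics.bat --describe output rather than the hard-coded sample.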
【Discussion】:
Tags: apache-kafka cluster-computing grafana prometheus