【Question title】:Kafka on Kubernetes - UNKNOWN_TOPIC_OR_PARTITION and LEADER_NOT_AVAILABLE error
【Posted】:2018-05-11 16:44:01
【Question】:

This is a follow-up to an earlier question. I managed to do the following:

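  1. Create a headless service for my 5-broker Kafka cluster, used for inter-broker communication
  2. Set up one service per broker
    1. Each service has an external IP
    2. Each service selects exactly one pod, e.g. the service "kafka-0-es" selects the pod "kafka-0"
  3. The pods correctly advertise their respective external IPs. I verified this by inspecting the data in ZooKeeper through its CLI.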
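
The setup in the list above implies two listeners per broker: one advertised on the pod's headless-service DNS name for inter-broker traffic, and one advertised on the external IP for clients. The following is a minimal Python sketch of how such settings could be derived per broker. The listener names INTERNAL and EXTERNAL, the PLAINTEXT protocol, the `cluster.local` domain, and the example IP are assumptions for illustration; the service name kafka-hs, the namespace, and the ports 29092 and 9093 appear elsewhere in this thread. It is not the poster's actual broker configuration.

```python
def broker_listener_config(ordinal, external_ip,
                           headless_service="kafka-hs",
                           namespace="whatever",
                           internal_port=29092,
                           external_port=9093):
    """Return Kafka listener settings for the StatefulSet pod kafka-<ordinal>."""
    pod = f"kafka-{ordinal}"
    # Per-pod DNS name provided by a headless service, assuming the
    # StatefulSet's serviceName is the headless service and the cluster
    # domain is the default "cluster.local".
    internal_host = f"{pod}.{headless_service}.{namespace}.svc.cluster.local"
    return {
        # Bind both listeners on all interfaces inside the pod.
        "listeners": (f"INTERNAL://0.0.0.0:{internal_port},"
                      f"EXTERNAL://0.0.0.0:{external_port}"),
        # Brokers reach each other via the pod DNS name, clients via the external IP.
        "advertised.listeners": (f"INTERNAL://{internal_host}:{internal_port},"
                                 f"EXTERNAL://{external_ip}:{external_port}"),
        # Assumption: no TLS or SASL on either listener.
        "listener.security.protocol.map": "INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT",
        "inter.broker.listener.name": "INTERNAL",
    }


if __name__ == "__main__":
    # 203.0.113.10 is a documentation-only placeholder IP.
    for key, value in broker_listener_config(0, "203.0.113.10").items():
        print(f"{key}={value}")
```

Keeping the inter-broker listener on the pod DNS name means replication traffic never depends on the external IPs, which is the separation the per-broker services are meant to provide.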

I created a topic test-topic using zkCli and verified that it was created. After that, I started the Kafka console producer.

.\kafka-console-producer.bat --broker-list EXTERNAL_IP_1:9093,EXTERNAL_IP_2:9093,EXTERNAL_IP_3:9093,EXTERNAL_IP_4:9093,EXTERNAL_IP_5:9093 --topic test-topic --property parse.key=true --property key.separator=:
>afkjdshasdkfjhsdkjsf:128379127893123
>[2018-05-09 17:35:51,622] WARN [Producer clientId=console-producer] Got error produce response with correlation id 9 on topic-partition test-topic-0, retrying (2 attempts left). Error: UNKNOWN_TOPIC_OR_PARTITION (org.apache.kafka.clients.producer.internals.Sender)
[2018-05-09 17:35:51,623] WARN [Producer clientId=console-producer] Received unknown topic or partition error in produce request on partition test-topic-0. The topic/partition may not exist or the user may not have Describe access to it (org.apache.kafka.clients.producer.internals.Sender)
[2018-05-09 17:35:51,649] WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 10 : {test-topic=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2018-05-09 17:35:51,720] WARN [Producer clientId=console-producer] Got error produce response with correlation id 11 on topic-partition test-topic-0, retrying (1 attempts left). Error: UNKNOWN_TOPIC_OR_PARTITION (org.apache.kafka.clients.producer.internals.Sender)
[2018-05-09 17:35:51,720] WARN [Producer clientId=console-producer] Received unknown topic or partition error in produce request on partition test-topic-0. The topic/partition may not exist or the user may not have Describe access to it (org.apache.kafka.clients.producer.internals.Sender)
[2018-05-09 17:35:51,773] WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 12 : {test-topic=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2018-05-09 17:35:51,823] WARN [Producer clientId=console-producer] Got error produce response with correlation id 13 on topic-partition test-topic-0, retrying (0 attempts left). Error: UNKNOWN_TOPIC_OR_PARTITION (org.apache.kafka.clients.producer.internals.Sender)
[2018-05-09 17:35:51,823] WARN [Producer clientId=console-producer] Received unknown topic or partition error in produce request on partition test-topic-0. The topic/partition may not exist or the user may not have Describe access to it (org.apache.kafka.clients.producer.internals.Sender)
[2018-05-09 17:35:51,913] WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 14 : {test-topic=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2018-05-09 17:35:51,936] ERROR Error when sending message to topic test-topic with key: 20 bytes, value: 15 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition.
[2018-05-09 17:35:51,945] WARN [Producer clientId=console-producer] Received unknown topic or partition error in produce request on partition test-topic-0. The topic/partition may not exist or the user may not have Describe access to it (org.apache.kafka.clients.producer.internals.Sender)
[2018-05-09 17:35:52,034] WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 16 : {test-topic=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2018-05-09 17:35:52,161] WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 20 : {test-topic=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2018-05-09 17:40:52,288] WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 25 : {test-topic=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)

According to ZooKeeper, my Kafka broker "kafka-2" is the leader for this topic:

get /kafka/brokers/topics/test-topic/partitions/0/state

{"controller_epoch":5,"leader":2,"version":1,"leader_epoch":0,"isr":[2,1]} 

But the pod kafka-2 throws an error in its log:

[2018-05-09 15:21:02,524] ERROR [ReplicaFetcherThread-0-2], Error for partition [test-topic,0] to broker 2:org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition. (kafka.server.ReplicaFetcherThread)
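
The partition state that ZooKeeper returned above can also be read programmatically when checking several partitions. The sketch below parses that JSON and reports the leader and whether it is in the in-sync replica set. The helper name `summarize_partition_state` is hypothetical, and the mapping of broker id N to pod kafka-N is an assumption based on how the question names its brokers.

```python
import json


def summarize_partition_state(raw_state):
    """Summarize the JSON stored under .../partitions/<n>/state in ZooKeeper."""
    state = json.loads(raw_state)
    leader = state["leader"]
    isr = state["isr"]
    return {
        "leader": leader,
        # Assumption: broker id N runs in the StatefulSet pod kafka-N.
        "leader_pod": f"kafka-{leader}",
        "isr": isr,
        "leader_in_isr": leader in isr,
    }


if __name__ == "__main__":
    # The state string quoted in the question.
    raw = '{"controller_epoch":5,"leader":2,"version":1,"leader_epoch":0,"isr":[2,1]}'
    print(summarize_partition_state(raw))
```

A leader that is listed in the ISR, as here, suggests the metadata in ZooKeeper is consistent, so the errors are more likely to come from how brokers and clients reach each other than from the topic itself.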

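I am not sure why this happens; the configuration looks fine to me. What else am I missing to get my Kafka cluster running on Kubernetes?

Note that I have also tried wiping my cluster completely (scale down the Kafka cluster, delete the Kafka storage, scale down the ZK cluster, delete the ZK storage, scale up ZK, scale up Kafka), to no avail.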

【Comments】:

    Tags: apache-kafka kubernetes


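    【Solution 1】:

    I just fixed it. The problem was that my headless service contained both the internal and the external port.

    Now my headless service contains only the internal port: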

    apiVersion: v1
    kind: Service
    metadata:
      name: kafka-hs
      labels:
        app: kafka
    spec:
      ports:
      - port: 29092
        name: server
      clusterIP: None
      selector:
        app: kafka
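
The fix described here amounts to keeping the headless service limited to the internal port. The sketch below shows one way to check a Service manifest for ports that should not be there. The helper `unexpected_ports` is hypothetical, and the "before" manifest with both ports is reconstructed from the answer's description for illustration; it is not quoted from the poster.

```python
def unexpected_ports(service, allowed_ports):
    """Return ports exposed by a Service manifest that are not in allowed_ports."""
    exposed = [entry["port"] for entry in service["spec"].get("ports", [])]
    return [port for port in exposed if port not in allowed_ports]


def headless_service(ports):
    """Build a headless Service manifest like kafka-hs with the given ports."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": "kafka-hs", "labels": {"app": "kafka"}},
        "spec": {
            "clusterIP": "None",  # this value is what makes the service headless
            "ports": [{"port": port} for port in ports],
            "selector": {"app": "kafka"},
        },
    }


if __name__ == "__main__":
    before = headless_service([29092, 9093])  # assumed faulty variant
    after = headless_service([29092])         # matches the manifest in the answer
    print(unexpected_ports(before, allowed_ports={29092}))
    print(unexpected_ports(after, allowed_ports={29092}))
```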
    

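    Each of my per-pod services, which expose the external IPs, contains the external port (note that a Red Hat OpenShift script takes care of assigning the external IPs to these services; that part is not included in the service definition):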

    apiVersion: v1
    kind: Service
    metadata:
      name: kafka-es-4
      labels:
        app: kafka
      namespace: whatever
    spec:
      ports:
      - port: 9093
        name: kafka-port
        protocol: TCP
      selector:
        statefulset.kubernetes.io/pod-name: kafka-4
        app: kafka
      type: LoadBalancer
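
Since the five per-pod services differ only in the broker ordinal, they can be generated rather than written by hand. The following sketch builds the same manifest for each broker and prints them as a JSON list. The function names and the broker count default are illustrative assumptions; the field values mirror the kafka-es-4 definition in this answer, and the external IP assignment is still left to the OpenShift script mentioned there.

```python
import json


def per_pod_service(ordinal, namespace="whatever", external_port=9093):
    """Build a LoadBalancer Service that selects only the pod kafka-<ordinal>."""
    pod = f"kafka-{ordinal}"
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": f"kafka-es-{ordinal}",
            "labels": {"app": "kafka"},
            "namespace": namespace,
        },
        "spec": {
            "type": "LoadBalancer",
            "ports": [{"port": external_port, "name": "kafka-port", "protocol": "TCP"}],
            # The pod-name label is set by the StatefulSet controller on each pod.
            "selector": {
                "statefulset.kubernetes.io/pod-name": pod,
                "app": "kafka",
            },
        },
    }


def service_list(broker_count=5):
    """Wrap one Service per broker in a v1 List object."""
    return {
        "apiVersion": "v1",
        "kind": "List",
        "items": [per_pod_service(i) for i in range(broker_count)],
    }


if __name__ == "__main__":
    print(json.dumps(service_list(), indent=2))
```

The printed JSON could, for example, be piped to `kubectl apply -f -`, since the API accepts JSON as well as YAML.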
    
