【Question title】: Kafka Spout did not read offsets from broker, only from Zookeeper after a certain number of messages read
【Posted】: 2018-06-24 22:19:56
【Question】:

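I am having a problem with Apache Storm and Kafka. KafkaSpout reads messages from Kafka normally, but after about 30,000 messages failed tuples start to appear and the bolt no longer receives any messages.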

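I checked worker.log and found that when the topology starts, it tries to read the partition information from Zookeeper and then from the broker, as you can see here (offset 9539):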

Read partition information from: /twitter_streaming_tweet_test/STREAMING_TWEET_WRITER_SPOUT/partition_2  --> {"partition":2,"offset":9539,"topology":{"name":"DATA_WRITER_TOPOLOGY","id":"DATA_WRITER_TOPOLOGY-67-1516077955"},"topic":"twitter_streaming_tweet_test","broker":{"port":9092,"host":"zoo1"}}

2018-01-16 17:05:57.510 o.a.s.k.PartitionManager Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Read last commit offset from zookeeper: 9539; old topology_id: DATA_WRITER_TOPOLOGY-67-1516077955 - new topology_id: DATA_WRITER_TOPOLOGY-68-1516089922
2018-01-16 17:05:57.514 o.a.s.k.PartitionManager Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Starting Kafka zoo1 Partition{host=zoo1:9092, topic=twitter_streaming_tweet_test, partition=2} from offset 9539
2018-01-16 17:05:57.518 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Finished refreshing
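The state that the spout stores in Zookeeper is plain JSON, so the committed offset can be checked directly against the log. Below is a minimal Python sketch, assuming the payload is copied verbatim from the "Read partition information from" line above; `summarize_spout_state` is a hypothetical helper for illustration and not part of Storm.

```python
import json

# Payload copied from the "Read partition information from" log line above.
state_json = (
    '{"partition":2,"offset":9539,'
    '"topology":{"name":"DATA_WRITER_TOPOLOGY","id":"DATA_WRITER_TOPOLOGY-67-1516077955"},'
    '"topic":"twitter_streaming_tweet_test",'
    '"broker":{"port":9092,"host":"zoo1"}}'
)


def summarize_spout_state(payload: str) -> dict:
    """Pull out the fields that describe where the spout last committed."""
    state = json.loads(payload)
    return {
        "topic": state["topic"],
        "partition": state["partition"],
        "committed_offset": state["offset"],
        # The topology id that wrote this state (here the previous run, -67-).
        "written_by_topology": state["topology"]["id"],
        "broker": f'{state["broker"]["host"]}:{state["broker"]["port"]}',
    }


summary = summarize_spout_state(state_json)
print(summary)
```

The `committed_offset` value matches the offset that PartitionManager reports when it starts, which suggests the spout resumed from the Zookeeper state written by the previous topology run.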

The topology then runs normally until about 30,000 messages:

2018-01-16 17:06:39.732 TWLogger Thread-7-STREAMING_TWEET_WRITER_BOLT-executor[3 3] [INFO] Tweet ID 952850493570654209 was saved to database

2018-01-16 17:06:39.739 TWLogger Thread-9-STREAMING_TWEET_WRITER_BOLT-executor[6 6] [INFO] Tweet ID 952850099335348224 was saved to database
2018-01-16 17:06:39.742 TWLogger Thread-7-STREAMING_TWEET_WRITER_BOLT-executor[3 3] [INFO] Tweet ID 952850787981393920 was saved to database
2018-01-16 17:06:39.753 TWLogger Thread-7-STREAMING_TWEET_WRITER_BOLT-executor[3 3] [INFO] Tweet ID 952850152573685760 was saved to database
2018-01-16 17:06:39.754 TWLogger Thread-9-STREAMING_TWEET_WRITER_BOLT-executor[6 6] [INFO] Tweet ID 952850099578654721 was saved to database
2018-01-16 17:06:39.763 TWLogger Thread-7-STREAMING_TWEET_WRITER_BOLT-executor[3 3] [INFO] Tweet ID 952850153173524481 was saved to database
2018-01-16 17:06:39.768 TWLogger Thread-9-STREAMING_TWEET_WRITER_BOLT-executor[6 6] [INFO] Tweet ID 952850099989704705 was saved to database
2018-01-16 17:06:39.776 TWLogger Thread-7-STREAMING_TWEET_WRITER_BOLT-executor[3 3] [INFO] Tweet ID 952850153232154624 was saved to database
2018-01-16 17:06:39.779 TWLogger Thread-9-STREAMING_TWEET_WRITER_BOLT-executor[6 6] [INFO] Tweet ID 952850758289956864 was saved to database
2018-01-16 17:06:39.787 TWLogger Thread-7-STREAMING_TWEET_WRITER_BOLT-executor[3 3] [INFO] Tweet ID 952850154436018176 was saved to database
2018-01-16 17:07:56.106 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Refreshing partition manager connections
2018-01-16 17:07:56.117 o.a.s.k.DynamicBrokersReader Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Read partition info from zookeeper: GlobalPartitionInformation{topic=twitter_streaming_tweet_test, partitionMap={0=zoo2:9092, 1=zoo3:9092, 2=zoo1:9092}}
2018-01-16 17:07:56.117 o.a.s.k.KafkaUtils Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] assigned [Partition{host=zoo1:9092, topic=twitter_streaming_tweet_test, partition=2}]
2018-01-16 17:07:56.117 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Deleted partition managers: []
2018-01-16 17:07:56.117 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] New partition managers: []
2018-01-16 17:07:56.117 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Finished refreshing
2018-01-16 17:09:54.150 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Refreshing partition manager connections
2018-01-16 17:09:54.160 o.a.s.k.DynamicBrokersReader Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Read partition info from zookeeper: GlobalPartitionInformation{topic=twitter_streaming_tweet_test, partitionMap={0=zoo2:9092, 1=zoo3:9092, 2=zoo1:9092}}
2018-01-16 17:09:54.160 o.a.s.k.KafkaUtils Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] assigned [Partition{host=zoo1:9092, topic=twitter_streaming_tweet_test, partition=2}]
2018-01-16 17:09:54.160 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Deleted partition managers: []
2018-01-16 17:09:54.160 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] New partition managers: []
2018-01-16 17:09:54.160 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Finished refreshing
2018-01-16 17:10:56.108 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Refreshing partition manager connections
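To quantify how long the bolts stay silent, the timestamps in the excerpt can be compared pairwise. This is only a rough sketch: the timestamps are copied by hand from the log above, `find_gaps` is a hypothetical helper, and the 30-second threshold is an arbitrary assumption chosen for illustration.

```python
from datetime import datetime

# Timestamps copied from the worker.log excerpt above.
log_timestamps = [
    "2018-01-16 17:06:39.787",  # last "was saved to database" entry
    "2018-01-16 17:07:56.106",  # first ZkCoordinator refresh after it
    "2018-01-16 17:09:54.150",  # second refresh
    "2018-01-16 17:10:56.108",  # third refresh
]


def find_gaps(timestamps, threshold_secs=30.0):
    """Return (earlier, later, seconds) for consecutive entries that are
    more than threshold_secs apart."""
    parsed = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S.%f") for t in timestamps]
    gaps = []
    for earlier, later in zip(parsed, parsed[1:]):
        delta = (later - earlier).total_seconds()
        if delta > threshold_secs:
            gaps.append((earlier, later, delta))
    return gaps


gaps = find_gaps(log_timestamps)
for earlier, later, delta in gaps:
    print(f"{earlier.time()} -> {later.time()}: {delta:.3f} s without bolt output")
```

In this excerpt the bolt writes several tweets within a few tens of milliseconds and then logs nothing for more than a minute, while the spout's coordinator keeps refreshing.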

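The tweets are saved normally, then the Kafka spout tries to read partition information from Zookeeper but finds nothing, so no tuples are processed and the topology gets stuck. Can anyone help me solve this problem? Thank you very much.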

【Comments】:

    Tags: apache-kafka apache-storm kafka-consumer-api


    【Solution 1】:

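    Can you check your max.spout.pending value? In general, if it is set to a very high value, failed tuples will eventually show up in the Storm statistics after a while, because with a very high max.spout.pending the messages time out. If you can post the Storm statistics for the spouts/bolts together with the max.spout.pending value, it will help to understand the problem.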
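The reasoning above can be turned into a back-of-the-envelope check. This is a rough sketch under stated assumptions, not a model of Storm internals: it assumes the bolts drain tuples at a steady rate, that the pending limit is fully used, and that the message timeout is 30 seconds (to my knowledge the default of `topology.message.timeout.secs`; the pending limit corresponds to `topology.max.spout.pending`). The function names and all numbers are hypothetical.

```python
def worst_case_wait_secs(max_spout_pending: int, tuples_per_sec: float) -> float:
    """Approximate time until the last of max_spout_pending in-flight tuples
    is acked, if the bolts process tuples_per_sec."""
    return max_spout_pending / tuples_per_sec


def will_time_out(max_spout_pending: int, tuples_per_sec: float,
                  message_timeout_secs: float = 30.0) -> bool:
    """True if the last in-flight tuple is expected to exceed the timeout."""
    return worst_case_wait_secs(max_spout_pending, tuples_per_sec) > message_timeout_secs


# Hypothetical numbers: bolts that persist about 200 tweets per second.
print(will_time_out(10_000, 200.0))  # 10,000 / 200 = 50 s, longer than 30 s
print(will_time_out(1_000, 200.0))   # 1,000 / 200 = 5 s, within 30 s
```

If the estimate exceeds the timeout, lowering max.spout.pending or raising the message timeout are the two knobs to try, and the Storm statistics for the spout and bolts give the actual throughput to plug in.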

    【Comments】:
