【Question title】: Flume transactions to HBase failing
【Posted】: 2025-12-17 08:25:01
【Question description】:

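I have a Flume agent that writes tweets to an HBase sink.

After a few seconds, transactions to the sink start failing, and every 8-10 seconds I get these error messages in the Flume agent log telling me that the transaction to HBase failed.

The odd thing is that some tweets still get through and end up in the HBase table. What could be causing this? This is running on a single-node Cloudera Quickstart VM; could it be a resource issue?

Here is the agent log: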

9:20:44.618 PM  ERROR   org.apache.flume.SinkRunner     

Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: Could not write events to Hbase. Transaction failed, and rolled back.
    at org.apache.flume.sink.hbase.AsyncHBaseSink.process(AsyncHBaseSink.java:245)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:662)

9:20:53.883 PM  ERROR   org.apache.flume.SinkRunner     

Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: Could not write events to Hbase. Transaction failed, and rolled back.
    at org.apache.flume.sink.hbase.AsyncHBaseSink.process(AsyncHBaseSink.java:245)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:662)

These are some odd entries from the debug log; maybe they are related?

2014-03-06 09:39:12,069 DEBUG org.apache.zookeeper.client.ZooKeeperSaslClient: Could not retrieve login configuration: java.lang.SecurityException: Unable to locate a login configuration

2014-03-06 09:39:12,298 DEBUG org.apache.zookeeper.ClientCnxn: An exception was thrown while closing send thread for session 0x144965080900029 : Unable to read additional data from server sessionid 0x144965080900029, likely server has closed socket

Here is my agent configuration:

TwitterAgent.sinks.HBaseTweet.channel = MemChannel
TwitterAgent.sinks.HBaseTweet.type = org.apache.flume.sink.hbase.AsyncHBaseSink
TwitterAgent.sinks.HBaseTweet.table = tweets
TwitterAgent.sinks.HBaseTweet.columnFamily = tweet
TwitterAgent.sinks.HBaseTweet.batchSize = 100
TwitterAgent.sinks.HBaseTweet.serializer = flume_hdfs.hbase.util.AsyncHbaseTwitterEventSerializer 
TwitterAgent.sinks.HBaseTweet.serializer.columns = tweet:id,tweet:created_at,tweet:source,tweet:favourited,tweet:text
TwitterAgent.sinks.HBaseTweet.serializer.delimiter = \\t

TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 200
TwitterAgent.channels.MemChannel.transactionCapacity = 100

Some metrics written to the log when the agent is stopped may be of interest:

Component type: CHANNEL, name: MemChannel stopped

Shutdown Metric for type: CHANNEL, name: MemChannel. channel.start.time == 1394093630078

Shutdown Metric for type: CHANNEL, name: MemChannel. channel.stop.time == 1394093894804

Shutdown Metric for type: CHANNEL, name: MemChannel. channel.capacity == 200

Shutdown Metric for type: CHANNEL, name: MemChannel. channel.current.size == 125

Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.put.attempt == 220

Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.put.success == 209

Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.take.attempt == 3059

Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.take.success == 9

Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: Could not write events to Hbase. Transaction failed, and rolled back.
    at org.apache.flume.sink.hbase.AsyncHBaseSink.process(AsyncHBaseSink.java:245)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:662)

Component type: SINK, name: HBaseTweet stopped

Shutdown Metric for type: SINK, name: HBaseTweet. sink.start.time == 1394093630407

Shutdown Metric for type: SINK, name: HBaseTweet. sink.stop.time == 1394093894833

Shutdown Metric for type: SINK, name: HBaseTweet. sink.batch.complete == 27

Shutdown Metric for type: SINK, name: HBaseTweet. sink.batch.empty == 0

Shutdown Metric for type: SINK, name: HBaseTweet. sink.batch.underflow == 7

Shutdown Metric for type: SINK, name: HBaseTweet. sink.connection.closed.count == 1

Shutdown Metric for type: SINK, name: HBaseTweet. sink.connection.creation.count == 1

Shutdown Metric for type: SINK, name: HBaseTweet. sink.connection.failed.count == 0

Shutdown Metric for type: SINK, name: HBaseTweet. sink.event.drain.attempt == 3053

Shutdown Metric for type: SINK, name: HBaseTweet. sink.event.drain.sucess == 9

HBase RegionServer error:

2014-03-08 09:37:44,371 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: 
org.apache.hadoop.hbase.regionserver.NoSuchColumnFamilyException: Column family retweet does not exist in region tweets,,1394029330397.953f602dd0790637df8106720396f219. in table 'tweets', {NAME => 'entities', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', ENCODE_ON_DISK => 'true', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'retweeted_status', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', ENCODE_ON_DISK => 'true', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'tweet', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', ENCODE_ON_DISK => 'true', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'user', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', ENCODE_ON_DISK => 'true', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
    at org.apache.hadoop.hbase.regionserver.HRegion.checkFamily(HRegion.java:5475)
    at org.apache.hadoop.hbase.regionserver.HRegion.checkFamilies(HRegion.java:3022)
    at org.apache.hadoop.hbase.regionserver.HRegion.internalPut(HRegion.java:2900)
    at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:2083)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:2239)
    at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:323)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1428)

【Comments】:

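  • I'm not very familiar with HBase, but could you check the HBase logs for any exceptions?
  • I found something in the RegionServer log and added it to my question. I guess this (github.com/AronMacDonald/Twitter_Hbase_Impala) isn't as stable as I'd hoped :)
  • Well, according to the error message, there seems to be a schema mismatch. Something is expecting a retweet column family in the tweets table. I can't tell where that comes from, because you don't mention it anywhere in your sink configuration. In any case, if you look at the source code of the GitHub project you linked, it mentions "retweet" in several places, while the column family is actually named retweeted_status. Maybe that is a bug in the source code. If you can, try changing it and recompiling the project to see whether the error goes away.
  • Hi Daniel, changing the column family name to "retweet" did fix the error. If you post this as an answer, I can mark it as the accepted answer. Thanks!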

Tags: hadoop hbase cloudera apache-zookeeper flume


【Solution 1】:

The error message in the HBase log indicates a schema mismatch: the agent expects a column family named retweet, whereas the schema actually defines retweeted_status.
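To make the mismatch easy to spot in a long RegionServer log line, the missing family and the families the table actually defines can be pulled out of the exception text. The following is only a minimal sketch: the function name `diagnose_missing_family` is made up for illustration, and `sample` is an abbreviated copy of the log line from the question (most table attributes are omitted).

```python
import re


def diagnose_missing_family(message):
    """Parse a NoSuchColumnFamilyException message.

    Returns (missing_family, existing_families); missing_family is None
    if the message does not match the expected wording.
    """
    missing = re.search(r"Column family (\S+) does not exist", message)
    existing = re.findall(r"\{NAME => '([^']+)'", message)
    return (missing.group(1) if missing else None, existing)


# Abbreviated version of the RegionServer error from the question.
sample = (
    "org.apache.hadoop.hbase.regionserver.NoSuchColumnFamilyException: "
    "Column family retweet does not exist in region "
    "tweets,,1394029330397.953f602dd0790637df8106720396f219. in table 'tweets', "
    "{NAME => 'entities', VERSIONS => '3'}, "
    "{NAME => 'retweeted_status', VERSIONS => '3'}, "
    "{NAME => 'tweet', VERSIONS => '3'}, "
    "{NAME => 'user', VERSIONS => '3'}"
)

missing, existing = diagnose_missing_family(sample)
print("missing family:", missing)
print("families in table:", existing)
```

For this log line the sketch reports retweet as missing and lists entities, retweeted_status, tweet and user as the families present in the table.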

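The fix is either to recompile the agent so that it uses the correct column family name, or to change the schema so that it uses the name the agent expects. I don't know which fix is more appropriate. If you defined this schema yourself, you can most likely just change the column family name. However, if the schema is defined externally (for example, by a script or by following specific instructions from somewhere), renaming the column family may break other things that depend on the name retweeted_status. In that case, the Twitter_HBase_Impala source code should be fixed to use the correct name.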
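Whichever side is changed, a small pre-flight check can catch this kind of mismatch before the agent is started. The sketch below is an assumption-laden illustration, not part of Flume or HBase: `unknown_families` is a hypothetical helper, the `columns` list is invented to stand in for whatever family:qualifier pairs the serializer writes, and `table_families` is taken from the RegionServer error in the question.

```python
def unknown_families(columns, table_families):
    """Return families used in 'family:qualifier' strings that the table lacks."""
    used = {column.split(":", 1)[0] for column in columns}
    return sorted(used - set(table_families))


# Families reported by the RegionServer error in the question.
table_families = ["entities", "retweeted_status", "tweet", "user"]

# Hypothetical columns written by the serializer (illustration only).
columns = ["tweet:id", "tweet:text", "retweet:id"]

print(unknown_families(columns, table_families))
```

An empty result means every family the serializer writes to exists in the table; any name that is returned would trigger the NoSuchColumnFamilyException shown in the question.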
