【Question title】: Flume HDFS Sink write error "no protocol: value"
【Posted】: 2019-07-15 19:54:38
【Question】:

I get the error shown below when I try to run a Flume job. I am running it on a Cloudera setup.

  • Kafka is the source.
  • Morphline is used as the interceptor, which creates Avro records from the events.
  • The sink is HDFS.

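The exact same files (morphline, Avro schema, Flume configuration, and so on) work in the test environment, but in another environment the job throws this error.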

2019-07-15 14:24:17,669 WARN org.apache.flume.sink.hdfs.BucketWriter: Caught IOException writing to HDFSWriter (no protocol: value). Closing file (hdfs://8.8.8.8:8020/user/hive/warehouse/folder/folder/FlumeData.1563162656585.tmp) and rethrowing exception.
2019-07-15 14:24:17,670 INFO org.apache.flume.sink.hdfs.BucketWriter: Closing hdfs://8.8.8.8:8020/user/hive/warehouse/folder/folder/FlumeData.1563162656585.tmp
2019-07-15 14:24:17,670 ERROR org.apache.flume.sink.hdfs.HDFSEventSink: process failed
java.lang.NullPointerException
        at org.apache.flume.sink.hdfs.AvroEventSerializer.flush(AvroEventSerializer.java:187)
        at org.apache.flume.sink.hdfs.HDFSDataStream.close(HDFSDataStream.java:131)
        at org.apache.flume.sink.hdfs.BucketWriter$3.call(BucketWriter.java:327)
        at org.apache.flume.sink.hdfs.BucketWriter$3.call(BucketWriter.java:323)
        at org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:701)
        at org.apache.flume.auth.SimpleAuthenticator.execute(SimpleAuthenticator.java:50)
        at org.apache.flume.sink.hdfs.BucketWriter$9.call(BucketWriter.java:698)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
2019-07-15 14:24:17,671 ERROR org.apache.flume.SinkRunner: Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: java.lang.NullPointerException
        at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:451)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
        at org.apache.flume.sink.hdfs.AvroEventSerializer.flush(AvroEventSerializer.java:187)
        at org.apache.flume.sink.hdfs.HDFSDataStream.close(HDFSDataStream.java:131)
        at org.apache.flume.sink.hdfs.BucketWriter$3.call(BucketWriter.java:327)
        at org.apache.flume.sink.hdfs.BucketWriter$3.call(BucketWriter.java:323)
        at org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:701)
        at org.apache.flume.auth.SimpleAuthenticator.execute(SimpleAuthenticator.java:50)
        at org.apache.flume.sink.hdfs.BucketWriter$9.call(BucketWriter.java:698)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        ... 1 more

I was able to find the relevant code in Flume: https://github.com/apache/flume/blob/trunk/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java (line 602)

// write the event
try {
  sinkCounter.incrementEventDrainAttemptCount();
  callWithTimeout(new CallRunner<Void>() {
    @Override
    public Void call() throws Exception {
      writer.append(event); // could block
      return null;
    }
  });
} catch (IOException e) {
  LOG.warn("Caught IOException writing to HDFSWriter ({}). Closing file (" +
      bucketPath + ") and rethrowing exception.",
      e.getMessage());
  close(true);
  throw e;
}

Error: Caught IOException writing to HDFSWriter (no protocol: value). Closing file

I cannot work out what the error "no protocol: value" means.

I could not find any reference to this error in any context related to Flume or HDFS.

【Comments】:

    Tags: hadoop hdfs flume


    【Solution 1】:

    The interceptor configuration was missing the protocol. Adding "file:/" in the Flume configuration file resolved the issue.

    Reference for a similar issue: https://community.cloudera.com/t5/Data-Ingestion-Integration/Flume-HDFS-sink-error-quot-unknown-protocol-hdfs-quot/td-p/19344
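
The wording "no protocol: ..." matches the message that `java.net.URL` produces when it is handed a string with no scheme. `MalformedURLException` is a subclass of `IOException`, which is consistent with the `catch (IOException e)` branch quoted in the question. The sketch below only demonstrates that JDK behavior; the class name `NoProtocolDemo`, the helper `urlError`, and the path `/tmp/schema.avsc` are made up for illustration, and it is an assumption that the sink parses the configured location this way.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class NoProtocolDemo {

    // Returns null when the string parses as a URL,
    // otherwise the message of the MalformedURLException.
    static String urlError(String spec) {
        try {
            new URL(spec);
            return null;
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // No scheme: prints "no protocol: value"
        System.out.println(urlError("value"));
        // A bare path has no scheme either: prints "no protocol: /tmp/schema.avsc"
        System.out.println(urlError("/tmp/schema.avsc"));
        // With the "file:" scheme the string parses: prints "null"
        System.out.println(urlError("file:/tmp/schema.avsc"));
    }
}
```

If the value in the Flume configuration is a plain path or a placeholder string, prefixing it with a scheme such as "file:/" is what lets it parse, which lines up with the fix described above.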

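    【Comments】:

    • No, I changed the IP and folder references to dummy values.
    • I found a similar issue in the Cloudera community: [Flume HDFS sink error: "unknown protocol: hdfs"](community.cloudera.com/t5/Data-Ingestion-Integration/…)
    • Yes, it was the same error in my configuration. Thanks! :)
    • My pleasure.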