【Question Title】: Flume not writing logs to HDFS
【Posted】: 2014-12-18 12:27:39
【Question】:

So I configured Flume to write my apache2 access logs to HDFS. As far as I can tell from Flume's own logs, the configuration is all correct, but I have no idea why it still isn't writing to HDFS. Here is my Flume configuration file:

#agent and component of agent
search.sources = so
search.sinks = si
search.channels = sc

# Configure a channel that buffers events in memory:
search.channels.sc.type = memory
search.channels.sc.capacity = 20000
search.channels.sc.transactionCapacity = 100


# Configure the source:
search.sources.so.channels = sc
search.sources.so.type = exec
search.sources.so.command = tail -F /var/log/apache2/access.log

# Describe the sink:
search.sinks.si.channel = sc
search.sinks.si.type = hdfs
search.sinks.si.hdfs.path = hdfs://localhost:9000/flumelogs/
search.sinks.si.hdfs.writeFormat = Text
search.sinks.si.hdfs.fileType = DataStream
search.sinks.si.hdfs.rollSize = 0
search.sinks.si.hdfs.rollCount = 10000
search.sinks.si.hdfs.batchSize = 1000
search.sinks.si.rollInterval = 1

Here is my Flume log:

14/12/18 17:47:56 INFO node.AbstractConfigurationProvider: Creating channels
14/12/18 17:47:56 INFO channel.DefaultChannelFactory: Creating instance of channel sc   type memory
14/12/18 17:47:56 INFO node.AbstractConfigurationProvider: Created channel sc
14/12/18 17:47:56 INFO source.DefaultSourceFactory: Creating instance of source so, type exec
14/12/18 17:47:56 INFO sink.DefaultSinkFactory: Creating instance of sink: si, type: hdfs
14/12/18 17:47:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/12/18 17:47:56 INFO hdfs.HDFSEventSink: Hadoop Security enabled: false
14/12/18 17:47:56 INFO node.AbstractConfigurationProvider: Channel sc connected to [so, si]
14/12/18 17:47:56 INFO node.Application: Starting new configuration:{ sourceRunners:{so=EventDrivenSourceRunner: { source:org.apache.flume.source.ExecSource{name:so,state:IDLE} }} sinkRunners:{si=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@3de76481 counterGroup:{ name:null counters:{} } }} channels:{sc=org.apache.flume.channel.MemoryChannel{name: sc}} }
14/12/18 17:47:56 INFO node.Application: Starting Channel sc
14/12/18 17:47:56 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: sc: Successfully registered new MBean.
14/12/18 17:47:56 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: sc started
14/12/18 17:47:56 INFO node.Application: Starting Sink si
14/12/18 17:47:56 INFO node.Application: Starting Source so
14/12/18 17:47:56 INFO source.ExecSource: Exec source starting with command:tail -F /var/log/apache2/access.log
14/12/18 17:47:56 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SINK, name: si: Successfully registered new MBean.
14/12/18 17:47:56 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: si started
14/12/18 17:47:56 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SOURCE, name: so: Successfully registered new MBean.
14/12/18 17:47:56 INFO instrumentation.MonitoredCounterGroup: Component type: SOURCE, name: so started

Here is the command I use to start Flume:

flume-ng agent -n search -c conf -f ../conf/flume-conf-search 

I have already created the path in HDFS:

       hadoop fs -mkdir hdfs://localhost:9000/flumelogs

But I don't know why it isn't writing to HDFS. I can see the apache2 access logs, but Flume isn't sending them to the /flumelogs directory in HDFS. Please help!

【Question Discussion】:

    Tags: hadoop hdfs flume flume-ng


    【Solution 1】:

    I don't think this is a permissions issue; if it were, you would see an exception when Flume flushes to HDFS. There are two possible causes for this problem:

    1) There is not enough data in the buffer, so Flume doesn't think it needs to flush yet. Your sink batch size is 1000 and your channel capacity is 20000. To verify this, CTRL-C your Flume process; that will force the process to flush to HDFS.

    2) The more likely cause is that your exec source is not running properly. This may be a path problem with the tail command. Add the full path to tail in the command, such as /bin/tail -F /var/log/apache2/access.log or /usr/bin/tail -F /var/log/apache2/access.log (depending on your system). Check

    which tail 
    

    to find the correct path.
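    As a small sketch of the path check above: the Flume agent's environment may have a different PATH than your interactive shell, so it is safest to resolve tail's absolute location once and hard-code it into the exec source command (the log path shown is the one from the question):

    ```shell
    #!/bin/sh
    # Resolve the absolute path of tail so the exec source does not
    # depend on the PATH the Flume agent happens to be started with.
    TAIL_BIN=$(command -v tail)

    if [ -x "$TAIL_BIN" ]; then
        echo "use in flume config: $TAIL_BIN -F /var/log/apache2/access.log"
    else
        echo "tail not found on PATH" >&2
        exit 1
    fi
    ```

    Whatever path this prints is what should appear in `search.sources.so.command`.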

    【Discussion】:

    • Hi erik... so I killed the Flume process with CTRL-C, and I did it many times, but the flumelogs directory shows no log files from this; it is completely blank. I also tried `which tail` and, for my system, used /usr/bin/tail -f, but even then it didn't work...
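    For cause (1) in the answer above, an alternative to killing the process is to make the sink flush far more eagerly while debugging. A hedged sketch (the values are illustrative, not recommendations); note that the HDFS sink only reads its roll settings from keys under the `hdfs.` prefix, so the question's bare `search.sinks.si.rollInterval = 1` line is silently ignored:

    ```
    # Illustrative debug settings; revert once events start appearing
    search.sinks.si.hdfs.batchSize = 10       # take events from the channel 10 at a time
    search.sinks.si.hdfs.rollInterval = 30    # close and roll the file every 30 s
    search.sinks.si.hdfs.rollCount = 100      # or after 100 events, whichever comes first
    ```

    With settings like these, even a low-traffic access log should produce a visible file under /flumelogs within a minute or so, which helps separate cause (1) from cause (2).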
    【Solution 2】:

    Please check the permissions on this folder: hdfs://localhost:9000/flumelogs/

    My guess is that Flume doesn't have permission to write to that folder.
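    A sketch of how to inspect and, if necessary, relax those permissions; this assumes the agent runs as a user named flume (substitute whichever user actually launches the agent):

    ```shell
    # Inspect the current owner and mode of the sink path
    hadoop fs -ls /flumelogs

    # Option A: give the directory to the user running the Flume agent
    hadoop fs -chown -R flume:flume /flumelogs

    # Option B: open it up entirely while debugging (not for production)
    hadoop fs -chmod -R 777 /flumelogs
    ```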

    【Discussion】:

    • Here are the permissions of the flumelogs dir and the flume folder: drwxr-xr-x - hduser supergroup 0 2014-12-18 17:56 hdfs://localhost:9000/flumelogs and drwxrwxr-x 7 hduser hadoop 4096 Dec 9 13:57 flume
    • Your Flume agent is running as the flume user and doesn't seem to have permission to write to that directory. Either change the ownership of the directory to the flume user, or chmod it to 777, then try again.
    • Shouldn't Flume complain about not having permission to write to the directory, instead of failing silently?