【Question title】: Apache Kylin cube streaming build error: no counters for job
【Posted】: 2019-01-22 12:15:33
【Question】:

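I am following the tutorial for building a streaming cube:
Kylin Cube from Streaming (Kafka)

All properties are set as described on that page.
However, when I trigger the cube build, step 1 (saving the data from Kafka) fails
with: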

org.apache.kylin.engine.mr.exception.MapReduceException: no counters for job job_1547096967734_0086
at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:173)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:164)
at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:70)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:164)
at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

I have seen the question Apache kylin cube fails “no counters for job”,
but the use case there is a normal cube build, not a streaming cube build from Kafka.


The entries below from mapred-root-historyserver.log do not seem to help.

2019-01-22 11:33:15,557 INFO org.apache.hadoop.mapreduce.v2.hs.CompletedJob: Loading job: job_1547096967734_0087 from file: hdfs://localhost:9000/tmp/hadoop-yarn/staging/history/done_intermediate/root/job_1547096967734_0087-1548149562328-root-Kylin_Save_Kafka_Data_kylin_streaming_cube_Step-1548149585065-0-0-FAILED-default-1548149566816.jhist
2019-01-22 11:33:15,557 INFO org.apache.hadoop.mapreduce.v2.hs.CompletedJob: Loading history file: [hdfs://localhost:9000/tmp/hadoop-yarn/staging/history/done_intermediate/root/job_1547096967734_0087-1548149562328-root-Kylin_Save_Kafka_Data_kylin_streaming_cube_Step-1548149585065-0-0-FAILED-default-1548149566816.jhist]
2019-01-22 11:33:15,572 INFO org.apache.hadoop.mapreduce.jobhistory.JobSummary: jobId=job_1547096967734_0087,submitTime=1548149562328,launchTime=1548149566816,firstMapTaskLaunchTime=1548149570064,firstReduceTaskLaunchTime=0,finishTime=1548149585065,resourcesPerMap=1024,resourcesPerReduce=0,numMaps=1,numReduces=0,user=root,queue=default,status=FAILED,mapSlotSeconds=8,reduceSlotSeconds=0,jobName=Kylin_Save_Kafka_Data_kylin_streaming_cube_Step
2019-01-22 11:33:15,572 INFO org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Deleting JobSummary file: [hdfs://localhost:9000/tmp/hadoop-yarn/staging/history/done_intermediate/root/job_1547096967734_0087.summary]
2019-01-22 11:33:15,574 INFO org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Moving hdfs://localhost:9000/tmp/hadoop-yarn/staging/history/done_intermediate/root/job_1547096967734_0087-1548149562328-root-Kylin_Save_Kafka_Data_kylin_streaming_cube_Step-1548149585065-0-0-FAILED-default-1548149566816.jhist to hdfs://localhost:9000/tmp/hadoop-yarn/staging/history/done/2019/01/22/000000/job_1547096967734_0087-1548149562328-root-Kylin_Save_Kafka_Data_kylin_streaming_cube_Step-1548149585065-0-0-FAILED-default-1548149566816.jhist
2019-01-22 11:33:15,574 INFO org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Moving hdfs://localhost:9000/tmp/hadoop-yarn/staging/history/done_intermediate/root/job_1547096967734_0087_conf.xml to hdfs://localhost:9000/tmp/hadoop-yarn/staging/history/done/2019/01/22/000000/job_1547096967734_0087_conf.xml
2019-01-22 11:35:30,160 INFO org.apache.hadoop.mapreduce.v2.hs.JobHistory: Starting scan to move intermediate done files
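
The JobSummary entry in the history server log is a run of comma-separated key=value pairs, so the status and timings can be pulled out mechanically. The following is a minimal sketch, assuming no field value itself contains a comma (true for the entry quoted here, but not guaranteed in general); the function name `parse_job_summary` is made up for illustration.

```python
def parse_job_summary(message):
    """Split a JobSummary message of comma-separated key=value pairs into a dict."""
    fields = {}
    for pair in message.split(","):
        key, sep, value = pair.partition("=")
        if sep:  # skip fragments that are not key=value
            fields[key.strip()] = value.strip()
    return fields


# The JobSummary message from the log above, with the line wraps removed.
SUMMARY = (
    "jobId=job_1547096967734_0087,submitTime=1548149562328,"
    "launchTime=1548149566816,firstMapTaskLaunchTime=1548149570064,"
    "firstReduceTaskLaunchTime=0,finishTime=1548149585065,"
    "resourcesPerMap=1024,resourcesPerReduce=0,numMaps=1,numReduces=0,"
    "user=root,queue=default,status=FAILED,mapSlotSeconds=8,"
    "reduceSlotSeconds=0,jobName=Kylin_Save_Kafka_Data_kylin_streaming_cube_Step"
)

if __name__ == "__main__":
    summary = parse_job_summary(SUMMARY)
    run_ms = int(summary["finishTime"]) - int(summary["launchTime"])
    print(summary["jobId"], summary["status"], f"ran for {run_ms} ms")
```

The entry only confirms that the single map task failed shortly after launch; the cause has to come from the task logs.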

This is a fully manual Kylin installation. The versions are:

apache-hive-2.3.4-bin
apache-kylin-2.5.2-bin-hbase1x
hadoop-2.9.1
hbase-1.4.9
kafka_2.11-2.0.0
spark-2.3.2-bin-hadoop2.7
zookeeper-3.4.13

Any help would be greatly appreciated.

【Comments】:

    Tags: apache-spark apache-kafka mapreduce kylin


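    【Solution 1】:

    Something seems to be wrong with your environment. Check the logs for more detail on the error message, and refer to the latest documentation at http://kylin.apache.org/docs/tutorial/cube_streaming.html. If you want to get started with Kylin quickly, it is recommended to use an integrated sandbox (such as the HDP sandbox) for trying out Kylin and for development, and to make sure it has at least 10 GB of memory.

    【Comments】: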

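      【Solution 2】:

      Please check the MR job for the first cubing step on YARN. Within the job you can drill down into the log of each mapper, and you should be able to see an exception there. Typical causes include "cannot connect to Kafka", "cannot load the Kafka client jar", and so on.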
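
Once the mapper log has been saved as text (for example from the YARN web UI, or with `yarn logs -applicationId <application id>` if log aggregation is enabled), a quick scan for class-loading and connectivity errors can narrow things down. The sketch below is illustrative only: the pattern list is an assumption about what such failures usually look like, and the sample log is hypothetical, not taken from the question.

```python
import re

# Illustrative patterns only; extend them for your own cluster.
SUSPECT_PATTERNS = [
    re.compile(r"NoClassDefFoundError"),
    re.compile(r"ClassNotFoundException"),
    re.compile(r"TimeoutException"),
    re.compile(r"Connection refused"),
]


def find_suspect_lines(log_text):
    """Return (line_number, line) pairs that match any suspect pattern."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in SUSPECT_PATTERNS):
            hits.append((number, line.strip()))
    return hits


# Hypothetical sample log, made up for this example.
SAMPLE_LOG = """\
INFO [main] org.apache.hadoop.mapred.MapTask: Processing split
FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.NoClassDefFoundError: org/apache/kafka/clients/consumer/Consumer
"""

if __name__ == "__main__":
    for number, line in find_suspect_lines(SAMPLE_LOG):
        print(f"{number}: {line}")
```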

      【Comments】:

        【Solution 3】:

        We were able to fix it by providing kafka-client-2.0.0.jar in the YARN shared library. As the MapReduce job log said, the Kafka class definition could not be found.
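
To confirm whether a Kafka client jar is actually present in a given library directory, a check along these lines may help. It is a sketch under stated assumptions: the answer does not say which directory the "YARN shared library" is, so the directory is a parameter you must supply, and the helper name `find_kafka_client_jars` is made up for illustration.

```python
import tempfile
from pathlib import Path


def find_kafka_client_jars(lib_dir):
    """Return sorted names of jars in lib_dir whose name starts with 'kafka-client'."""
    return sorted(path.name for path in Path(lib_dir).glob("kafka-client*.jar"))


if __name__ == "__main__":
    # Demonstrated on a throwaway directory. Point lib_dir at your real
    # YARN library directory instead; its location is an assumption here.
    with tempfile.TemporaryDirectory() as lib_dir:
        print(find_kafka_client_jars(lib_dir))  # before the jar is added
        (Path(lib_dir) / "kafka-client-2.0.0.jar").touch()
        print(find_kafka_client_jars(lib_dir))  # after the jar is added
```

An empty result on a node that runs map tasks would be consistent with the missing class definition reported in the job log.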

        【Comments】:
