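【Posted】: 2016-02-16 16:57:52
【Problem description】:
I am new to Spark Streaming.
When I run my Spark Streaming job as a normal Scala application, it works as expected.
I can consume my Kafka events and store them in HDFS.
When I try to run the streaming jar with the spark-submit command, I get the following error.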
spark-submit --class Test --master yarn --executor-memory 20G --num-executors 50 spark-kafka-streaming-0.0.1-SNAPSHOT-jar-with-dependencies.jar
16/02/16 08:39:23 INFO scheduler.JobGenerator: Started JobGenerator at 1455640800000 ms
16/02/16 08:39:23 INFO scheduler.JobScheduler: Started JobScheduler
16/02/16 08:40:00 INFO utils.VerifiableProperties: Verifying properties
16/02/16 08:40:00 INFO utils.VerifiableProperties: Property group.id is overridden to
16/02/16 08:40:00 INFO utils.VerifiableProperties: Property zookeeper.connect is overridden to
16/02/16 08:40:00 ERROR actor.ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-akka.actor.default-dispatcher-2] shutting down ActorSystem [sparkDriver]
java.lang.NoSuchMethodError: org.apache.spark.streaming.kafka.DirectKafkaInputDStream.id()I
at org.apache.spark.streaming.kafka.DirectKafkaInputDStream.compute(DirectKafkaInputDStream.scala:165)
at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:300)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:299)
at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:287)
Please help.
Thanks in advance.
【Comments】:
Tags: apache-spark pyspark spark-streaming spark-dataframe akka-stream