【问题标题】:java.lang.NoClassDefFoundError: kafka/common/TopicAndPartitionjava.lang.NoClassDefFoundError: kafka/common/TopicAndPartition
【发布时间】:2016-06-24 22:33:06
【问题描述】:

在我的代码中执行以下命令时:

kafka_streams = [KafkaUtils.createStream(ssc, zk_settings['QUORUM'], zk_settings['CONSUMERS'][k],
                                              {zk_settings['TOPICS'][0]: zk_settings['NUM_THREADS']})
                           .window(zk_settings['WINDOW_DURATION'], zk_settings['SLIDE_DURATION'])
                 for k in range(len(zk_settings['CONSUMERS']))]

但我收到以下错误:

Exception in thread "Thread-3" java.lang.NoClassDefFoundError: kafka/common/TopicAndPartition
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2625)
at java.lang.Class.privateGetPublicMethods(Class.java:2743)
at java.lang.Class.getMethods(Class.java:1480)
at py4j.reflection.ReflectionEngine.getMethodsByNameAndLength(ReflectionEngine.java:365)
at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:317)
at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:342)
at py4j.Gateway.invoke(Gateway.java:252)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: kafka.common.TopicAndPartition
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 12 more

我错过了什么吗?

我遇到了一些火花错误,所以我重建了火花错误并导致了这个错误。

【问题讨论】:

  • 如何构建和运行您的应用程序?
  • 检查类名 TopicAndPartition 的拼写,在 common 包内。
  • @Poonam Agrawal - 你是如何解决这个问题的?我也面临着和你以前一样的问题。

标签: java apache-spark apache-kafka


【解决方案1】:

您应该在提交代码时添加--packages

 ./bin/spark-submit --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.2.0  <DIR>/main.py localhost:9092 test

https://spark.apache.org/docs/latest/streaming-kafka-0-8-integration.html

【讨论】:

  • 感谢您的回答有助于摆脱类似问题。
【解决方案2】:

我也遇到这个问题的原因是我下载的 spark-streaming-kafka jar 不是 assembly jar。我通过执行以下操作解决了这个问题:

  1. 首先下载程序集 spark-streaming-kafka...jar

    wget https://repo1.maven.org/maven2/org/apache/spark/spark-streaming-kafka-0-8-assembly_2.11/2.2.0/spark-streaming-kafka-0-8-assembly_2.11-2.2.0.jar
    

对于我来说,我使用的是 spark-2.2.0,因此请尝试访问 URL https://repo1.maven.org/maven2/org/apache/spark/spark-streaming-kafka-0-8-assembly_2.11 以查看需要下载的相应 jar。

  1. 然后获取

    spark-submit --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.2.0 --jars /path/to/spark-streaming-kafka-0-8-assembly_2.11-2.2.0.jar myApp.py
    

【讨论】:

    【解决方案3】:

    classNotFoundException表示您的程序 spark-submit 正在运行时 在程序的运行目录中找不到它需要的类kafka.common.TopicAndPartition

    看看 spark-submit 命令的用法:

    # spark-submit --help
    Usage: spark-submit [options] <app jar | python file> [app arguments]
    Usage: spark-submit --kill [submission ID] --master [spark://...]
    Usage: spark-submit --status [submission ID] --master [spark://...]
    Usage: spark-submit run-example [options] example-class [example args]
    
    Options:
      --master MASTER_URL         spark://host:port, mesos://host:port, yarn, or         local.
      --deploy-mode DEPLOY_MODE   Whether to launch the driver program locally     ("client") or
                                  on one of the worker machines inside the cluster ("cluster")
                                  (Default: client).
      --class CLASS_NAME          Your application's main class (for Java / Scala apps).
      --name NAME                 A name of your application.
      --jars JARS                 Comma-separated list of local jars to include on the driver
                                  and executor classpaths.
      --packages                  Comma-separated list of maven coordinates of jars to include
                                  on the driver and executor classpaths. Will     search the local
                                  maven repo, then maven central and any additional remote
                                  repositories given by --repositories. The format for the
                                  coordinates should be groupId:artifactId:version.
    

    在kafka的本地jar路径中添加--jars选项,如下:

    # spark-submit --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.2.0 --jars /path/to/org.apache.kafka_kafka_2.11-0.8.2.1.jar,/path/to/com.yammer.metrics_metrics-core-2.2.0.jar your_python_script.py.

    【讨论】:

      猜你喜欢
      • 1970-01-01
      • 2020-04-07
      • 2022-10-26
      • 2017-02-16
      • 1970-01-01
      • 1970-01-01
      • 2018-01-18
      • 2018-05-26
      • 2023-03-12
      相关资源
      最近更新 更多