【问题标题】:Deploying Storm topology with Spark dependency使用 Spark 依赖部署 Storm 拓扑
【发布时间】:2025-12-04 11:20:03
【问题描述】:

我正在尝试部署具有 Spark 依赖项(版本 1.6.1)的 Storm 拓扑(版本 1.0.0)。该拓扑使用本地集群正常工作,但不将其提交到集群。我知道 Spark 和 Storm 需要与 log4j 相关的库。所以,如果pom文件修改为:

 <!-- Apache Spark -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.10</artifactId>
        <version>${spark.version}</version>
        <exclusions>
            <exclusion>
                <groupId>org.slf4j</groupId>
                <artifactId>slf4j-log4j12</artifactId>
            </exclusion>
            <exclusion>
                <groupId>log4j</groupId>
                <artifactId>log4j</artifactId>
            </exclusion>
        </exclusions>
    </dependency>

出现此错误:

java.lang.NoSuchMethodError: org.apache.log4j.Logger.setLevel(Lorg/apache/log4j/Level;)V
at org.apache.spark.util.AkkaUtils$$anonfun$org$apache$spark$util$AkkaUtils$$doCreateActorSystem$1.apply(AkkaUtils.scala:75) ~[stormjar.jar:?]
at org.apache.spark.util.AkkaUtils$$anonfun$org$apache$spark$util$AkkaUtils$$doCreateActorSystem$1.apply(AkkaUtils.scala:75) ~[stormjar.jar:?]
at scala.Option.map(Option.scala:145) ~[stormjar.jar:?]
at org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:75) ~[stormjar.jar:?]
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:53) ~[stormjar.jar:?]
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:52) ~[stormjar.jar:?]
at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1988) ~[stormjar.jar:?]
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141) ~[stormjar.jar:?]
at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1979) ~[stormjar.jar:?]
at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:55) ~[stormjar.jar:?]
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:266) ~[stormjar.jar:?]
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:193) ~[stormjar.jar:?]
at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:288) ~[stormjar.jar:?]
at org.apache.spark.SparkContext.<init>(SparkContext.scala:457) ~[stormjar.jar:?]
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59) ~[stormjar.jar:?]
at ufrn.imd.engsoft.storm.SentimentAnalyserBolt.prepare(SentimentAnalyserBolt.java:106) ~[stormjar.jar:?]
at org.apache.storm.daemon.executor$fn__8226$fn__8239.invoke(executor.clj:795) ~[storm-core-1.0.0.jar:1.0.0]
at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:482) [storm-core-1.0.0.jar:1.0.0]
at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_65]
2016-04-22 00:29:06.285 o.a.s.util [ERROR] Halting process: ("Worker died")

如果 Spark 依赖项中没有任何排除项,则会发生此错误:

java.lang.ExceptionInInitializerError
at org.apache.log4j.Logger.getLogger(Logger.java:39) ~[log4j-over-slf4j-1.6.6.jar:1.6.6]
at org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:75) ~[spark-assembly-1.6.1-hadoop2.6.0.jar:1.6.1]
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:53) ~[spark-assembly-1.6.1-hadoop2.6.0.jar:1.6.1]
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:52) ~[spark-assembly-1.6.1-hadoop2.6.0.jar:1.6.1]
at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1988) ~[spark-assembly-1.6.1-hadoop2.6.0.jar:1.6.1]
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141) ~[spark-assembly-1.6.1-hadoop2.6.0.jar:1.6.1]
at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1979) ~[spark-assembly-1.6.1-hadoop2.6.0.jar:1.6.1]
at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:55) ~[spark-assembly-1.6.1-hadoop2.6.0.jar:1.6.1]
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:266) ~[spark-assembly-1.6.1-hadoop2.6.0.jar:1.6.1]
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:193) ~[spark-assembly-1.6.1-hadoop2.6.0.jar:1.6.1]
at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:288) ~[spark-assembly-1.6.1-hadoop2.6.0.jar:1.6.1]
at org.apache.spark.SparkContext.<init>(SparkContext.scala:457) ~[spark-assembly-1.6.1-hadoop2.6.0.jar:1.6.1]
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59) ~[spark-assembly-1.6.1-hadoop2.6.0.jar:1.6.1]
at ufrn.imd.engsoft.storm.SentimentAnalyserBolt.prepare(SentimentAnalyserBolt.java:106) ~[stormjar.jar:?]
at org.apache.storm.daemon.executor$fn__8226$fn__8239.invoke(executor.clj:795) ~[storm-core-1.0.0.jar:1.0.0]
at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:482) [storm-core-1.0.0.jar:1.0.0]
at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_65]
Caused by: java.lang.IllegalStateException: Detected both log4j-over-slf4j.jar AND slf4j-log4j12.jar on the class path, preempting *Error. See also http://www.slf4j.org/codes.html#log4jDelegationLoop for more details.
at org.apache.log4j.Log4jLoggerFactory.<clinit>(Log4jLoggerFactory.java:49) ~[log4j-over-slf4j-1.6.6.jar:1.6.6]
... 18 more

在上述两种情况下,Storm 依赖项如下:

<!-- Apache Storm -->
    <dependency>
        <groupId>org.apache.storm</groupId>
        <artifactId>storm-core</artifactId>
        <version>${storm.version}</version>
        <scope>provided</scope>
        <exclusions>
            <exclusion>
                <groupId>org.slf4j</groupId>
                <artifactId>slf4j-log4j12</artifactId>
            </exclusion>
            <exclusion>
                <groupId>log4j</groupId>
                <artifactId>log4j</artifactId>
            </exclusion>
        </exclusions>
    </dependency>

Storm 文件夹包含这些 jar:

- log4-api-2.1
- log4j-core-2.1
- log4j-over-sl4j-1.6.6
- log4j-sl4j-impl-2.1
- sl4j-api-1.7.7
- sl4j-log4j12

也许 Spark 能够得到上面的罐子可以解决这个问题,但我没有找到这样做的线索。有人可以帮助我解决这个问题吗?

谢谢(:!

【问题讨论】:

    标签: apache-spark log4j slf4j apache-storm


    【解决方案1】:

    log4j只定义了一个日志接口和log4j-over-slf4jslf4j-log4j12以及这个接口的具体实现。当 Storm 找到这两种实现时,不知道使用哪一种(如果您不排除任何东西)。

    但是,在您的“排除”情况下,您排除了接口和具体实现,因此您会收到有关缺少接口的错误。尽量只排除实现而不排除接口。

    【讨论】:

    • 我尝试仅排除实现,但同样的错误仍然存​​在。 2016-04-25 08:24:05.216 STDIO [ERROR] SLF4J: Detected both log4j-over-slf4j.jar AND slf4j-log4j12.jar on the class path, preempting *Error. 2016-04-25 08:24:05.218 STDIO [ERROR] SLF4J: See also http://www.slf4j.org/codes.html#log4jDelegationLoop for more details. 2016-04-25 08:24:05.221 o.a.s.util [ERROR] Async loop died! java.lang.ExceptionInInitializerError at org.apache.log4j.Logger.getLogger(Logger.java:39) ~[log4j-over-slf4j-1.6.6.jar:1.6.6]。我认为应该使用storm文件夹中的库。
    • 也许 pom 是正确的,因为我正在将拓扑提交到远程风暴集群,因此正在使用风暴文件夹中的库。但是,由于某种原因,Spark(本地模式)没有使用该库。我也尝试使用火花簇。