【发布时间】:2018-01-31 19:11:18
【问题描述】:
我看到了与这个问题相关的所有线索,他们都非常清楚海报是用两个版本的 Scala 交叉编译的。就我而言,我确保我只有一个 2.11 版本,但我仍然遇到同样的错误。任何帮助表示赞赏,谢谢。 我的 Spark 环境:
/___/ .__/\_,_/_/ /_/\_\ version 2.0.0.2.5.3.0-37
/_/
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_67)
我的 pom.xml:
<properties>
<spark.version>2.2.1</spark.version>
<scala.version>2.11.8</scala.version>
<scala.library.version>2.11.8</scala.library.version>
<scala.binary.version>2.11</scala.binary.version>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
<java.source.version>1.7</java.source.version>
<java.compile.version>1.7</java.compile.version>
<kafka.version>0-10</kafka.version>
</properties>
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_${scala.binary.version}</artifactId>
<version>${spark.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_${scala.binary.version}</artifactId>
<version>${spark.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive_${scala.binary.version}</artifactId>
<version>${spark.version}</version>
</dependency>
<dependency>
<groupId>com.typesafe.scala-logging</groupId>
<artifactId>scala-logging-slf4j_${scala.binary.version}</artifactId>
<version>2.1.2</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka-${kafka.version}_${scala.binary.version}</artifactId>
<version>${spark.version}</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.library.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_${scala.binary.version}</artifactId>
<version>${spark.version}</version>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-clients</artifactId>
<version>0.11.0.2</version>
</dependency>
</dependencies>
这是个例外:
at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream$$anonfun$start$1.apply(DirectKafkaInputDStream.scala:246)
at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream$$anonfun$start$1.apply(DirectKafkaInputDStream.scala:245)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.AbstractSet.scala$collection$SetLike$$super$map(Set.scala:45)
at scala.collection.SetLike$class.map(SetLike.scala:93)
at scala.collection.mutable.AbstractSet.map(Set.scala:45)
at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.start(DirectKafkaInputDStream.scala:245)
at org.apache.spark.streaming.DStreamGraph$$anonfun$start$5.apply(DStreamGraph.scala:47)
at org.apache.spark.streaming.DStreamGraph$$anonfun$start$5.apply(DStreamGraph.scala:47)
at scala.collection.parallel.mutable.ParArray$ParArrayIterator.foreach_quick(ParArray.scala:145)
at scala.collection.parallel.mutable.ParArray$ParArrayIterator.foreach(ParArray.scala:138)
at scala.collection.parallel.ParIterableLike$Foreach.leaf(ParIterableLike.scala:975)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply$mcV$sp(Tasks.scala:54)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:53)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:53)
at scala.collection.parallel.Task$class.tryLeaf(Tasks.scala:56)
at scala.collection.parallel.ParIterableLike$Foreach.tryLeaf(ParIterableLike.scala:972)
at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask$class.compute(Tasks.scala:165)
at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.compute(Tasks.scala:514)
at scala.concurrent.forkjoin.RecursiveAction.exec(RecursiveAction.java:160)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
当我在命令的输出中 grep "_2.1" 时:mvn dependency:tree -Dverbose 我没有看到对 2.10 的任何引用。
[INFO] +- org.apache.spark:spark-core_2.11:jar:2.2.1:compile
[INFO] | +- com.twitter:chill_2.11:jar:0.8.0:compile
[INFO] | +- org.apache.spark:spark-launcher_2.11:jar:2.2.1:compile
[INFO] | | +- (org.apache.spark:spark-tags_2.11:jar:2.2.1:compile - omitted for duplicate)
[INFO] | +- org.apache.spark:spark-network-common_2.11:jar:2.2.1:compile
[INFO] | +- org.apache.spark:spark-network-shuffle_2.11:jar:2.2.1:compile
[INFO] | | +- (org.apache.spark:spark-network-common_2.11:jar:2.2.1:compile - omitted for duplicate)
[INFO] | +- org.apache.spark:spark-unsafe_2.11:jar:2.2.1:compile
[INFO] | | +- (org.apache.spark:spark-tags_2.11:jar:2.2.1:compile - omitted for duplicate)
[INFO] | | +- (com.twitter:chill_2.11:jar:0.8.0:compile - omitted for duplicate)
[INFO] | +- org.json4s:json4s-jackson_2.11:jar:3.2.11:compile
[INFO] | | +- org.json4s:json4s-core_2.11:jar:3.2.11:compile
[INFO] | | | +- org.json4s:json4s-ast_2.11:jar:3.2.11:compile
[INFO] | | | +- org.scala-lang.modules:scala-xml_2.11:jar:1.0.1:compile
[INFO] | | | \- (org.scala-lang.modules:scala-parser-combinators_2.11:jar:1.0.1:compile - omitted for conflict with 1.0.4)
[INFO] | +- com.fasterxml.jackson.module:jackson-module-scala_2.11:jar:2.6.5:compile
[INFO] | +- org.apache.spark:spark-tags_2.11:jar:2.2.1:compile
[INFO] +- org.apache.spark:spark-sql_2.11:jar:2.2.1:compile
[INFO] | +- org.apache.spark:spark-sketch_2.11:jar:2.2.1:compile
[INFO] | | +- (org.apache.spark:spark-tags_2.11:jar:2.2.1:compile - omitted for duplicate)
[INFO] | +- (org.apache.spark:spark-core_2.11:jar:2.2.1:compile - omitted for duplicate)
[INFO] | +- org.apache.spark:spark-catalyst_2.11:jar:2.2.1:compile
[INFO] | | +- (org.apache.spark:spark-core_2.11:jar:2.2.1:compile - omitted for duplicate)
[INFO] | | +- (org.apache.spark:spark-tags_2.11:jar:2.2.1:compile - omitted for duplicate)
[INFO] | | +- (org.apache.spark:spark-unsafe_2.11:jar:2.2.1:compile - omitted for duplicate)
[INFO] | | +- (org.apache.spark:spark-sketch_2.11:jar:2.2.1:compile - omitted for duplicate)
[INFO] | +- (org.apache.spark:spark-tags_2.11:jar:2.2.1:compile - omitted for duplicate)
[INFO] +- org.apache.spark:spark-hive_2.11:jar:2.2.1:compile
[INFO] | +- (org.apache.spark:spark-core_2.11:jar:2.2.1:compile - omitted for duplicate)
[INFO] | +- (org.apache.spark:spark-sql_2.11:jar:2.2.1:compile - omitted for duplicate)
[INFO] +- com.typesafe.scala-logging:scala-logging-slf4j_2.11:jar:2.1.2:compile
[INFO] | +- com.typesafe.scala-logging:scala-logging-api_2.11:jar:2.1.2:compile
[INFO] +- org.apache.spark:spark-streaming-kafka-0-10_2.11:jar:2.2.1:compile
[INFO] | +- org.apache.kafka:kafka_2.11:jar:0.10.0.1:compile
[INFO] | | +- org.scala-lang.modules:scala-parser-combinators_2.11:jar:1.0.4:compile
[INFO] | +- (org.apache.spark:spark-tags_2.11:jar:2.2.1:compile - omitted for duplicate)
[INFO] +- org.apache.spark:spark-streaming_2.11:jar:2.2.1:compile
[INFO] | +- (org.apache.spark:spark-core_2.11:jar:2.2.1:compile - omitted for duplicate)
[INFO] | +- (org.apache.spark:spark-tags_2.11:jar:2.2.1:compile - omitted for duplicate)
另外我应该声明我正在使用 Uber jar 使用 spark-submit 在 Spark 服务器中运行。优步包括下面的罐子。我将 scala jar 作为解决问题的最后一个资源包含在内,但是否这样做并不重要。
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.1.0</version>
<configuration>
<shadedArtifactAttached>false</shadedArtifactAttached>
<keepDependenciesWithProvidedScope>false</keepDependenciesWithProvidedScope>
<artifactSet>
<includes>
<include>org.apache.kafka:spark*</include>
<include>org.apache.spark:spark-streaming-kafka-${kafka.version}_${scala.binary.version}
</include>
<include>org.apache.kafka:kafka_${scala.binary.version}</include>
<include>org.apache.kafka:kafka-clients</include>
<include>org.apache.spark:*</include>
<include>org.scala-lang:scala-library</include>
</includes>
<excludes>
<exclude>org.apache.hadoop:*</exclude>
<exclude>com.fasterxml:*</exclude>
</excludes>
</artifactSet>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
<resource>META-INF/services/javax.ws.rs.ext.Providers</resource>
</transformer>
</transformers>
</configuration>
<executions>
<execution>
<goals>
<goal>shade</goal>
</goals>
</execution>
</executions>
</plugin>
【问题讨论】:
-
如果您开始声明您使用 2.0.0,为什么还要使用 2.2.1 编译?
-
这是一个很好的问题,下一步我打算降级,但我不明白为什么会这样。我还编辑了我的问题,因为问题仅在我部署 uber jar 并使用 spar-submit 运行时发生。当然在 IntelliJ 中它运行良好。
标签: scala apache-spark spark-streaming