[Question Title]: mongodb spark connector issue
[Posted]: 2017-12-26 22:00:19
[Question]:

I am new to MongoDB. I am trying to pull data out of MongoDB as a Spark DataFrame.

I am using the MongoDB Connector for Spark:
https://docs.mongodb.com/spark-connector/master/

I am following the steps on this page: https://docs.mongodb.com/spark-connector/master/scala/datasets-and-sql/
The program compiles successfully, but I get the following runtime error:

Exception in thread "main" java.lang.NoClassDefFoundError: com/mongodb/ConnectionString
at com.mongodb.spark.config.MongoCompanionConfig$$anonfun$4.apply(MongoCompanionConfig.scala:278)
at com.mongodb.spark.config.MongoCompanionConfig$$anonfun$4.apply(MongoCompanionConfig.scala:278)
at scala.util.Try$.apply(Try.scala:192)
at com.mongodb.spark.config.MongoCompanionConfig$class.connectionString(MongoCompanionConfig.scala:278)
at com.mongodb.spark.config.ReadConfig$.connectionString(ReadConfig.scala:39)
at com.mongodb.spark.config.ReadConfig$.apply(ReadConfig.scala:51)
at com.mongodb.spark.config.ReadConfig$.apply(ReadConfig.scala:39)
at com.mongodb.spark.config.MongoCompanionConfig$class.apply(MongoCompanionConfig.scala:124)
at com.mongodb.spark.config.ReadConfig$.apply(ReadConfig.scala:39)
at com.mongodb.spark.config.MongoCompanionConfig$class.apply(MongoCompanionConfig.scala:113)
at com.mongodb.spark.config.ReadConfig$.apply(ReadConfig.scala:39)
at com.mongodb.spark.sql.DefaultSource.createRelation(DefaultSource.scala:67)
at com.mongodb.spark.sql.DefaultSource.createRelation(DefaultSource.scala:50)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:307)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:146)
at ScalaDemo.HelloWorld$.main(HelloWorld.scala:25)
at ScalaDemo.HelloWorld.main(HelloWorld.scala)
Caused by: java.lang.ClassNotFoundException: com.mongodb.ConnectionString
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
... 18 more


Below is the Maven snippet:

<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.2.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.2.1</version>
    </dependency>
    <dependency>
        <groupId>org.mongodb.spark</groupId>
        <artifactId>mongo-spark-connector_2.11</artifactId>
        <version>2.2.1</version>
    </dependency>
</dependencies>

Code:

package ScalaDemo

import com.mongodb.spark._
import com.mongodb.spark.config._
import org.apache.spark.sql.SparkSession

object HelloWorld {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local")
      .appName("MongoSparkConnectorIntro")
      .config("spark.mongodb.input.uri", "mongodb://localhost/admin.partnerCompanies")
      .config("spark.mongodb.output.uri", "mongodb://localhost/admin.partnerCompanies")
      .getOrCreate()

    val df1 = spark.read.format("com.mongodb.spark.sql").load()
    df1.show()
  }
}
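
For reference, the connector also exposes a helper API for the same read; a minimal sketch, assuming the same input URI configuration (`MongoSpark.load` samples documents to infer the schema):

```scala
package ScalaDemo

import com.mongodb.spark.MongoSpark
import org.apache.spark.sql.SparkSession

object MongoReadSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local")
      .appName("MongoSparkConnectorIntro")
      .config("spark.mongodb.input.uri", "mongodb://localhost/admin.partnerCompanies")
      .getOrCreate()

    // Equivalent to spark.read.format("com.mongodb.spark.sql").load()
    val df = MongoSpark.load(spark)
    df.printSchema()
    df.show()

    spark.stop()
  }
}
```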

Please help.

[Comments]:

    Tags: mongodb apache-spark apache-spark-sql


    [Solution 1]:

    This doesn't look related to Spark itself; your exception is

    Exception in thread "main" java.lang.NoClassDefFoundError: com/mongodb/ConnectionString

    which means the class used to connect to Mongo is missing from the classpath. Try adding the MongoDB Java driver (e.g. via an uber jar):

    <dependencies>
        <dependency>
            <groupId>org.mongodb</groupId>
            <artifactId>mongo-java-driver</artifactId>
            <version>3.0.4</version>
        </dependency>
    </dependencies>
    

    [Discussion]:

    • Thanks.. I tried that but got the same error.. Can you suggest any other way to connect to MongoDB through Spark?
    • Can you post all of your dependencies here? If you end up building an uber jar, run jar tvf | grep ConnectionString and tell me whether you see this class in the uber jar.
    • That worked. The solution above is correct; I matched the driver version to the MongoDB version I have, i.e. 3.6.0, and it works fine. Thanks.
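
Putting the accepted fix together with the question's original dependencies, the resulting pom would look roughly like the sketch below; the 3.6.0 driver version comes from the discussion above (match it to your own server version):

```xml
<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.2.1</version>
    </dependency>
    <dependency>
        <groupId>org.mongodb.spark</groupId>
        <artifactId>mongo-spark-connector_2.11</artifactId>
        <version>2.2.1</version>
    </dependency>
    <!-- Driver version matched to the MongoDB server, per the discussion -->
    <dependency>
        <groupId>org.mongodb</groupId>
        <artifactId>mongo-java-driver</artifactId>
        <version>3.6.0</version>
    </dependency>
</dependencies>
```

(spark-core is pulled in transitively by spark-sql, so listing it explicitly is optional.)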