[Posted]: 2021-11-19 13:57:18
[Question]:
I get the following error when using spark-submit... otherwise pyspark runs fine:
: java.lang.ClassNotFoundException: Failed to find data source: com.mongodb.spark.sql.DefaultSource. Please find packages at http://spark.apache.org/third-party-projects.html
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:679)
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:733)
    at org.apache.spark.sql.DataFrameWriter.lookupV2Provider(DataFrameWriter.scala:967)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:304)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
Here is the code I am running:
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("myApp")
    .config("spark.mongodb.input.uri", "mongodb://127.0.0.1/test.coll")
    .config("spark.mongodb.output.uri", "mongodb://127.0.0.1/test.coll")
    .config("spark.jars.packages", "org.mongodb.spark:mongo-spark-connector_2.11:2.3.2")
    .getOrCreate()
)

people = spark.createDataFrame(
    [
        ("Bilbo Baggins", 50),
        ("Gandalf", 1000),
        ("Thorin", 195),
        ("Balin", 178),
        ("Kili", 77),
        ("Dwalin", 169),
        ("Oin", 167),
        ("Gloin", 158),
        ("Fili", 82),
        ("Bombur", None),
    ],
    ["name", "age"],
)

people.write.format("mongo").mode("append").save()
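
Editor's sketch, not part of the original question and not a confirmed diagnosis: a commonly reported cause of this error is that, under spark-submit, the JVM is already running by the time the script sets spark.jars.packages, so the connector may never be resolved. Passing the coordinates on the command line with --packages is one thing to try. The helper name build_submit_command, the script name my_app.py, and the connector coordinates below are all illustrative assumptions; the artifact's Scala suffix and version would need to match the Spark build actually in use.

```python
import shlex

# Assumption: illustrative coordinates only. The Scala suffix (_2.12) and the
# connector version must match the Spark/Scala build of the target cluster.
CONNECTOR = "org.mongodb.spark:mongo-spark-connector_2.12:3.0.1"


def build_submit_command(script, packages):
    """Return a spark-submit argv list that resolves `packages` at launch.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    # --packages takes a comma-separated list of Maven coordinates.
    return ["spark-submit", "--packages", ",".join(packages), script]


# Hypothetical script name; replace with the file that contains the job.
cmd = build_submit_command("my_app.py", [CONNECTOR])

# Print a shell-safe command line that can be pasted into a terminal.
print(shlex.join(cmd))
```

The command is built as a list rather than a single string so that it can be handed to subprocess.run unchanged, without shell quoting issues.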
[Discussion]:
Tags: mongodb apache-spark pyspark spark-submit