【Problem title】: Problems loading SparkContext() in Spyder (for Python)
【Posted】: 2015-08-21 04:42:58
【Question description】:

I am trying to load a SparkContext through Spyder. To load PySpark in Spyder, I send the following command to the Windows command line:

spark-submit.cmd C:\WinPython-64bit-2.7.9.5\python-2.7.9.amd64\Scripts\spyder.py

This works, and I can successfully import pyspark within Spyder. However, whenever I try to create an instance of SparkContext(), I get the following error:

sc=pyspark.SparkContext()
Traceback (most recent call last):

  File "<ipython-input-3-e5c2c851d239>", line 1, in <module>
sc=pyspark.SparkContext()

  File "C:\spark-1.4.1-bin-hadoop2.6\python\lib\pyspark.zip\pyspark\context.py", line 113, in __init__
conf, jsc, profiler_cls)

  File "C:\spark-1.4.1-bin-hadoop2.6\python\lib\pyspark.zip\pyspark\context.py", line 165, in _do_init
self._jsc = jsc or self._initialize_context(self._conf._jconf)

  File "C:\spark-1.4.1-bin-hadoop2.6\python\lib\pyspark.zip\pyspark\context.py", line 219, in _initialize_context
return self._jvm.JavaSparkContext(jconf)

  File "C:\spark-1.4.1-bin-hadoop2.6\python\lib\py4j-0.8.2.1-src.zip\py4j\java_gateway.py", line 701, in __call__
self._fqn)

  File "C:\spark-1.4.1-bin-hadoop2.6\python\lib\py4j-0.8.2.1-src.zip\py4j\protocol.py", line 300, in get_return_value
format(target_id, '.', name), value)

Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.NullPointerException
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1010)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:482)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
    at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
    at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)
    at org.apache.spark.util.Utils$.fetchFile(Utils.scala:465)
    at org.apache.spark.SparkContext.addFile(SparkContext.scala:1351)
    at org.apache.spark.SparkContext.addFile(SparkContext.scala:1305)
    at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:458)
    at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:458)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:458)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
    at py4j.Gateway.invoke(Gateway.java:214)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
    at py4j.GatewayConnection.run(GatewayConnection.java:207)
    at java.lang.Thread.run(Thread.java:744)

Can anyone help explain what Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext. : java.lang.NullPointerException means?

【Question discussion】:

    Tags: python apache-spark pyspark spyder


    【Solution 1】:

    I had the same problem. It seems that Hadoop looks for the winutils files even when they don't exist. This link explains how to create a winutils folder, the winutils.exe file, and the HADOOP_HOME environment variable, which got my pyspark running through Spyder on Windows 7:

    submit .py script on Spark without Hadoop installation
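    As a minimal sketch of that fix: the NullPointerException originates in Hadoop's Shell.runCommand, which tries to launch winutils.exe and fails because HADOOP_HOME is unset. Setting the variable before the context is created avoids this. The C:\hadoop path below is an assumption for illustration; use wherever you placed winutils.exe (it must sit in a bin subfolder):

```python
import os

# Assumed install location -- adjust to your machine.
# The folder must contain bin\winutils.exe.
hadoop_home = r"C:\hadoop"

# Hadoop's shell utilities resolve winutils.exe relative to HADOOP_HOME;
# without it, ProcessBuilder.start() gets a null command and throws the NPE.
os.environ["HADOOP_HOME"] = hadoop_home

# With the variable set (before any JVM/SparkContext is started),
# the context should initialize normally:
# import pyspark
# sc = pyspark.SparkContext()
```

    Note the variable must be set before the first SparkContext() call in the session, since the JVM reads it once at startup; setting it afterwards in the same Spyder console has no effect until you restart the kernel.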

    【Discussion】:
