【Question Title】: SparkR Null Pointer Exception when trying to create a data frame
【Posted】: 2015-12-06 15:43:45
【Question】:

When trying to create a data frame in SparkR, I get an error about a null pointer exception. I have pasted my code and the error message below. Do I need to install any additional packages to run this code?

Code:

SPARK_HOME <- "C:\\Users\\erer\\Downloads\\spark-1.5.2-bin-hadoop2.4\\spark-1.5.2-bin-hadoop2.4"
Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.2.0" "sparkr-shell"')
library(SparkR, lib.loc = "C:\\Users\\erer\\Downloads\\spark-1.5.2-bin-hadoop2.4\\R\\lib")
library(SparkR)
library(rJava)

sc <- sparkR.init(master = "local", sparkHome = SPARK_HOME)
sqlContext <- sparkRSQL.init(sc)

localDF <- data.frame(name=c("John", "Smith", "Sarah"), age=c(19, 23, 18))
df <- createDataFrame(sqlContext, localDF)

Error:

Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) : 
  org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 1, localhost): java.lang.NullPointerException
        at java.lang.ProcessBuilder.start(Unknown Source)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
        at org.apache.hadoop.util.Shell.run(Shell.java:418)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
        at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
        at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)
        at org.apache.spark.util.Utils$.fetchFile(Utils.scala:381)
        at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:405)
        at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:397)
        at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:7

【Question Comments】:

    Tags: r apache-spark sparkr


    【Solution 1】:

    You need to point library(SparkR) to the directory where the local SparkR code lives, specified in the lib.loc argument (if you downloaded the Spark binaries, SPARK_HOME/R/lib will already be populated for you):

    `library(SparkR, lib.loc = "/home/kris/spark/spark-1.5.2-bin-hadoop2.6/R/lib")`
    

    See also this tutorial on R-bloggers on how to run Spark from RStudio: http://www.r-bloggers.com/sparkr-with-rstudio-in-ubuntu-12-04/
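
    Putting the answer together with the asker's code, a minimal end-to-end setup might look like the sketch below. This is an illustrative sketch only: the install path is hypothetical and must be adjusted to wherever the Spark 1.5.2 binaries were unpacked, and `file.path()` is used so `lib.loc` is derived from `SPARK_HOME` rather than typed twice.

    ```r
    # Hypothetical install location; adjust to your own Spark 1.5.2 binary directory.
    SPARK_HOME <- "C:\\spark\\spark-1.5.2-bin-hadoop2.4"

    # Load the SparkR package bundled with the binaries, not one from CRAN,
    # by deriving lib.loc from SPARK_HOME (i.e. SPARK_HOME/R/lib).
    library(SparkR, lib.loc = file.path(SPARK_HOME, "R", "lib"))

    # Start a local Spark context and an SQLContext on top of it.
    sc <- sparkR.init(master = "local", sparkHome = SPARK_HOME)
    sqlContext <- sparkRSQL.init(sc)

    # Convert a local R data.frame into a Spark DataFrame.
    localDF <- data.frame(name = c("John", "Smith", "Sarah"), age = c(19, 23, 18))
    df <- createDataFrame(sqlContext, localDF)
    head(df)
    ```

    One caveat, hedged: the stack trace goes through `org.apache.hadoop.util.Shell` and `java.lang.ProcessBuilder.start`, a pattern commonly seen on Windows when Hadoop's native helpers (winutils.exe / HADOOP_HOME) are missing, so if the paths are already correct, that environment setup is worth checking as well.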

    【Comments】:

    • I have already set the path up. The problem is the null pointer exception.
    • I still have the same problem