【Posted】: 2017-04-26 15:53:56
【Problem description】:
I would like to use the scalac (Scala compiler) bundled with the sparklyr Spark installation, found in the RStudio SparkUI tab (or via spark_web(sc)) under Environment >> /jars/scala-compiler-2.11.8.jar as "System Environment", rather than downloading and installing scalac separately into a base directory, as suggested in the "hello world" example found here and linked from RStudio's page on creating extensions: http://spark.rstudio.com/extensions.html
I am currently on Ubuntu but stuck on the error below. I set up a directory exactly like the GitHub repo used in the "hello world" example above. Any idea how to get past this error without installing scalac in one of the suggested base-path folders (i.e. /opt/scala, /opt/local/scala, /usr/local/scala, or ~/scala (Windows only))? I want to use the native sparklyr installation and relative paths for the given user.
library(titanic)
library(sparklyr)
# spark_web(sc) # Opens Web Console to find Scala Version and scalac
# Sets Working Directory to R folder of file
setwd(dirname(rstudioapi::getActiveDocumentContext()$path))
sparkVers <- "2.0.0"; scalaVers <- "2.11.8"; packageName <- "sparkhello"
packageJarExtR <- spark_compilation_spec(
  spark_version = sparkVers,
  spark_home = spark_home_dir(),
  # NOTE: scalac_path is invoked as an executable; this points at a jar instead
  scalac_path = paste0(spark_home_dir(), "/jars/scala-compiler-", scalaVers, ".jar"),
  scala_filter = NULL,
  jar_name = sprintf(paste0(getwd(), "/inst/java/", packageName, "-%s-%s.jar"),
                     sparkVers, scalaVers)
)
sparklyr::compile_package_jars(spec = packageJarExtR)
# Error: No root directory found. Test criterion:
# Contains a file 'DESCRIPTION' with contents matching '^Package: '
# In addition: Warning message:
# running command ''/mnt/home/eyeOfTheStorm/.cache/spark/
# spark-2.0.0-bin-hadoop2.7/jars/scala-compiler-2.11.8.jar'
# -version 2>&1' had status 126
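Shell exit status 126 means the file was found but could not be executed, which is expected here: sparklyr runs scalac_path as a program, and a .jar file is not directly executable. A minimal sketch of one possible workaround, assuming the jars directory from the warning above sits under $HOME/.cache/spark; the wrapper name scalac-wrapper.sh is hypothetical:

```shell
# scala.tools.nsc.Main is the compiler entry point inside scala-compiler.jar,
# so a small shell wrapper can stand in for the scalac binary.
cat > ./scalac-wrapper.sh <<'EOF'
#!/bin/sh
# Assumed location of the sparklyr-managed Spark distribution's jars.
JARS="$HOME/.cache/spark/spark-2.0.0-bin-hadoop2.7/jars"
exec java -cp "$JARS/scala-compiler-2.11.8.jar:$JARS/scala-library-2.11.8.jar:$JARS/scala-reflect-2.11.8.jar" \
  scala.tools.nsc.Main "$@"
EOF
chmod +x ./scalac-wrapper.sh
```

The wrapper's absolute path could then be passed as scalac_path in spark_compilation_spec() in place of the jar path, keeping everything inside the user's sparklyr installation.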
###
library(sparkhello)
# Connect to local spark cluster and load data
sc <- spark_connect(master = "local", version = "2.0.0")
titanic_tbl <- copy_to(sc, titanic_train, "titanic", overwrite = TRUE)
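The "No root directory found" error is separate from the status-126 warning: as the test criterion in the message states, compile_package_jars() searches upward from the working directory for a DESCRIPTION file whose contents match '^Package: ', so it fails when the working directory is not inside an R package. A minimal sketch of a DESCRIPTION that satisfies that check, assuming the package is named sparkhello as in the code above (Title and Version values are placeholders):

```shell
cat > DESCRIPTION <<'EOF'
Package: sparkhello
Type: Package
Title: Sparklyr Hello World Extension
Version: 0.1.0
EOF
```

With this file at the project root, the root-finding step should succeed regardless of where scalac comes from.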
【Discussion】:
Tags: r scala apache-spark rstudio sparklyr