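【Posted】:2016-05-14 18:24:59
【Question】:
This is a basic question, but I am trying to retrieve the contents of a file with the following Scala code in a Bluemix notebook from the Analytics for Apache Spark service, and I keep getting authentication errors. Does anyone have a Scala example of authenticating to access a file? Thanks in advance!
I tried the following simple script: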
val file = sc.textFile("swift://notebooks.keystone/kdd99.data")
file.take(1)
I also tried:
def setConfig(name:String) : Unit = {
  val pfx = "fs.swift.service." + name
  val conf = sc.getConf
  conf.set(pfx + "auth.url", "hardcoded")
  conf.set(pfx + "tenant", "hardcoded")
  conf.set(pfx + "username", "hardcoded")
  conf.set(pfx + "password", "hardcoded")
  conf.set(pfx + "apikey", "hardcoded")
  conf.set(pfx + "auth.endpoint.prefix", "endpoints")
}
setConfig("keystone")
I also tried this script from a previous question:
import scala.collection.breakOut
val name= "keystone"
val YOUR_DATASOURCE = """auth_url:https://identity.open.softlayer.com
project: hardcoded
project_id: hardcoded
region: hardcoded
user_id: hardcoded
domain_id: hardcoded
domain_name: hardcoded
username: hardcoded
password: hardcoded
filename: hardcoded
container: hardcoded
tenantId: hardcoded
"""
val settings:Map[String,String] = YOUR_DATASOURCE.split("\\n").
map(l=>(l.split(":",2)(0).trim(), l.split(":",2)(1).trim()))(breakOut)
val conf = sc.getConf
conf.set("fs.swift.service.keystone.auth.url", settings.getOrElse("auth_url", ""))
conf.set("fs.swift.service.keystone.tenant", settings.getOrElse("tenantId", ""))
conf.set("fs.swift.service.keystone.username", settings.getOrElse("username", ""))
conf.set("fs.swift.service.keystone.password", settings.getOrElse("password", ""))
conf.set("fs.swift.service.keystone.apikey", settings.getOrElse("password", ""))
conf.set("fs.swift.service.keystone.auth.endpoint.prefix", "endpoints")
println("sett: "+ settings.getOrElse("auth_url",""))
val file = sc.textFile("swift://notebooks.keystone/kdd99.data")
/* The following line gives errors */
file.take(1)
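The error is as follows:
Name: org.apache.hadoop.fs.swift.exceptions.SwiftConfigurationException Message: Missing mandatory configuration option: fs.swift.service.keystone.auth.url
Edit
This is a good alternative for Python. I tried the following, using "spark" as the config name for two different files: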
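For reference, the key named in the error can be compared with the key that the setConfig helper from the second attempt builds. The following is a minimal sketch that needs no Spark; the object and method names are hypothetical and chosen only for illustration. It suggests that helper leaves out the dot between the service name and the option name, so it would not produce the key the error asks for. This does not explain the third script, which spells the keys out in full.

```scala
object SwiftKeyCheck {
  // Key as the setConfig helper concatenates it: prefix + name + option,
  // with no dot after the service name.
  def keyWithoutDot(service: String, option: String): String =
    "fs.swift.service." + service + option

  // Key with a dot separating the service name from the option name,
  // which is the shape shown in the error message.
  def keyWithDot(service: String, option: String): String =
    "fs.swift.service." + service + "." + option

  def main(args: Array[String]): Unit = {
    println(keyWithoutDot("keystone", "auth.url")) // fs.swift.service.keystoneauth.url
    println(keyWithDot("keystone", "auth.url"))    // fs.swift.service.keystone.auth.url
  }
}
```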
def set_hadoop_config(credentials):
    prefix = "fs.swift.service." + credentials['name']
    hconf = sc._jsc.hadoopConfiguration()
    hconf.set(prefix + ".auth.url", credentials['auth_url']+'/v3/auth/tokens')
    hconf.set(prefix + ".auth.endpoint.prefix", "endpoints")
    hconf.set(prefix + ".tenant", credentials['project_id'])
    hconf.set(prefix + ".username", credentials['user_id'])
    hconf.set(prefix + ".password", credentials['password'])
    hconf.setInt(prefix + ".http.port", 8080)
    hconf.set(prefix + ".region", credentials['region'])
    hconf.setBoolean(prefix + ".public", True)
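The Python helper above writes to the Hadoop configuration, whereas the Scala attempts write to the value returned by sc.getConf, which is a copy of the SparkConf, so changes made to it are not seen by the Swift driver. A Scala counterpart might look roughly like the sketch below. It is an untested assumption that this resolves the error on Bluemix; the SwiftCredentials case class, the object name, and the setter-based applyTo helper are hypothetical and exist only so the key/value logic can be checked without a Spark context.

```scala
object SwiftHadoopConfig {
  // Hypothetical holder for the credential fields the Python helper reads.
  final case class SwiftCredentials(
    name: String,
    authUrl: String,
    projectId: String,
    userId: String,
    password: String,
    region: String)

  // Builds the same key/value pairs as the Python helper.
  // Hadoop stores ints and booleans as strings, so plain strings are used here.
  def settings(c: SwiftCredentials): Seq[(String, String)] = {
    val prefix = "fs.swift.service." + c.name
    Seq(
      (prefix + ".auth.url", c.authUrl + "/v3/auth/tokens"),
      (prefix + ".auth.endpoint.prefix", "endpoints"),
      (prefix + ".tenant", c.projectId),
      (prefix + ".username", c.userId),
      (prefix + ".password", c.password),
      (prefix + ".http.port", "8080"),
      (prefix + ".region", c.region),
      (prefix + ".public", "true")
    )
  }

  // Applies every pair through the given setter. In a notebook where `sc`
  // is the SparkContext, the setter would be:
  //   (k, v) => sc.hadoopConfiguration.set(k, v)
  def applyTo(c: SwiftCredentials)(set: (String, String) => Unit): Unit =
    settings(c).foreach { case (k, v) => set(k, v) }
}
```

With the configuration name "spark", the file would then presumably be read from a URL of the form swift://notebooks.spark/kdd99.data, since the part after the container name has to match the configuration name.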
【Discussion】:
Tags: scala apache-spark ibm-cloud