So in the end I got everything working with Helm, the spark-on-k8s-operator, and sbt-docker.
First, I extracted some configuration into variables in build.sbt so that both the assembly and the Docker build definitions can use them.
// define some dependencies that should not be compiled, but copied into docker
val externalDependencies = Seq(
  "org.postgresql" % "postgresql" % postgresVersion,
  "io.prometheus.jmx" % "jmx_prometheus_javaagent" % jmxPrometheusVersion
)

// Settings
val team = "hazelnut"
val importerDescription = "..."
val importerMainClass = "..."
val targetDockerJarPath = "/opt/spark/jars"

val externalPaths = externalDependencies.map(module => {
  val parts = module.toString().split(""":""")
  val orgDir = parts(0).replaceAll("""\.""", """/""")
  val moduleName = parts(1).replaceAll("""\.""", """/""")
  val version = parts(2)
  val jarFile = moduleName + "-" + version + ".jar"
  (orgDir, moduleName, version, jarFile)
})
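To make the parsing concrete: sbt's ModuleID.toString renders as organization:name:revision, which is what the split on ":" relies on. A worked illustration (the version number below is assumed for the example only, not taken from this build):

// For "org.postgresql" % "postgresql" % "42.2.5" (version assumed for illustration),
// module.toString() yields "org.postgresql:postgresql:42.2.5", so externalPaths contains
//   ("org/postgresql", "postgresql", "42.2.5", "postgresql-42.2.5.jar")
// which later becomes the Maven Central URL
//   https://repo1.maven.org/maven2/org/postgresql/postgresql/42.2.5/postgresql-42.2.5.jar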
Next, I define the assembly settings for building the fat JAR (this part can be whatever your project needs):
lazy val assemblySettings = Seq(
  // Assembly options
  assembly / assemblyOption := (assembly / assemblyOption).value.copy(includeScala = false),
  assembly / assemblyMergeStrategy := {
    case PathList("reference.conf") => MergeStrategy.concat
    case PathList("META-INF", _ @ _*) => MergeStrategy.discard
    case "log4j.properties" => MergeStrategy.concat
    case _ => MergeStrategy.first
  },
  assembly / logLevel := sbt.util.Level.Error,
  assembly / test := {},
  pomIncludeRepository := { _ => false }
)
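A side note: includeScala = false works because the Spark base image used below already ships a Scala distribution. In the same spirit, the Spark artifacts themselves are normally kept out of the fat JAR; a minimal sketch of what that can look like (these dependency declarations are my assumption and are not shown in the original build):

// Spark is provided by the base image at runtime, so exclude it from the fat JAR
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion % Provided,
  "org.apache.spark" %% "spark-sql"  % sparkVersion % Provided
)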
Then the Docker settings:
lazy val dockerSettings = Seq(
  imageNames in docker := Seq(
    ImageName(s"$team/${name.value}:latest"),
    ImageName(s"$team/${name.value}:${version.value}")
  ),
  dockerfile in docker := {
    // The assembly task generates a fat JAR file
    val artifact: File = assembly.value
    val artifactTargetPath = s"$targetDockerJarPath/$team-${name.value}.jar"

    externalPaths.map {
      case (extOrgDir, extModuleName, extVersion, jarFile) =>
        val url = List("https://repo1.maven.org/maven2", extOrgDir, extModuleName, extVersion, jarFile).mkString("/")
        val target = s"$targetDockerJarPath/$jarFile"
        Instructions.Run.exec(List("curl", url, "--output", target, "--silent"))
    }
      .foldLeft(new Dockerfile {
        // https://hub.docker.com/r/lightbend/spark/tags
        from(s"lightbend/spark:${openShiftVersion}-OpenShift-${sparkVersion}-ubuntu-${scalaBaseVersion}")
      }) {
        case (df, run) => df.addInstruction(run)
      }
      .add(artifact, artifactTargetPath)
  }
)
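sbt-docker also exposes a buildOptions key for controlling how the image is built, e.g. disabling the layer cache or always pulling the base image. This is optional and not part of the original settings; a sketch of what it could look like:

// Optional: stricter build behaviour, e.g. for CI-style image builds
buildOptions in docker := BuildOptions(
  cache = false,
  removeIntermediateContainers = BuildOptions.Remove.Always,
  pullBaseImage = BuildOptions.Pull.Always
)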
I also created some tasks to generate the Helm chart and values files:
lazy val createImporterHelmChart: Def.Initialize[Task[Seq[File]]] = Def.task {
  val chartFile = baseDirectory.value / "../helm" / "Chart.yaml"
  val valuesFile = baseDirectory.value / "../helm" / "values.yaml"

  val jarDependencies = externalPaths.map {
    case (_, extModuleName, _, jarFile) =>
      extModuleName -> s""""local://$targetDockerJarPath/$jarFile""""
  }.toMap

  val chartContents =
    s"""# Generated by build.sbt. Please don't manually update
       |apiVersion: v1
       |name: $team-${name.value}
       |version: ${version.value}
       |description: $importerDescription
       |""".stripMargin

  val valuesContents =
    s"""# Generated by build.sbt. Please don't manually update
       |version: ${version.value}
       |sparkVersion: $sparkVersion
       |image: $team/${name.value}:${version.value}
       |jar: local://$targetDockerJarPath/$team-${name.value}.jar
       |mainClass: $importerMainClass
       |jarDependencies: [${jarDependencies.values.mkString(", ")}]
       |fileDependencies: []
       |jmxExporterJar: ${jarDependencies.getOrElse("jmx_prometheus_javaagent", "null").replace("local://", "")}
       |""".stripMargin

  IO.write(chartFile, chartContents)
  IO.write(valuesFile, valuesContents)
  Seq(chartFile, valuesFile)
}
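Because this task is only wired in as a resource generator further down, it runs as a side effect of compilation. If you want to regenerate the chart on demand from the sbt shell, you could additionally expose it through its own task key (the name generateImporterHelmChart is hypothetical, not something from the original build):

lazy val generateImporterHelmChart = taskKey[Seq[File]]("Regenerates the importer's Chart.yaml and values.yaml")

// then, inside the importer project's .settings(...):
// generateImporterHelmChart := createImporterHelmChart.value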
Finally, it all comes together in a project definition in build.sbt:
lazy val importer = (project in file("importer"))
  .enablePlugins(JavaAppPackaging)
  .enablePlugins(sbtdocker.DockerPlugin)
  .enablePlugins(AshScriptPlugin)
  .dependsOn(util)
  .settings(
    commonSettings,
    testSettings,
    assemblySettings,
    dockerSettings,
    scalafmtSettings,
    name := "etl-importer",
    Compile / mainClass := Some(importerMainClass),
    Compile / resourceGenerators += createImporterHelmChart.taskValue
  )
On top of that come per-environment values files and a Helm template:
apiVersion: sparkoperator.k8s.io/v1beta1
kind: SparkApplication
metadata:
  name: {{ .Chart.Name | trunc 64 }}
  labels:
    name: {{ .Chart.Name | trunc 63 | quote }}
    release: {{ .Release.Name | trunc 63 | quote }}
    revision: {{ .Release.Revision | quote }}
    sparkVersion: {{ .Values.sparkVersion | quote }}
    version: {{ .Chart.Version | quote }}
spec:
  type: Scala
  mode: cluster
  image: {{ .Values.image | quote }}
  imagePullPolicy: {{ .Values.imagePullPolicy }}
  mainClass: {{ .Values.mainClass | quote }}
  mainApplicationFile: {{ .Values.jar | quote }}
  sparkVersion: {{ .Values.sparkVersion | quote }}
  restartPolicy:
    type: Never
  deps:
    {{- if .Values.jarDependencies }}
    jars:
      {{- range .Values.jarDependencies }}
      - {{ . | quote }}
      {{- end }}
    {{- end }}
...
I can now build the packages with

sbt [project name]/docker

and deploy them with

helm install ./helm -f ./helm/values-minikube.yaml --namespace=[ns] --name [name]
It could certainly be made prettier, but for now it works like a charm.