【Posted】: 2017-06-03 17:31:52
【Problem Description】:
I am trying to run a simple Scala script with Spark, as described in the Spark Quick Start Tutorial. I had no trouble running the following Python code:
"""SimpleApp.py"""
from pyspark import SparkContext
logFile = "tmp.txt" # Should be some file on your system
sc = SparkContext("local", "Simple App")
logData = sc.textFile(logFile).cache()
numAs = logData.filter(lambda s: 'a' in s).count()
numBs = logData.filter(lambda s: 'b' in s).count()
print "Lines with a: %i, lines with b: %i" % (numAs, numBs)
I run this code with the following command:
/home/aaa/spark/spark-2.1.0-bin-hadoop2.7/bin/spark-submit hello_world.py
However, when I try to do the same thing in Scala, I run into trouble. In more detail, the code I am trying to run is:
/* SimpleApp.scala */
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "tmp.txt" // Should be some file on your system
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}
I try to run it with:
/home/aaa/spark/spark-2.1.0-bin-hadoop2.7/bin/spark-submit hello_world.scala
As a result, I get the following error message:
Error: Cannot load main class from JAR file
Does anyone know what I am doing wrong?
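For context: unlike a Python script, a Scala application cannot be handed to spark-submit as raw source; spark-submit expects a compiled, packaged JAR. A minimal sketch of the packaging workflow from the same Quick Start tutorial, assuming sbt is installed and the source is saved under src/main/scala/SimpleApp.scala (the layout and version numbers below are assumptions matching Spark 2.1.0's default Scala 2.11 build):

// build.sbt -- minimal build definition for the application above
name := "Simple Project"
version := "1.0"
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.1.0"

# from the project root: package the app, then submit the resulting JAR by class name
sbt package
/home/aaa/spark/spark-2.1.0-bin-hadoop2.7/bin/spark-submit \
  --class "SimpleApp" \
  --master "local[4]" \
  target/scala-2.11/simple-project_2.11-1.0.jar

The --class flag names the object whose main method should run; submitting a bare .scala file gives spark-submit neither a JAR nor a main class, which is exactly what the error message says.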
【Comments】:
-
In the terminal (command line), navigate to the folder where the scala file is located. Then run "scalac YourClassName.scala". Finally, execute it by running the command "scala YourClassName". By the way, you need to have Scala installed before these steps :)
-
The from JAR file in the message is correct, so that is what you are doing wrong
-
@AlexFruzenshtein, if I run scalac hello_world.scala, I get the error message error: object apache is not a member of package org
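That compile error means the Spark jars are not on scalac's classpath, so the compiler cannot find the org.apache.spark package. A hedged sketch of compiling against the jars bundled with the distribution from the question (the wildcard classpath is an assumption about your scalac version; if it is not expanded, list the jars explicitly):

# compile against Spark's bundled jars (path taken from the question above)
scalac -classpath "/home/aaa/spark/spark-2.1.0-bin-hadoop2.7/jars/*" hello_world.scala

Even if this compiles, running the class with plain scala would need the same classpath plus a master URL, so packaging with sbt and submitting through spark-submit, as sketched above, is the more usual route.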
Tags: scala apache-spark