[Posted]: 2019-04-28 06:16:51
[Problem Description]:
I need to write some unit and integration tests for a small research project. I'm working with a simple Spark application that reads data from a file and prints the number of characters in each line. I'm writing the unit tests with ScalaTest, but I can't come up with integration tests for this project. According to the project workflow, I need to run the unit tests, package a jar file, and then execute the integration tests using that jar. I have a file with data as a test resource. Should I package this file with the source code, or should I put it in a separate location? What kinds of integration tests can I write for this application?
The simple Spark application looks like this:
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object SparkExample {

  def readFile(sparkContext: SparkContext, fileName: String): RDD[String] =
    sparkContext.textFile(fileName)

  def mapStringToLength(data: RDD[String]): RDD[Int] =
    data.map(fileData => fileData.length)

  def printIntFileData(data: RDD[Int]): Unit =
    data.foreach(fileString => println(fileString.toString))

  def printFileData(data: RDD[String]): Unit =
    data.foreach(fileString => println(fileString))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder
      .master("local[*]")
      .appName("TestApp")
      .getOrCreate()

    val dataFromFile = readFile(spark.sparkContext, args(0))
    println("\nAll the data:")
    val dataToInt = mapStringToLength(dataFromFile)
    printFileData(dataFromFile)
    printIntFileData(dataToInt)

    spark.stop()
  }
}
The unit tests I wrote:
import org.apache.hadoop.mapred.InvalidInputException
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession
import org.scalatest.{BeforeAndAfter, FunSuite, Matchers}

class SparkExampleTest extends FunSuite with BeforeAndAfter with Matchers {

  val master = "local"
  val appName = "TestApp"
  var sparkContext: SparkContext = _
  val fileContent = "This is the text only for the test purposes. There is no sense in it completely. This is the test of the Spark Application"
  val fileName = "src/test/resources/test_data.txt"
  val noPathFileName = "test_data.txt"
  val errorFileName = "test_data1.txt"

  before {
    val sparkSession = SparkSession
      .builder
      .master(master)
      .appName(appName)
      .getOrCreate()
    sparkContext = sparkSession.sparkContext
  }

  test("SparkExample.readFile") {
    assert(SparkExample.readFile(sparkContext, fileName).collect() sameElements Array(fileContent))
  }

  test("SparkExample.mapStringToLength") {
    val stringLength = fileContent.length
    val rdd = sparkContext.makeRDD(Array(fileContent))
    assert(SparkExample.mapStringToLength(rdd).collect() sameElements Array(stringLength))
  }

  test("SparkExample.mapStringToLength Negative") {
    val stringLength = fileContent.length
    val rdd = sparkContext.makeRDD(Array(fileContent + " "))
    // != on arrays compares references and is always true here; compare contents instead.
    assert(!(SparkExample.mapStringToLength(rdd).collect() sameElements Array(stringLength)))
  }

  test("SparkExample.readFile does not throw Exception") {
    noException should be thrownBy SparkExample.readFile(sparkContext, fileName).collect()
  }

  test("SparkExample.readFile throws InvalidInputException without filePath") {
    an[InvalidInputException] should be thrownBy SparkExample.readFile(sparkContext, noPathFileName).collect()
  }

  test("SparkExample.readFile throws InvalidInputException with wrong filename") {
    an[InvalidInputException] should be thrownBy SparkExample.readFile(sparkContext, errorFileName).collect()
  }
}
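To make the question concrete, the workflow above (run unit tests, package a jar, then test the jar) suggests an end-to-end check that invokes `spark-submit` on the packaged artifact and inspects its output. The sketch below is one idea I'm considering, not working code: the jar path, data path, and the `IntegrationTestRunner` name are placeholders for my build layout, and it assumes `spark-submit` is on the `PATH`.

```scala
import scala.sys.process._

object IntegrationTestRunner {
  // Hypothetical paths -- adjust to the actual build layout.
  val jarPath  = "target/scala-2.11/spark-example.jar"
  val dataPath = "src/it/resources/test_data.txt"

  // The app prints the length of each line, so the expected
  // output can be derived directly from the file content.
  def expectedLengths(fileContent: String): Seq[Int] =
    fileContent.split("\n").map(_.length).toSeq

  def main(args: Array[String]): Unit = {
    // Run the packaged jar exactly as a user would.
    val output = Seq("spark-submit", "--master", "local[*]", jarPath, dataPath).!!
    val expected = expectedLengths(scala.io.Source.fromFile(dataPath).mkString)
    // Every expected line length should appear somewhere in stdout.
    assert(expected.forall(len => output.contains(len.toString)))
    println("Integration test passed")
  }
}
```

The point of testing through `spark-submit` rather than an in-JVM `SparkSession` is that it exercises the packaged jar, the entry point, and argument handling together, which is exactly what the unit tests cannot cover.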
Tags: scala apache-spark integration-testing scalatest