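【Posted】: 2020-02-28 17:27:51
【Question】:
My code is below. I am using a Spark UDF to add a new column named "IssueDate" to an existing DataFrame, but I get a NullPointerException. Any suggestions on how to resolve this would be appreciated.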
class IssueDateDateHandler(var masterDF) extends Serializable {
  val getIssueDate:(String)=> Option[String] = {(Id) =>
    Option(Id) match {
      case Some(Id) => {
        val matchingIdDF = masterDF.where(col("Id") === Id)
        val issueDt = matchingIdDF.select("IssueDate").head().mkString
        Option(issueDt)
      }
      case _ => Some("")
    }
  }
  val issueDate = udf[Option[String], String](getIssueDate)
  def addIssueDate(transformedDFs: MutableList[DataFrame]): MutableList[DataFrame] = {
    for (tmpDF <- transformedDFs) {
      val df = tmpDF.withColumn("IssueDate", issueDate(col("Id")))
    }
  }
}
【Discussion】:
-
The exception is:
Caused by: java.lang.NullPointerException
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:182)
at org.apache.spark.sql.Dataset$.apply(Dataset.scala:64)
at org.apache.spark.sql.Dataset.withTypedPlan(Dataset.scala:3411)
at org.apache.spark.sql.Dataset.filter(Dataset.scala:1484)
at org.apache.spark.sql.Dataset.where(Dataset.scala:1512)
at com.jetblue.revenueingest.transformations.IssueDateHandler$$anonfun$2.apply(IssueDateHandler.scala:28)
at com.jetblue.revenueingest.transformations.IssueDateHandler$$anonfun$2.apply(IssueDateHandler.scala:25)
... 21 more
-
Please edit the question instead of adding a comment.
-
Please add the error log.
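-
The stack trace shows `Dataset.where` being called from inside the UDF's anonymous function. A likely cause is that `masterDF` is referenced inside the UDF: UDF bodies run on executors, where a DataFrame and its session are generally not usable. The following is a minimal sketch, not a definitive fix, of doing the same lookup with a left join instead. It assumes `masterDF` has `Id` and `IssueDate` columns with at most one `IssueDate` per `Id`, and that the transformed DataFrames do not already contain an `IssueDate` column. The object name `IssueDateJoinSketch`, the `addIssueDate` signature, and the sample rows are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{coalesce, col, lit}

object IssueDateJoinSketch {

  // Hypothetical helper: adds "IssueDate" by joining on "Id"
  // rather than querying masterDF from inside a UDF.
  def addIssueDate(masterDF: DataFrame, transformedDFs: Seq[DataFrame]): Seq[DataFrame] = {
    // Keep only the lookup columns; assumes one IssueDate per Id.
    val lookup = masterDF
      .select(col("Id"), col("IssueDate").cast("string").as("IssueDate"))
      .dropDuplicates("Id")

    transformedDFs.map { df =>
      // Left join keeps every input row; rows with a null or unmatched Id
      // get an empty string, mirroring the Some("") fallback in the question.
      df.join(lookup, Seq("Id"), "left")
        .withColumn("IssueDate", coalesce(col("IssueDate"), lit("")))
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("issue-date-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data for illustration only.
    val masterDF = Seq(("A1", "2020-01-15"), ("B2", "2020-02-01")).toDF("Id", "IssueDate")
    val inputs = Seq(Seq("A1", "C3").toDF("Id"))

    addIssueDate(masterDF, inputs).foreach(_.show())
    spark.stop()
  }
}
```

Unlike the loop in the question, this version returns the new DataFrames, since `withColumn` and `join` produce new DataFrames rather than modifying the inputs in place. If `masterDF` is small, wrapping `lookup` in `broadcast(...)` from `org.apache.spark.sql.functions` may avoid a shuffle.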
Tags: scala apache-spark apache-spark-sql user-defined-functions