【Posted on】: 2019-09-13 16:32:02
【Problem description】:
Why does:
import spark.implicits._
import org.apache.spark.sql.functions._
val content = Seq(("2019", "09", "11","17","16","54","762000000")).toDF("year", "month", "day", "hour", "minute", "second", "nano")
content.printSchema
content.show
content.withColumn("event_time_utc", to_timestamp(concat('year, 'month, 'day, 'hour, 'minute, 'second), "yyyyMMddHHmmss"))
.withColumn("event_time_utc_millis", to_timestamp(concat('year, 'month, 'day, 'hour, 'minute, 'second, substring('nano, 0, 3)), "yyyyMMddHHmmssSSS"))
.select('year, 'month, 'day, 'hour, 'minute, 'second, 'nano,substring('nano, 0, 3), 'event_time_utc, 'event_time_utc_millis)
.show
drop the milliseconds?
+----+-----+---+----+------+------+---------+---------------------+-------------------+---------------------+
|year|month|day|hour|minute|second| nano|substring(nano, 0, 3)| event_time_utc|event_time_utc_millis|
+----+-----+---+----+------+------+---------+---------------------+-------------------+---------------------+
|2019| 09| 11| 17| 16| 54|762000000| 762|2019-09-11 17:16:54| 2019-09-11 17:16:54|
+----+-----+---+----+------+------+---------+---------------------+-------------------+---------------------+
The format string is yyyyMMddHHmmssSSS, which, if I'm not mistaken, should include the milliseconds via SSS.
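For what it's worth, in Spark 2.x `to_timestamp` is backed by `unix_timestamp`, which only has second precision: the underlying `SimpleDateFormat` accepts the `SSS` pattern while parsing, but the result is truncated to whole seconds before it becomes a `TimestampType`. A common workaround (a sketch, assuming the same `content` DataFrame as above) is to parse the whole-second part and add the millisecond fraction back manually before casting to timestamp:

```scala
import org.apache.spark.sql.functions._

// unix_timestamp yields whole seconds (bigint); add the millisecond part
// back as a fractional value, then cast the double to a timestamp.
// Note: substring in Spark SQL is 1-based.
val withMillis = content.withColumn(
  "event_time_utc_millis",
  (unix_timestamp(
      concat('year, 'month, 'day, 'hour, 'minute, 'second),
      "yyyyMMddHHmmss"
    ).cast("double") + substring('nano, 1, 3).cast("double") / 1000
  ).cast("timestamp")
)
```

From Spark 3.0 onward, `to_timestamp` is based on `java.time.DateTimeFormatter` and, as far as I know, keeps the fractional seconds, so the original code should behave as expected there.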
【Comments】:
-
What is your Spark version?
-
2.2.2 is my Spark version.
-
OK, yes, in Spark
Tags: apache-spark apache-spark-sql timestamp milliseconds format-string