【Question Title】: PySpark implementation of DATEADD
【Posted】: 2019-02-22 11:17:30
【Question】:

My T-SQL code looks like this:

cast(dateadd(minute, -240, tmp_view_tos_lenelgate_qry11.eventdate) as date)

How can I implement the DATEADD functionality in PySpark?

【Comments】:

    Tags: tsql apache-spark pyspark pyspark-sql


    【Solution 1】:
    # Create the DataFrame
    from pyspark.sql.functions import col, unix_timestamp

    df = spark.createDataFrame([('2014-02-13 12:36:52.721',),('2018-01-01 00:30:50.001',)], ['eventdate'])
    df = df.withColumn('eventdate', col('eventdate').cast('timestamp'))
    df.show(truncate=False)
    +-----------------------+
    |eventdate              |
    +-----------------------+
    |2014-02-13 12:36:52.721|
    |2018-01-01 00:30:50.001|
    +-----------------------+
    df.printSchema()
    root
     |-- eventdate: timestamp (nullable = true)
    
    # Subtract 240 minutes (240*60 = 14400 seconds) from 'eventdate'
    from pyspark.sql.functions import unix_timestamp
    df = df.withColumn('eventdate_new', (unix_timestamp('eventdate') - 240*60).cast('timestamp'))
    df.show(truncate=False)
    +-----------------------+-------------------+
    |eventdate              |eventdate_new      |
    +-----------------------+-------------------+
    |2014-02-13 12:36:52.721|2014-02-13 08:36:52|
    |2018-01-01 00:30:50.001|2017-12-31 20:30:50|
    +-----------------------+-------------------+
    
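    An alternative sketch (assuming Spark 2.x or later with a `SparkSession` bound to `spark`): the same shift can be written with a Spark SQL interval literal via `expr`, and a final cast to `date` mirrors the T-SQL `cast(dateadd(minute, -240, eventdate) as date)`. Note the `unix_timestamp` approach above truncates fractional seconds; the interval approach preserves them until the cast.

    ```python
    from pyspark.sql.functions import col, expr

    df = spark.createDataFrame(
        [('2014-02-13 12:36:52.721',), ('2018-01-01 00:30:50.001',)],
        ['eventdate'],
    ).withColumn('eventdate', col('eventdate').cast('timestamp'))

    # Shift back 240 minutes with an interval expression, then cast the
    # result to date, matching the original T-SQL cast(... as date).
    df = df.withColumn(
        'event_date',
        expr('eventdate - INTERVAL 240 MINUTES').cast('date'),
    )
    df.show(truncate=False)
    ```

    With the sample rows above, `2018-01-01 00:30:50.001` minus 240 minutes lands on the previous day, so its `event_date` is `2017-12-31`.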

    【Discussion】:
