【Question title】: Hive Date String Validation
【Posted】: 2017-10-06 16:22:19
【Question】:

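I am trying to check whether a string is a valid date in the format "YYYYMMDD".

I am using the technique below, but invalid date strings are reported as valid dates.

What am I doing wrong?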

SELECT '20019999', CASE WHEN unix_timestamp('20019999','YYYYMMDD') > 0 THEN 'Good' ELSE 'Bad' END;

【Comments】:

    Tags: hive hql unix-timestamp


    【Answer 1】:

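    First, you are using the wrong format pattern. Pattern letters are case-sensitive: 'yyyy' is the calendar year and 'dd' is the day of the month, whereas 'YYYY' is the week year and 'DD' is the day of the year.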

    select  from_unixtime(unix_timestamp())                 as default_format
           ,from_unixtime(unix_timestamp(),'YYYY-MM-DD')    as wrong_format
           ,from_unixtime(unix_timestamp(),'yyyy-MM-dd')    as right_format
    ;
    

    +----------------------+---------------+---------------+
    |    default_format    | wrong_format  | right_format  |
    +----------------------+---------------+---------------+
    | 2017-10-07 04:13:26  | 2017-10-280   | 2017-10-07    |
    +----------------------+---------------+---------------+
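
    The stray "280" in the wrong_format column is the day of the year. As a minimal sketch (the literal date is taken from the output above; the column aliases are made up for illustration), the two pattern letters can be compared side by side:

    ```sql
    -- 'D' formats the day of the year, 'dd' the two-digit day of the month.
    -- 2017-10-07 is the 280th day of 2017, so day_of_year should come back as 280.
    select  from_unixtime(unix_timestamp('2017-10-07','yyyy-MM-dd'),'D')   as day_of_year
           ,from_unixtime(unix_timestamp('2017-10-07','yyyy-MM-dd'),'dd')  as day_of_month
    ;
    ```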
    

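    Second, the ranges of the date parts are not validated.
    If you increase the day part by 1 past the end of the month, the result simply rolls forward to the next day.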

    with t as (select stack(7,'27','28','29','30','31','32','33') as dy)
    select  t.dy
           ,from_unixtime(unix_timestamp(concat('2017-02-',t.dy),'yyyy-MM-dd'),'yyyy-MM-dd') as dt
    
    from    t
    ;
    

    +-----+-------------+
    | dy  |     dt      |
    +-----+-------------+
    | 27  | 2017-02-27  |
    | 28  | 2017-02-28  |
    | 29  | 2017-03-01  |
    | 30  | 2017-03-02  |
    | 31  | 2017-03-03  |
    | 32  | 2017-03-04  |
    | 33  | 2017-03-05  |
    +-----+-------------+
    

    If you increase the month part by 1 past December, the result rolls forward to the next month.

    with t as (select stack(5,'10','11','12','13','14') as mn)
    select  t.mn
           ,from_unixtime(unix_timestamp(concat('2017-',t.mn,'-01'),'yyyy-MM-dd'),'yyyy-MM-dd') as dt
    
    from    t
    ;
    

    +-----+-------------+
    | mn  |     dt      |
    +-----+-------------+
    | 10  | 2017-10-01  |
    | 11  | 2017-11-01  |
    | 12  | 2017-12-01  |
    | 13  | 2018-01-01  |
    | 14  | 2018-02-01  |
    +-----+-------------+
    

    Even with CAST, only the range of each part is validated, not the date as a whole.

    select cast('2010-02-32' as date);
    

    +-------+
    |  _c0  |
    +-------+
    | NULL  |
    +-------+
    

    select cast('2010-02-29' as date);
    

    +-------------+
    |     _c0     |
    +-------------+
    | 2010-03-01  |
    +-------------+
    

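    Here is how to achieve your goal: convert the string to a timestamp and back with the same pattern, and accept it only if the result equals the input.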

    with t as (select '20019999' as dt)
    select  dt  
           ,from_unixtime(unix_timestamp(dt,'yyyyMMdd'),'yyyyMMdd') as double_converted_dt    
    
           ,case 
                when from_unixtime(unix_timestamp(dt,'yyyyMMdd'),'yyyyMMdd')  = dt 
                then 'Good' 
                else 'Bad' 
            end             as dt_status
    
    from    t
    ;
    

    +-----------+----------------------+------------+
    |    dt     | double_converted_dt  | dt_status  |
    +-----------+----------------------+------------+
    | 20019999  | 20090607             | Bad        |
    +-----------+----------------------+------------+
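
    As a sketch of the same round-trip check applied to several inputs at once (the sample values other than '20019999' are illustrative and not from the question):

    ```sql
    -- A value is 'Good' only if parsing and re-formatting it gives back the same string.
    -- Rolled-over dates come back changed; a NULL from unix_timestamp also falls to 'Bad'.
    with t as (select stack(4,'20170228','20170229','20171301','20019999') as dt)
    select  t.dt
           ,case
                when from_unixtime(unix_timestamp(t.dt,'yyyyMMdd'),'yyyyMMdd') = t.dt
                then 'Good'
                else 'Bad'
            end             as dt_status

    from    t
    ;
    ```

    Only '20170228' should survive the round trip unchanged. '20170229' and '20171301' would roll over to March 2017 and January 2018 respectively, as in the tables above, so they no longer match their input.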
    

    【Comments】:

    • @jenesaisquoi, awesome :-)