【Question Title】: How to group by week in Cloudera Impala
【Posted】: 2014-09-01 03:51:00
【Question】:

How can I group Impala query results by week? The data looks like this:

    userguid                 eventtime
0   66AB1405446C74F2992016E5 2014-08-01T16:43:05Z
1   66AB1405446C74F2992016E5 2014-08-02T20:12:12Z
2   4097483F53AB3C170A490D44 2014-08-03T18:08:50Z
3   4097483F53AB3C170A490D44 2014-08-04T18:10:08Z
4   4097483F53AB3C170A490D44 2014-08-05T18:14:51Z
5   4097483F53AB3C170A490D44 2014-08-06T18:15:29Z
6   4097483F53AB3C170A490D44 2014-08-07T18:17:15Z
7   4097483F53AB3C170A490D44 2014-08-08T18:18:09Z
8   4097483F53AB3C170A490D44 2014-08-09T18:18:18Z
9   4097483F53AB3C170A490D44 2014-08-10T18:23:30Z

The expected result is:

date                    count of different userguid
2014-08-01~2014-08-07   40
2014-08-08~2014-08-15   20
2014-08-16~2014-08-23   10

Thanks.

【Comments】:

    Tags: cloudera impala


    【Solution 1】:

    If eventtime is stored as a TIMESTAMP:

    SELECT TRUNC(eventtime, "D"), COUNT(DISTINCT userguid)
    FROM your_table
    GROUP BY TRUNC(eventtime, "D")
    ORDER BY TRUNC(eventtime, "D");
    

    Otherwise, if eventtime is stored as a STRING:

    SELECT TRUNC(CAST(eventtime AS TIMESTAMP), "D"), COUNT(DISTINCT userguid)
    FROM your_table
    GROUP BY TRUNC(CAST(eventtime AS TIMESTAMP), "D")
    ORDER BY TRUNC(CAST(eventtime AS TIMESTAMP), "D");
    

    For more information on the TRUNC function, see the Cloudera Impala documentation on Date and Time Functions.
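The question asks for a start~end label per week rather than a bare week-start timestamp. A minimal sketch of one way to build that label follows. It assumes the string form of eventtime casts cleanly to TIMESTAMP (as the answer above already assumes) and reuses the placeholder table name your_table; the aliases week_start, week_range and distinct_users are made up for illustration.

```sql
-- Sketch only: label each TRUNC(..., 'D') week as "start~end".
-- your_table and all column aliases are placeholders, not names from a real schema.
SELECT
  CONCAT(
    TO_DATE(week_start),                 -- first day of the week as yyyy-MM-dd
    '~',
    TO_DATE(DAYS_ADD(week_start, 6))     -- last day of the same week
  ) AS week_range,
  distinct_users
FROM (
  SELECT
    TRUNC(CAST(eventtime AS TIMESTAMP), 'D') AS week_start,
    COUNT(DISTINCT userguid)                 AS distinct_users
  FROM your_table
  GROUP BY TRUNC(CAST(eventtime AS TIMESTAMP), 'D')
) weekly
ORDER BY week_range;                     -- ISO-style date strings sort chronologically
```

The aggregation is done in the inner query and the formatting in the outer one, so the grouping expression stays identical to the one in the answer.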

    【Comments】:

    • Could you explain the answer? Also, how would I group by weeks that run Sunday through Saturday?
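One possible way to get Sunday-through-Saturday weeks is to shift each timestamp forward by one day before truncating, then shift the result back by one day. This is a sketch, not a tested recipe: it assumes TRUNC(ts, 'D') returns the Monday of the week, and the alias week_start_sunday is hypothetical.

```sql
-- Sketch only: Sunday-based week start, assuming TRUNC(ts, 'D') yields a Monday.
-- Adding one day moves a Sunday into the following Monday-based week;
-- subtracting one day afterwards turns that Monday into the Sunday before it.
SELECT
  DAYS_SUB(TRUNC(DAYS_ADD(CAST(eventtime AS TIMESTAMP), 1), 'D'), 1) AS week_start_sunday,
  COUNT(DISTINCT userguid)                                           AS distinct_users
FROM your_table
GROUP BY DAYS_SUB(TRUNC(DAYS_ADD(CAST(eventtime AS TIMESTAMP), 1), 'D'), 1)
ORDER BY week_start_sunday;
```

If eventtime is already a TIMESTAMP column, the CAST can be dropped.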
    【Solution 2】:

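    In Impala, TRUNC(timestamp, "D") truncates a timestamp to the first day of its week; in the output below that day is a Monday. See the Impala documentation on date and time functions for details.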

    For example:

    select trunc(cast('2016-11-10' as timestamp), "D")
    +---------------------------------------------+
    | trunc(cast('2016-11-10' as timestamp), 'd') |
    +---------------------------------------------+
    | 2016-11-07 00:00:00                         |
    +---------------------------------------------+
    
    +---------------------------------------------+
    | trunc(cast('2016-11-09' as timestamp), 'd') |
    +---------------------------------------------+
    | 2016-11-07 00:00:00                         |
    +---------------------------------------------+
    
    +---------------------------------------------+
    | trunc(cast('2016-11-11' as timestamp), 'd') |
    +---------------------------------------------+
    | 2016-11-07 00:00:00                         |
    +---------------------------------------------+
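The weeks in the question's expected output start on 2014-08-01, which is a Friday, so Monday-based truncation will not reproduce them exactly. If fixed 7-day windows counted from a chosen date are what is wanted, a sketch along these lines may help. The anchor date is taken from the question's expected output; the aliases bucket, bucket_start and distinct_users are illustrative assumptions.

```sql
-- Sketch only: 7-day buckets counted from an assumed anchor date (2014-08-01).
-- your_table and the aliases are placeholders.
SELECT
  TO_DATE(DAYS_ADD(CAST('2014-08-01' AS TIMESTAMP), 7 * bucket)) AS bucket_start,
  distinct_users
FROM (
  SELECT
    -- whole days since the anchor, divided into 7-day buckets (0, 1, 2, ...)
    CAST(FLOOR(DATEDIFF(CAST(eventtime AS TIMESTAMP),
                        CAST('2014-08-01' AS TIMESTAMP)) / 7) AS INT) AS bucket,
    COUNT(DISTINCT userguid) AS distinct_users
  FROM your_table
  GROUP BY CAST(FLOOR(DATEDIFF(CAST(eventtime AS TIMESTAMP),
                               CAST('2014-08-01' AS TIMESTAMP)) / 7) AS INT)
) buckets
ORDER BY bucket_start;
```

Rows dated before the anchor fall into negative buckets, which FLOOR still assigns to the correct earlier 7-day window.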
    

