【Question title】: SQL: Group by date and summing values in a column
【Posted】: 2015-05-07 18:46:30
【Question】:

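I have a JDBC database (DB2 specifically, but I am looking for something DB-agnostic, at least across DB2 and Oracle) with a table that gets records inserted every 10 minutes containing statistics on the APIs run by the application in question. It looks like: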

StatKey, StartDate, EndDate, APIName, StatName, StatValue
201505071498224437562706    2015-05-07 14:12:44.0   2015-05-07 14:22:44.0   API5    Invocations 34
201505071498161437466684    2015-05-07 14:06:14.0   2015-05-07 14:16:14.0   API4    Invocations 79
201505071498060937466556    2015-05-07 13:56:08.0   2015-05-07 14:06:08.0   API4    Average 26,264.37
201505071497263437627286    2015-05-07 14:16:33.0   2015-05-07 14:26:34.0   API2    Invocations 24
201505071497262137620812    2015-05-07 14:16:19.0   2015-05-07 14:26:20.0   API2    Invocations 24
201505071497024537466378    2015-05-07 13:52:43.0   2015-05-07 14:02:44.0   API1    Average 6,830,050
201505071497023337466368    2015-05-07 13:52:31.0   2015-05-07 14:02:32.0   API3    Average 31,523
201505071496023337466361    2015-05-07 13:52:31.0   2015-05-07 14:02:32.0   API2    Invocations 1
201505071494263837628892    2015-05-07 14:16:36.0   2015-05-07 14:26:37.0   API5    Invocations 68
201505071493124437466656    2015-05-07 14:02:44.0   2015-05-07 14:12:44.0   API1    Invocations 2
201505071492263037625304    2015-05-07 14:16:29.0   2015-05-07 14:26:30.0   API3    Average 179,223.29

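Every 10 minutes, any API that was executed during that period gets an entry like the ones above. However, multiple JVMs write to the same database, so the start and end times do not simply fall on 10-minute boundaries, and there may be more than 6 entries per hour.

What I want to do is write a SQL query that groups all invocations of every API by hour. For example: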

Date&Hour, API, Invocations
2015-05-07 12:00, API1, 100
2015-05-07 12:00, API2, 150
2015-05-07 13:00, API2, 200
etc...

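I tried a GROUP BY on a SUBSTR of the primary key at the hour mark (the key is always a timestamp plus some random digits, but there are 2 random digits between the hour and the minutes), but I am not sure how to add up all the StatName=Invocations values per hour.

Can anyone offer some ideas on how I could achieve this?

【Comments on the question】:

    Tags: sql oracle jdbc db2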


    【Solution 1】:

    Another possible solution:

    select to_char(StartDate,'rrrr-mm-dd HH24:')||'00'  as DateHour,
        APIName as API,
        sum(StatValue) as Invocations
    from STATISTICS
    where StatName = 'Invocations' 
    group by to_char(StartDate,'rrrr-mm-dd HH24:')||'00', APIName
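
To try the hour-bucket idea without a DB2 or Oracle instance, here is a minimal Python sketch against an in-memory SQLite database. It is a stand-in, not the query above: SQLite has no `to_char`, so `strftime('%Y-%m-%d %H:00', ...)` plays that role, and the table name `stats` and the text date format are assumptions made for the demo. The rows are taken from the sample in the question.

```python
import sqlite3

# (StartDate, APIName, StatName, StatValue) from the question's sample rows.
ROWS = [
    ("2015-05-07 14:12:44", "API5", "Invocations", 34),
    ("2015-05-07 14:06:14", "API4", "Invocations", 79),
    ("2015-05-07 13:56:08", "API4", "Average", 26264.37),
    ("2015-05-07 14:16:33", "API2", "Invocations", 24),
    ("2015-05-07 14:16:19", "API2", "Invocations", 24),
    ("2015-05-07 13:52:31", "API2", "Invocations", 1),
    ("2015-05-07 14:16:36", "API5", "Invocations", 68),
    ("2015-05-07 14:02:44", "API1", "Invocations", 2),
]


def hourly_invocations(rows):
    """Sum Invocations per (start hour, API) using SQLite as a stand-in engine."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical demo table; the real table has more columns.
    conn.execute(
        "CREATE TABLE stats (StartDate TEXT, APIName TEXT, "
        "StatName TEXT, StatValue NUMERIC)"
    )
    conn.executemany("INSERT INTO stats VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT strftime('%Y-%m-%d %H:00', StartDate) AS DateHour,
               APIName,
               SUM(StatValue) AS Invocations
        FROM stats
        WHERE StatName = 'Invocations'   -- skip the 'Average' rows
        GROUP BY DateHour, APIName
        ORDER BY DateHour, APIName
        """
    ).fetchall()
    conn.close()
    return result


for row in hourly_invocations(ROWS):
    print(row)
```

Because this buckets on StartDate, the API2 record that started at 13:52:31 is counted under 13:00 even though it ended after 14:00.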
    

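    There are different ways to do this.

    Good luck!

    【Comments】:

    • Everything given here is a good solution, but for my needs this one is relatively simple and seems to work best for me. Thanks.
    【Solution 2】:

    Not quite sure whether this is what you are after?

    Essentially, it looks at the first 10 positions of the key (YYYYMMDDHH), since they contain the value to group by, and then sums only the Invocations rows.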

    SELECT substr(statKey,1,10) as DH, APIName, Sum(Statvalue) Invocations
    FROM TableName
    WHERE StatName = 'Invocations'
    GROUP BY substr(statKey,1,10), APIName, StatName
    

    Example:

    WITH CTE AS
      (SELECT '201505071498224437562706' AS StatKey,
        '2015-05-07 14:12:44.0'          AS StartDate,
        '2015-05-07 14:22:44.0'          AS EndDate,
        'API5'                           AS APIName,
        'Invocations'                    AS StatName,
        34                               AS statvalue
      FROM dual
      UNION ALL
      SELECT '201505071498161437466684',
        '2015-05-07 14:06:14.0',
        '2015-05-07 14:16:14.0',
        'API4',
        'Invocations',
        79
      FROM dual
      UNION ALL
      SELECT '201505071498060937466556',
        '2015-05-07 13:56:08.0',
        '2015-05-07 14:06:08.0',
        'API4',
        'Average',
        26264.37
      FROM dual
      )
    SELECT substr(statKey,1,10) as DH, APIName, StatName, Sum(Statvalue) 
    FROM CTE
    WHERE StatName = 'Invocations'
    GROUP BY substr(statKey,1,10), APIName, StatName
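
One thing worth checking before relying on the key prefix: in the sample data the hour embedded in StatKey appears to track the insertion time (close to EndDate), not StartDate. The following Python sketch mimics `substr(StatKey, 1, 10)` on a few rows from the question; the helper names `key_hour` and `sum_by_key_hour` are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime

# (StatKey, StartDate, APIName, StatName, StatValue) from the question's sample.
SAMPLE = [
    ("201505071498224437562706", "2015-05-07 14:12:44", "API5", "Invocations", 34),
    ("201505071498161437466684", "2015-05-07 14:06:14", "API4", "Invocations", 79),
    ("201505071498060937466556", "2015-05-07 13:56:08", "API4", "Average", 26264.37),
    ("201505071496023337466361", "2015-05-07 13:52:31", "API2", "Invocations", 1),
]


def key_hour(stat_key):
    """Python equivalent of substr(StatKey, 1, 10), parsed as YYYYMMDDHH."""
    return datetime.strptime(stat_key[:10], "%Y%m%d%H")


def sum_by_key_hour(rows):
    """Sum Invocations per (hour taken from the key, API)."""
    totals = defaultdict(int)
    for stat_key, _start_date, api, stat_name, value in rows:
        if stat_name == "Invocations":
            totals[(key_hour(stat_key), api)] += value
    return dict(totals)


for (hour, api), total in sorted(sum_by_key_hour(SAMPLE).items()):
    print(hour.strftime("%Y-%m-%d %H:00"), api, total)
```

Note that the API2 row has a StartDate of 13:52:31 but a key prefix of 2015050714, so grouping on the key places it in the 14:00 bucket, whereas grouping on StartDate would place it in 13:00.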
    

    【Comments】:

      【Solution 3】:

      For DB2 at least, why not just

      select date(startdate) as start_date
             , hour(startdate) as start_hour
             , APIName as API
             , sum(statvalue) as Invocations
      from mytbl
      where statname = 'Invocations' 
      group by date(startdate),  hour(startdate), APIName
      

      If that is really what you want, I will leave it as an exercise for you to combine the date and hour back into a timestamp...
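
If the recombination is done on the client side instead of in SQL, it is a one-liner. This is a small Python sketch under the assumption that the driver returns the `start_date` column as a `date` and `start_hour` as an integer; the function name `to_hour_timestamp` is hypothetical.

```python
from datetime import date, datetime, time


def to_hour_timestamp(start_date, start_hour):
    """Combine a date and an hour (0-23) into a timestamp at the top of that hour."""
    return datetime.combine(start_date, time(hour=start_hour))


bucket = to_hour_timestamp(date(2015, 5, 7), 14)
print(bucket.strftime("%Y-%m-%d %H:%M"))
```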

      【Comments】:

        【Solution 4】:

        SQL Fiddle

        Oracle 11g R2 Schema Setup

        CREATE TABLE Data AS
                  SELECT '201505071498224437562706' AS StatKey, TO_DATE( '2015-05-07 14:12:44', 'YYYY-MM-DD HH24:MI:SS' ) AS StartDate, TO_DATE( '2015-05-07 14:22:44', 'YYYY-MM-DD HH24:MI:SS' ) AS EndDate, 'API5' AS APIName, 'Invocations' AS StatName, 34 AS StatValue FROM DUAL
        UNION ALL SELECT '201505071498161437466684' AS StatKey, TO_DATE( '2015-05-07 14:06:14', 'YYYY-MM-DD HH24:MI:SS' ) AS StartDate, TO_DATE( '2015-05-07 14:16:14', 'YYYY-MM-DD HH24:MI:SS' ) AS EndDate, 'API4' AS APIName, 'Invocations' AS StatName, 79 AS StatValue FROM DUAL
        UNION ALL SELECT '201505071498060937466556' AS StatKey, TO_DATE( '2015-05-07 13:56:08', 'YYYY-MM-DD HH24:MI:SS' ) AS StartDate, TO_DATE( '2015-05-07 14:06:08', 'YYYY-MM-DD HH24:MI:SS' ) AS EndDate, 'API4' AS APIName, 'Average' AS StatName, 26264.37 AS StatValue FROM DUAL
        UNION ALL SELECT '201505071497263437627286' AS StatKey, TO_DATE( '2015-05-07 14:16:33', 'YYYY-MM-DD HH24:MI:SS' ) AS StartDate, TO_DATE( '2015-05-07 14:26:34', 'YYYY-MM-DD HH24:MI:SS' ) AS EndDate, 'API2' AS APIName, 'Invocations' AS StatName, 24 AS StatValue FROM DUAL
        UNION ALL SELECT '201505071497262137620812' AS StatKey, TO_DATE( '2015-05-07 14:16:19', 'YYYY-MM-DD HH24:MI:SS' ) AS StartDate, TO_DATE( '2015-05-07 14:26:20', 'YYYY-MM-DD HH24:MI:SS' ) AS EndDate, 'API2' AS APIName, 'Invocations' AS StatName, 24 AS StatValue FROM DUAL
        UNION ALL SELECT '201505071497024537466378' AS StatKey, TO_DATE( '2015-05-07 13:52:43', 'YYYY-MM-DD HH24:MI:SS' ) AS StartDate, TO_DATE( '2015-05-07 14:02:44', 'YYYY-MM-DD HH24:MI:SS' ) AS EndDate, 'API1' AS APIName, 'Average' AS StatName, 6830050 AS StatValue FROM DUAL
        UNION ALL SELECT '201505071497023337466368' AS StatKey, TO_DATE( '2015-05-07 13:52:31', 'YYYY-MM-DD HH24:MI:SS' ) AS StartDate, TO_DATE( '2015-05-07 14:02:32', 'YYYY-MM-DD HH24:MI:SS' ) AS EndDate, 'API3' AS APIName, 'Average' AS StatName, 31523 AS StatValue FROM DUAL
        UNION ALL SELECT '201505071496023337466361' AS StatKey, TO_DATE( '2015-05-07 13:52:31', 'YYYY-MM-DD HH24:MI:SS' ) AS StartDate, TO_DATE( '2015-05-07 14:02:32', 'YYYY-MM-DD HH24:MI:SS' ) AS EndDate, 'API2' AS APIName, 'Invocations' AS StatName, 1 AS StatValue FROM DUAL
        UNION ALL SELECT '201505071494263837628892' AS StatKey, TO_DATE( '2015-05-07 14:16:36', 'YYYY-MM-DD HH24:MI:SS' ) AS StartDate, TO_DATE( '2015-05-07 14:26:37', 'YYYY-MM-DD HH24:MI:SS' ) AS EndDate, 'API5' AS APIName, 'Invocations' AS StatName, 68 AS StatValue FROM DUAL
        UNION ALL SELECT '201505071493124437466656' AS StatKey, TO_DATE( '2015-05-07 14:02:44', 'YYYY-MM-DD HH24:MI:SS' ) AS StartDate, TO_DATE( '2015-05-07 14:12:44', 'YYYY-MM-DD HH24:MI:SS' ) AS EndDate, 'API1' AS APIName, 'Invocations' AS StatName, 2 AS StatValue FROM DUAL
        UNION ALL SELECT '201505071492263037625304' AS StatKey, TO_DATE( '2015-05-07 14:16:29', 'YYYY-MM-DD HH24:MI:SS' ) AS StartDate, TO_DATE( '2015-05-07 14:26:30', 'YYYY-MM-DD HH24:MI:SS' ) AS EndDate, 'API3' AS APIName, 'Average' AS StatName, 179223.29 AS StatValue FROM DUAL;
        

        Query 1

        SELECT   TRUNC( EndDate, 'HH' ) AS "Date&Hour",
                 APIName,
                 SUM( StatValue ) AS Invocations
        FROM     Data
        WHERE    StatName = 'Invocations'
        GROUP BY TRUNC( EndDate, 'HH' ),
                 APIName
        

        Results

        |             Date&Hour | APINAME | INVOCATIONS |
        |-----------------------|---------|-------------|
        | May, 07 2015 14:00:00 |    API2 |          49 |
        | May, 07 2015 14:00:00 |    API5 |         102 |
        | May, 07 2015 14:00:00 |    API1 |           2 |
        | May, 07 2015 14:00:00 |    API4 |          79 |
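
This query buckets on EndDate, while some of the other answers bucket on StartDate, and the two choices disagree for any record that spans an hour boundary. The Python sketch below truncates both columns to the hour for the Invocations rows from the question so the difference is visible; `trunc_hour` and `hourly_totals` are illustrative names, not part of any database API.

```python
from collections import Counter
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# (StartDate, EndDate, APIName, StatValue): the Invocations rows from the question.
INVOCATIONS = [
    ("2015-05-07 14:12:44", "2015-05-07 14:22:44", "API5", 34),
    ("2015-05-07 14:06:14", "2015-05-07 14:16:14", "API4", 79),
    ("2015-05-07 14:16:33", "2015-05-07 14:26:34", "API2", 24),
    ("2015-05-07 14:16:19", "2015-05-07 14:26:20", "API2", 24),
    ("2015-05-07 13:52:31", "2015-05-07 14:02:32", "API2", 1),
    ("2015-05-07 14:16:36", "2015-05-07 14:26:37", "API5", 68),
    ("2015-05-07 14:02:44", "2015-05-07 14:12:44", "API1", 2),
]


def trunc_hour(text):
    """Parse a timestamp string and drop minutes and seconds."""
    return datetime.strptime(text, FMT).replace(minute=0, second=0, microsecond=0)


def hourly_totals(rows, use_end_date):
    """Sum StatValue per (hour bucket, API), bucketing on EndDate or StartDate."""
    totals = Counter()
    for start, end, api, value in rows:
        bucket = trunc_hour(end if use_end_date else start)
        totals[(bucket.strftime("%Y-%m-%d %H:00"), api)] += value
    return dict(totals)


by_end = hourly_totals(INVOCATIONS, use_end_date=True)
by_start = hourly_totals(INVOCATIONS, use_end_date=False)
print(by_end)
print(by_start)
```

Bucketing on EndDate reproduces the table above (API2 = 49 at 14:00). Bucketing on StartDate splits API2 into 1 at 13:00 and 48 at 14:00, so it is worth deciding which column defines "the hour" for the report.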
        

        【Comments】:

        • DB2 also seems to have a TRUNC function, so I think this should work on both (but I could not test it).
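        【Solution 5】:

        Date functions seem hard to implement in a DB-agnostic way.

        For a DB-agnostic solution, I would suggest creating a view in each database that hides the DB-specific code, so that you can use a plain SELECT without running into syntax differences.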
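
A minimal sketch of that idea, using Python with an in-memory SQLite database as a stand-in. The view name `stats_hourly`, the table name `stats`, and the `strftime` expression are assumptions for the demo; on DB2 or Oracle the view body would use that vendor's own date function, while the query issued by the application stays the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE stats (StartDate TEXT, APIName TEXT,
                        StatName TEXT, StatValue NUMERIC);
    INSERT INTO stats VALUES
        ('2015-05-07 13:52:31', 'API2', 'Invocations', 1),
        ('2015-05-07 14:16:33', 'API2', 'Invocations', 24),
        ('2015-05-07 14:16:19', 'API2', 'Invocations', 24),
        ('2015-05-07 13:56:08', 'API4', 'Average', 26264.37);

    -- The only vendor-specific piece: how the hour bucket is computed.
    CREATE VIEW stats_hourly AS
        SELECT strftime('%Y-%m-%d %H:00', StartDate) AS DateHour,
               APIName, StatName, StatValue
        FROM stats;
    """
)

# The application query uses only plain SELECT / SUM / GROUP BY.
PORTABLE_QUERY = """
    SELECT DateHour, APIName, SUM(StatValue) AS Invocations
    FROM stats_hourly
    WHERE StatName = 'Invocations'
    GROUP BY DateHour, APIName
    ORDER BY DateHour, APIName
"""

rows = conn.execute(PORTABLE_QUERY).fetchall()
for row in rows:
    print(row)
```

The trade-off is that each database needs its own view definition deployed, but the JDBC code only ever sees the `DateHour` column.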

        【Comments】:
