Question title: Grouping the results of a query with CTEs
Posted: 2012-10-04 23:10:43
Question:

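I have a CTE-based query to which I pass about 2600 4-tuples of latitude/longitude values. These have been tagged with an ID and are stored in a second table called coordinates. The top-left and bottom-right latitude/longitude values are passed into the CTE to show the number of requests (per hour) made within those coordinates between two given timestamps.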

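However, I would like to get the total number of requests per day within the given timestamps. That is, I want the total number of user requests for each of the specified days. For example, the user chooses to view every Wednesday, or every Wednesday and Thursday, between 11:55 and 22:04 during the period January 1 to 16, 2012, for every latitude/longitude 4-tuple I pass. The output would look essentially like this: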

coordinates_id | stamp       | zcount

1                Jan 4 2012    200 (total requests on Wednesday Jan 4 between 11:55 and 22:04)
1                Jan 11 2012   121 (total requests on Wednesday Jan 11 between 11:55 and 22:04)
2                Jan 4 2012    255 (total requests on Wednesday Jan 4 between 11:55 and 22:04)
2                Jan 11 2012   211 (total requests on Wednesday Jan 11 between 11:55 and 22:04)
.
.
.

How can I do this? My query is as follows:

WITH v AS (
   SELECT '2012-01-1 11:55:11'::timestamp AS _from -- provide times once
         ,'2012-01-16 22:02:21'::timestamp AS _to
   )
, q AS (
   SELECT c.coordinates_id
        , date_trunc('hour', t.calltime) AS stamp
        , count(*) AS zcount
   FROM   v
   JOIN   mytable t ON  t.calltime BETWEEN v._from AND v._to
                   AND (t.calltime::time >= v._from::time AND
                        t.calltime::time <= v._to::time)
                   AND extract(DOW from t.calltime) = 3
   JOIN   coordinates c ON (t.lat, t.lon) 
                   BETWEEN (c.bottomrightlat, c.topleftlon)
                       AND (c.topleftlat, c.bottomrightlon)
   GROUP BY c.coordinates_id, date_trunc('hour', t.calltime)
   )
, cal AS (
   SELECT generate_series('2011-2-2 00:00:00'::timestamp
                        , '2012-4-1 05:00:00'::timestamp
                        , '1 hour'::interval) AS stamp
   FROM v
   )
SELECT q.coordinates_id, cal.stamp, COALESCE (q.zcount, 0) AS zcount
FROM v, cal
LEFT JOIN q USING (stamp)
WHERE (extract(hour from cal.stamp) >= extract(hour from v._from) AND
       extract(hour from cal.stamp) <= extract(hour from v._to))
       AND extract(DOW from cal.stamp) = 3
       AND cal.stamp >= v._from AND cal.stamp <= v._to
GROUP BY q.coordinates_id, cal.stamp, q.zcount
ORDER BY q.coordinates_id ASC, stamp ASC;

A sample of the result it produces looks like this:

coordinates_id  | stamp                | zcount
1                 2012-01-04 16:00:00    1
1                 2012-01-04 19:00:00    1
1                 2012-01-11 14:00:00    1
1                 2012-01-11 17:00:00    1
1                 2012-01-11 19:00:00    1
2                 2012-01-04 16:00:00    1

So, as I mentioned above, I would like to see it as:

coordinates_id  | stamp      | zcount
1                2012-01-04    2
1                2012-01-11    3
2                2012-01-04    1


    Tags: sql postgresql timestamp aggregate-functions common-table-expression


    Answer 1:

    Change your final SELECT to:

    SELECT q.coordinates_id, cal.stamp::date, sum(q.zcount) AS zcount
    FROM   v, cal
    LEFT   JOIN q USING (stamp)
    WHERE  extract(hour from cal.stamp) BETWEEN extract(hour from v._from)
                                            AND extract(hour from v._to)
    AND    extract(DOW from cal.stamp) = 3 
    AND    cal.stamp >= v._from
    AND    cal.stamp <= v._to
    GROUP  BY 1,2
    ORDER  BY 1,2;
    

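    The key part is the cast of cal.stamp to date: cal.stamp::date. All hourly stamps that fall on the same day then land in the same group.
    That, and sum(q.zcount), which adds up the hourly counts within each group.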
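To see the effect of the cast in isolation, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for PostgreSQL, an assumption made only so the example runs without a server. It loads the hourly sample rows from the question into a table named q and rolls them up per day; SQLite's date(stamp) plays the role of stamp::date.

```python
import sqlite3

# Hourly counts copied from the sample output in the question.
hourly = [
    (1, "2012-01-04 16:00:00", 1),
    (1, "2012-01-04 19:00:00", 1),
    (1, "2012-01-11 14:00:00", 1),
    (1, "2012-01-11 17:00:00", 1),
    (1, "2012-01-11 19:00:00", 1),
    (2, "2012-01-04 16:00:00", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE q (coordinates_id INTEGER, stamp TEXT, zcount INTEGER)")
conn.executemany("INSERT INTO q VALUES (?, ?, ?)", hourly)

# Truncate each stamp to its day, then add up the hourly counts per day.
daily = conn.execute(
    """
    SELECT coordinates_id,
           date(stamp) AS day,       -- SQLite stand-in for stamp::date
           sum(zcount) AS zcount
    FROM   q
    GROUP  BY coordinates_id, date(stamp)
    ORDER  BY coordinates_id, day
    """
).fetchall()

for row in daily:
    print(row)
# (1, '2012-01-04', 2)
# (1, '2012-01-11', 3)
# (2, '2012-01-04', 1)
```

The three rows match the desired output shown in the question.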

    Comments:

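    • I just saw a post in which you were called the "Good Guy Greg" of PostgreSQL. I agree with that.
    • Just noticed a small issue, one I actually forgot to ask about in the first place. In my "initial" version, where I only needed the total number of queries and nothing else such as latitude/longitude, my query also printed rows with a count of 0, thanks to COALESCE (q.zcount, 0). How can I apply that here to get the rows with zcount 0? I applied COALESCE to sum(q.zcount), but it did not work as I intended.
    • @sm90901 Are rows missing, or is zcount lower than expected?
    • Rows are missing. I just tried to get all Mondays between January 1 and 31, 2012 (2, 9, 16, 23, 30), but for some coordinates_id values I only get the rows for January 16 and 23; no rows with zcount 0 appear.
    • @sm90901: That should not happen. Are you sure you are using the right timestamps in the CTE v for generate_series()?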
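
The thread does not settle why the zero rows go missing. One possible explanation, offered here as an assumption rather than a confirmed diagnosis: LEFT JOIN q USING (stamp) matches calendar rows on the timestamp alone, so a coordinates_id with no requests in a given slot has nothing to join to and produces no row of its own. A common way around that is to build every (coordinates_id, day) pair first and left-join the counts onto that grid. The sketch below shows the idea in Python, with sqlite3 standing in for PostgreSQL. The daily table and its contents are hypothetical, a recursive CTE replaces generate_series(), and strftime('%w', ...) replaces extract(DOW ...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE coordinates (coordinates_id INTEGER);
    INSERT INTO coordinates VALUES (1), (2);

    -- Hypothetical daily counts; id 2 has no requests on 2012-01-11.
    CREATE TABLE daily (coordinates_id INTEGER, day TEXT, zcount INTEGER);
    INSERT INTO daily VALUES
        (1, '2012-01-04', 2),
        (1, '2012-01-11', 3),
        (2, '2012-01-04', 1);
    """
)

rows = conn.execute(
    """
    WITH RECURSIVE cal(day) AS (          -- stand-in for generate_series()
        SELECT '2012-01-01'
        UNION ALL
        SELECT date(day, '+1 day') FROM cal WHERE day < '2012-01-16'
    )
    SELECT c.coordinates_id, cal.day, COALESCE(d.zcount, 0) AS zcount
    FROM   cal
    CROSS  JOIN coordinates c             -- every id paired with every day
    LEFT   JOIN daily d ON d.coordinates_id = c.coordinates_id
                       AND d.day = cal.day
    WHERE  strftime('%w', cal.day) = '3'  -- Wednesday, like extract(DOW ...) = 3
    ORDER  BY c.coordinates_id, cal.day
    """
).fetchall()

for row in rows:
    print(row)
# (1, '2012-01-04', 2)
# (1, '2012-01-11', 3)
# (2, '2012-01-04', 1)
# (2, '2012-01-11', 0)
```

Because the grid comes from the calendar and the coordinates table rather than from the counts, id 2 still gets a row for January 11, and COALESCE turns its missing count into 0.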