【Question title】: MySQL Query: Aggregation at two different levels
【Posted】: 2014-11-23 08:00:31
【Question】:

I have two tables:

mysql> select * from report;
+----+----------+------------+------------------+-------------+
| id | campaign | advertiser | impression_count | click_count |
+----+----------+------------+------------------+-------------+
|  1 | camp1    | adv1       |               20 |           6 |
|  2 | camp2    | adv2       |               10 |           2 |
|  3 | camp1    | adv1       |               15 |           3 |
|  4 | camp2    | adv2       |                6 |           1 |
+----+----------+------------+------------------+-------------+
4 rows in set (0.00 sec)

mysql> select * from device;
+-----------+-----------+
| report_id | device_id |
+-----------+-----------+
|         1 | d1        |
|         1 | d2        |
|         2 | d1        |
|         2 | d3        |
|         2 | d4        |
|         3 | d2        |
|         3 | d4        |
|         4 | d3        |
|         4 | d4        |
|         4 | d5        |
+-----------+-----------+
10 rows in set (0.00 sec)

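I want a report aggregated at the campaign and advertiser level that contains the sum of impressions, the sum of clicks, and the number of distinct device_id values. So I wrote the following query: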

SELECT 
    campaign,
    advertiser,
    sum(impression_count),
    sum(click_count),
    count(DISTINCT device_id)
FROM report 
LEFT JOIN device ON report.id = device.report_id
GROUP BY campaign, advertiser;
+----------+------------+-----------------------+------------------+---------------------------+
| campaign | advertiser | sum(impression_count) | sum(click_count) | count(distinct device_id) |
+----------+------------+-----------------------+------------------+---------------------------+
| camp1    | adv1       |                    70 |               18 |                         3 |
| camp2    | adv2       |                    48 |                9 |                         4 |
+----------+------------+-----------------------+------------------+---------------------------+

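Because of the join, impression_count and click_count are summed once for every matching device row, so both totals are inflated. What I want is: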

+----------+------------+-----------------------+------------------+---------------------------+
| campaign | advertiser | sum(impression_count) | sum(click_count) | count(distinct device_id) |
+----------+------------+-----------------------+------------------+---------------------------+
| camp1    | adv1       |                    35 |               9  |                         3 |
| camp2    | adv2       |                    16 |                3 |                         4 |
+----------+------------+-----------------------+------------------+---------------------------+

http://sqlfiddle.com/#!2/05dd9d/1

I found a solution, but it is not a very good one:

select campaign, advertiser, ic, cc, count(distinct device_id)
from (
    select
        group_concat(id) as id,
        sum(impression_count) as ic,
        sum(click_count) as cc,
        campaign, advertiser
    FROM report har GROUP BY campaign, advertiser) a
    LEFT JOIN device dr ON FIND_IN_SET(dr.report_id, a.id)
    group by a.id;

But this relies on group_concat, so it can break when the group_concat result gets long (MySQL truncates it at group_concat_max_len).

【Comments】:

Tags: mysql sql database


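【Solution 1】:

What you want to do is run two separate queries and then join their result sets. The outer select just picks the columns we actually want and joins the two derived tables on their common values. If you don't want the distinct devices counted across the whole campaign, you could also do this with id and report_id instead.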

select `firsttable`.campaign, `firsttable`.advertiser, a, b, c from 
  (select id, campaign, advertiser, sum(impression_count) as a, sum(click_count) as b
   from report
   group by campaign, advertiser
  ) as firsttable
  left join
  (select campaign, advertiser, count(distinct device_id) as c
   from device, report
   where id=report_id
   group by campaign, advertiser
  ) as secondtable on `firsttable`.campaign=`secondtable`.campaign and
                      `firsttable`.advertiser=`secondtable`.advertiser;

SQLFiddle:http://sqlfiddle.com/#!2/8bd63/20

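This query is the combination of the following two derived tables. The sample data in the fiddle appears to include one extra report row (id 5, camp1/adv2, with no devices), which is why it shows up below with a null device count: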

| ID | CAMPAIGN | ADVERTISER |   A |   B |
|----|----------|------------|-----|-----|
|  1 |    camp1 |       adv1 |  35 |   9 |
|  5 |    camp1 |       adv2 | 900 | 900 |
|  2 |    camp2 |       adv2 |  16 |   3 |

| CAMPAIGN | ADVERTISER | C |
|----------|------------|---|
|    camp1 |       adv1 | 3 |
|    camp2 |       adv2 | 4 |

Result:

| CAMPAIGN | ADVERTISER |   A |   B |      C |
|----------|------------|-----|-----|--------|
|    camp1 |       adv1 |  35 |   9 |      3 |
|    camp1 |       adv2 | 900 | 900 | (null) |
|    camp2 |       adv2 |  16 |   3 |      4 |
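
The two-subquery approach above can be checked against the question's own sample rows. The following is a minimal sketch, not the answerer's exact code: it assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the helper names build_db and two_level_report as well as the aliases ic, cc and dd are made up for illustration. It also leaves the non-grouped id column out of the first subquery.

```python
import sqlite3


def build_db():
    """Create an in-memory database holding the question's sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE report (id INTEGER PRIMARY KEY, campaign TEXT, advertiser TEXT,
                             impression_count INTEGER, click_count INTEGER);
        CREATE TABLE device (report_id INTEGER, device_id TEXT);
        INSERT INTO report VALUES (1,'camp1','adv1',20,6),(2,'camp2','adv2',10,2),
                                  (3,'camp1','adv1',15,3),(4,'camp2','adv2',6,1);
        INSERT INTO device VALUES (1,'d1'),(1,'d2'),(2,'d1'),(2,'d3'),(2,'d4'),
                                  (3,'d2'),(3,'d4'),(4,'d3'),(4,'d4'),(4,'d5');
    """)
    return conn


# Sums are computed on report alone; the distinct device count is computed
# in a second derived table; the two are joined on (campaign, advertiser).
QUERY = """
SELECT t.campaign, t.advertiser, t.ic, t.cc, d.dd
FROM (SELECT campaign, advertiser,
             SUM(impression_count) AS ic, SUM(click_count) AS cc
      FROM report
      GROUP BY campaign, advertiser) AS t
LEFT JOIN (SELECT r.campaign, r.advertiser,
                  COUNT(DISTINCT dv.device_id) AS dd
           FROM report r JOIN device dv ON dv.report_id = r.id
           GROUP BY r.campaign, r.advertiser) AS d
  ON d.campaign = t.campaign AND d.advertiser = t.advertiser
ORDER BY t.campaign, t.advertiser
"""


def two_level_report(conn):
    """Return one row per (campaign, advertiser) with sums and device count."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in two_level_report(build_db()):
        print(row)
```

On the four report rows from the question this gives 35 / 9 / 3 for camp1 and 16 / 3 / 4 for camp2, which is the output the asker wanted. The SQL inside QUERY uses only standard constructs, so it should carry over to MySQL unchanged.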

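The problem with your query is that joining the report table to the device table duplicates each report row once per matching device. You end up with an intermediate result like this, and the sums are then taken over the duplicated rows: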

| CAMPAIGN | ADVERTISER | IMPRESSION_COUNT | CLICK_COUNT | DEVICE_ID |
|----------|------------|------------------|-------------|-----------|
|    camp1 |       adv1 |               20 |           6 |        d1 |
|    camp1 |       adv1 |               20 |           6 |        d2 |
|    camp2 |       adv2 |               10 |           2 |        d1 |
|    camp2 |       adv2 |               10 |           2 |        d3 |
|    camp2 |       adv2 |               10 |           2 |        d4 |
|    camp1 |       adv1 |               15 |           3 |        d2 |
|    camp1 |       adv1 |               15 |           3 |        d4 |
|    camp2 |       adv2 |                6 |           1 |        d3 |
|    camp2 |       adv2 |                6 |           1 |        d4 |
|    camp2 |       adv2 |                6 |           1 |        d5 |
|    camp1 |       adv2 |              900 |         900 |    (null) |
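
The row duplication described above can be made visible by counting how many joined rows each report row turns into. This is a small sketch under the same assumptions as before: Python's sqlite3 module stands in for MySQL, only the question's four report rows are loaded, and build_db, joined_row_counts and inflated_totals are hypothetical helper names.

```python
import sqlite3


def build_db():
    """Create an in-memory database holding the question's sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE report (id INTEGER PRIMARY KEY, campaign TEXT, advertiser TEXT,
                             impression_count INTEGER, click_count INTEGER);
        CREATE TABLE device (report_id INTEGER, device_id TEXT);
        INSERT INTO report VALUES (1,'camp1','adv1',20,6),(2,'camp2','adv2',10,2),
                                  (3,'camp1','adv1',15,3),(4,'camp2','adv2',6,1);
        INSERT INTO device VALUES (1,'d1'),(1,'d2'),(2,'d1'),(2,'d3'),(2,'d4'),
                                  (3,'d2'),(3,'d4'),(4,'d3'),(4,'d4'),(4,'d5');
    """)
    return conn


def joined_row_counts(conn):
    """For each report id, how many rows the LEFT JOIN produces for it."""
    return conn.execute("""
        SELECT r.id, COUNT(*) AS copies
        FROM report r LEFT JOIN device d ON d.report_id = r.id
        GROUP BY r.id
        ORDER BY r.id
    """).fetchall()


def inflated_totals(conn):
    """The asker's original query: sums taken over the duplicated rows."""
    return conn.execute("""
        SELECT campaign, advertiser, SUM(impression_count), SUM(click_count)
        FROM report LEFT JOIN device ON report.id = device.report_id
        GROUP BY campaign, advertiser
        ORDER BY campaign, advertiser
    """).fetchall()


if __name__ == "__main__":
    print(joined_row_counts(build_db()))
    print(inflated_totals(build_db()))
```

Reports 1 and 3 each appear twice and reports 2 and 4 three times, so camp1 sums to 20*2 + 15*2 = 70 impressions and camp2 to 10*3 + 6*3 = 48, exactly the inflated figures in the question.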

【Comments】:

    【Solution 2】:

    Maybe this will help you:

    SELECT 
        campaign,
        advertiser,
        SUM(impression_count) AS ic,
        sum(click_count) as cc,
        (select 
                count(distinct device_id)
            from
                device
            where
                report_id = id) AS DD
    from
        report
    group by campaign , advertiser; 
    

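    【Comments】:

    • This returns DD as 2 and 3 respectively; what is needed is 3 and 4.
    • This approach works, but it uses group_concat: SELECT campaign, advertiser, SUM(impression_count) AS ic, sum(click_count) as cc, (select count(distinct device_id) from device where FIND_IN_SET(report_id, group_concat(id))) AS DD from report group by campaign, advertiser;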
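
The subquery in Solution 2 is correlated on a single report id, so it only counts the devices of one report per group (2 for report 1, 3 for report 2), which explains the 2 and 3 noted in the first comment. One possible variation, offered here as a sketch and not taken from either answer, correlates the subquery on campaign and advertiser instead, which avoids group_concat. It assumes Python's sqlite3 module as a stand-in for MySQL; build_db, report_with_device_count and the aliases r, r2, d and dd are illustrative names.

```python
import sqlite3


def build_db():
    """Create an in-memory database holding the question's sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE report (id INTEGER PRIMARY KEY, campaign TEXT, advertiser TEXT,
                             impression_count INTEGER, click_count INTEGER);
        CREATE TABLE device (report_id INTEGER, device_id TEXT);
        INSERT INTO report VALUES (1,'camp1','adv1',20,6),(2,'camp2','adv2',10,2),
                                  (3,'camp1','adv1',15,3),(4,'camp2','adv2',6,1);
        INSERT INTO device VALUES (1,'d1'),(1,'d2'),(2,'d1'),(2,'d3'),(2,'d4'),
                                  (3,'d2'),(3,'d4'),(4,'d3'),(4,'d4'),(4,'d5');
    """)
    return conn


# The subquery looks up every report in the same (campaign, advertiser)
# group through r2, so devices from all of the group's reports are counted.
QUERY = """
SELECT r.campaign, r.advertiser,
       SUM(r.impression_count) AS ic,
       SUM(r.click_count) AS cc,
       (SELECT COUNT(DISTINCT d.device_id)
        FROM device d JOIN report r2 ON r2.id = d.report_id
        WHERE r2.campaign = r.campaign
          AND r2.advertiser = r.advertiser) AS dd
FROM report r
GROUP BY r.campaign, r.advertiser
ORDER BY r.campaign, r.advertiser
"""


def report_with_device_count(conn):
    """Return sums plus the distinct device count for each group."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in report_with_device_count(build_db()):
        print(row)
```

Because the outer query never joins to device, the sums stay at 35 / 9 and 16 / 3, while the subquery reports 3 and 4 distinct devices. The subquery runs once per group, so on large tables the derived-table join from Solution 1 may be the cheaper option.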