【Question title】: Avoid cartesian product using sum
【Posted】: 2019-10-20 18:57:57
【Question】:

I want to sum stake from the tickets table, grouped by customer_id and by date_trunc('day') of reg_date from the bonus table.

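The problem is that the rows get multiplied by the join, so the sums come out too large, and I don't know how to fix it.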

https://www.db-fiddle.com/f/yWCvFamMAY9uGtoZupiAQ/4

CREATE TABLE tickets (
    ticket_id integer,
    customer_id integer,
    stake integer,
    reg_date date
);

CREATE TABLE bonus (
    bonus_id integer,
    customer_id integer,
    reg_date date
);

insert into tickets 
values
(1,100, 12,'2019-01-10 11:00'),
(2,100, 10,'2019-01-10 12:00'),
(3,100, 30,'2019-01-10 13:00'),
(4,100, 10,'2019-01-11 14:00'),
(5,100, 15,'2019-01-11 15:00'),

(6,102, 25,'2019-01-10 10:00'),
(7,102, 25,'2019-01-10 11:10'),
(8,102, 13,'2019-01-11 12:40'),
(9,102, 9,'2019-01-12 15:00'),
(10,102, 7,'2019-01-13 18:00'),


(13,103, 15,'2019-01-12 19:00'),
(14,103, 11,'2019-01-12 22:00'),
(15,103, 11,'2019-01-14 02:00'),
(16,103, 11,'2019-01-14 10:00')
;

insert into bonus
values
(200,100,'2019-01-10 05:00'),
(201,100,'2019-01-10 06:00'),
(202,100,'2019-01-10 15:00'),
(203,100,'2019-01-10 15:50'),
(204,100,'2019-01-10 16:10'),
(205,100,'2019-01-10 16:15'),
(206,100,'2019-01-10 16:22'),
(207,100,'2019-01-11 10:10'),
(208,100,'2019-01-11 16:10'),

(209,102,'2019-01-10 10:00'),
(210,102,'2019-01-10 11:00'),
(211,102,'2019-01-10 12:00'),
(212,102,'2019-01-10 13:00'),

(213,103,'2019-01-11 11:00'),
(214,103,'2019-01-11 18:00'),
(215,103,'2019-01-12 15:00'),
(216,103,'2019-01-12 16:00'),
(217,103,'2019-01-14 02:00')
;




select 
customer_id, 
date_trunc('day', b.reg_date), 
sum(t.stake)

from tickets t
join bonus b using (customer_id)
where date_trunc('day', b.reg_date) = date_trunc('day', t.reg_date)
group by 1,2
order by 1

The output for customer 102 should be:

102,2019-01-10, 50
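
To see where the extra rows come from, it can help to count what each side of the join contributes per customer and day. The query below is only a diagnostic sketch against the schema above; the aliases (reg_day, ticket_rows, bonus_rows, joined_rows, inflated_sum) are illustrative names and not part of the original post.

```sql
-- Diagnostic sketch: how many rows does each table contribute per customer and day?
-- All column aliases here are made up for illustration.
select
    t.customer_id,
    t.reg_date::date            as reg_day,
    count(distinct t.ticket_id) as ticket_rows,   -- tickets on that day
    count(distinct b.bonus_id)  as bonus_rows,    -- bonuses on that day
    count(*)                    as joined_rows,   -- ticket_rows * bonus_rows
    sum(t.stake)                as inflated_sum   -- each stake counted once per bonus row
from tickets t
join bonus b
  on b.customer_id = t.customer_id
 and b.reg_date::date = t.reg_date::date
group by 1, 2
order by 1, 2;
```

For customer 102 on 2019-01-10 there are 2 tickets and 4 bonus rows, so the join produces 8 rows and the sum comes out as 200 instead of 50.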

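【Question comments】:

    Tags: sql postgresql-9.4 cartesian-product


    【Solution 1】:

    OK, I think you want the summed stake from the tickets table for every (customer_id, reg_date) pair that also appears in the second table, bonus, and bonus_id itself plays no part, right? The (customer_id, reg_date) pairs in bonus are duplicated, so you need a distinct on them first, and then a join against the summed data from tickets. The full SQL and its result are as follows: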

    with stake_sum as (
    select
        customer_id,
        reg_date,
        sum(stake)
    from
        tickets
    group by
        customer_id,
        reg_date
    )
    ,bonus_date_distinct as (
    select
        distinct customer_id,
        reg_date
    from
        bonus
    )
    select
        a.*
    from
        stake_sum a
    join
        bonus_date_distinct b on a.customer_id = b.customer_id and a.reg_date = b.reg_date
    order by
        customer_id, reg_date;

     customer_id |  reg_date  | sum 
    -------------+------------+-----
             100 | 2019-01-10 |  52
             100 | 2019-01-11 |  25
             102 | 2019-01-10 |  50
             103 | 2019-01-12 |  26
             103 | 2019-01-14 |  22
    (5 rows)
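
Another way to express the same idea is a semi-join with EXISTS: since bonus is only used to decide whether a (customer_id, day) pair qualifies, it never has to be joined in, so it cannot multiply the ticket rows. This is a sketch of an alternative, not the answer's own method; the aliases reg_day and total_stake are illustrative.

```sql
-- Alternative sketch: filter tickets with EXISTS instead of joining bonus.
-- bonus rows are only tested for existence, so they cannot duplicate ticket rows.
select
    t.customer_id,
    t.reg_date::date as reg_day,      -- illustrative alias
    sum(t.stake)     as total_stake   -- illustrative alias
from tickets t
where exists (
    select 1
    from bonus b
    where b.customer_id = t.customer_id
      and b.reg_date::date = t.reg_date::date
)
group by t.customer_id, t.reg_date::date
order by 1, 2;
```

On the sample data this should return the same five rows as the query above.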
    

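    【Comments】:

    • Oops, I declared the data type of reg_date incorrectly; it should be timestamp rather than date, which is why I used date_trunc. But I adapted your code and it works well, thanks!
    • Even with a timestamp type, you can cast to a date with reg_date::date and then do your join. It takes fewer characters than date_trunc, so it is more concise :D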
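
For the timestamp case discussed in the comments, the answer's query might be adapted roughly as follows. This is a sketch that assumes reg_date is declared as timestamp in both tables; the names bonus_days, reg_day and total_stake are illustrative. The cast has to happen inside both CTEs, otherwise the grouping and the distinct would work per timestamp rather than per day. Note that date_trunc('day', reg_date) would work as well, but it returns a timestamp, whereas reg_date::date returns a date.

```sql
-- Sketch, assuming reg_date is a timestamp column in both tables.
with stake_sum as (
    select
        customer_id,
        reg_date::date as reg_day,      -- collapse timestamps to calendar days
        sum(stake)     as total_stake
    from tickets
    group by customer_id, reg_date::date
),
bonus_days as (
    select distinct
        customer_id,
        reg_date::date as reg_day       -- one row per customer and day
    from bonus
)
select
    s.customer_id,
    s.reg_day,
    s.total_stake
from stake_sum s
join bonus_days b
  on b.customer_id = s.customer_id
 and b.reg_day = s.reg_day
order by s.customer_id, s.reg_day;
```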