【Question title】: SQL query to find the number of customers who shopped for 3 consecutive days in January 2020
【Posted】: 2021-02-02 03:05:26
【Question】:

I have the following table called orders, which holds customer IDs and their order dates (note: the same customer can place multiple orders on the same day).

create table orders (Id char, order_dt date)

insert into orders values
('A','1/1/2020'),
('B','1/1/2020'),
('C','1/1/2020'),
('D','1/1/2020'),
('A','1/1/2020'),
('B','1/1/2020'),
('A','2/1/2020'),
('B','2/1/2020'),
('C','2/1/2020'),
('B','2/1/2020'),
('A','3/1/2020'),
('B','3/1/2020')

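I am trying to write a SQL query that finds the number of customers who shopped on 3 consecutive days in January 2020.

Based on the order values above, the output should be: 2

I referred to other similar questions but still could not arrive at an exact solution.

【Question discussion】:

  • Edited the question to add multiple orders from the same customer within a day.

Tags: sql oracle group-by window-functions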


【Solution 1】:

Here is my solution; it works correctly even when a customer has many orders in one day.

Some scripts to set up the test environment:

create table orders (Id varchar2(1), order_dt date);

insert into orders values('A',to_date('01/01/2020','dd/mm/yyyy'));
insert into orders values('B',to_date('01/01/2020','dd/mm/yyyy'));
insert into orders values('C',to_date('01/01/2020','dd/mm/yyyy'));
insert into orders values('D',to_date('01/01/2020','dd/mm/yyyy'));
insert into orders values('A',to_date('01/01/2020','dd/mm/yyyy'));
insert into orders values('B',to_date('01/01/2020','dd/mm/yyyy'));
insert into orders values('A',to_date('02/01/2020','dd/mm/yyyy'));
insert into orders values('B',to_date('02/01/2020','dd/mm/yyyy'));
insert into orders values('C',to_date('02/01/2020','dd/mm/yyyy'));
insert into orders values('B',to_date('02/01/2020','dd/mm/yyyy'));
insert into orders values('A',to_date('03/01/2020','dd/mm/yyyy'));
insert into orders values('B',to_date('03/01/2020','dd/mm/yyyy'));


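-- Collapse to one row per customer per day, then count the rows that fall
-- within one day either side of each date. A count of 3 means the previous
-- day, the current day, and the next day are all present.
select distinct id, count_days
from (
    select id,
           order_dt,
           count(*) over (partition by id
                          order by order_dt
                          range between 1 preceding and 1 following) count_days
    from orders
    group by id, order_dt
)
where count_days = 3;

-- Extra row to test a run longer than 3 consecutive days

insert into orders values('A',to_date('04/01/2020','dd/mm/yyyy'));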
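
The query above lists the matching customers but does not restrict the dates to January 2020 or return a single number, which is what the question asks for. A minimal sketch of one way to do both, assuming the same orders table; the alias customers_with_3_day_run is made up for illustration, and trunc() is added only in case order_dt carries a time component:

```sql
-- Sketch only: counts customers with at least one 3-day run inside January 2020.
select count(distinct id) as customers_with_3_day_run
from (
    select id,
           count(*) over (partition by id
                          order by order_dt
                          range between 1 preceding and 1 following) as count_days
    from (
        -- one row per customer per calendar day, January 2020 only
        select distinct id, trunc(order_dt) as order_dt
        from orders
        where order_dt >= date '2020-01-01'
          and order_dt <  date '2020-02-01'
    )
)
where count_days = 3;
```

With the sample rows from the setup script, customers A and B qualify, so this should return 2, with or without the extra 04/01/2020 row for A.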

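【Discussion】:

  • Your solution looks elegant and I would like to accept it as the answer. But when I try to run it I get error ORA-00922. Tested the same on the fiddle link.
  • @bzflag In the fiddle you have to split the script into separate statements. dbfiddle.uk/… So put the create table and each insert and select in a separate field.
  • This is great. I tested it on a data set with non-consecutive dates and it still works. Accepting this as the solution. Thanks.
【Solution 2】:

Hmmm . . . One approach is to use lead()/lag(). Assuming you have no duplicates within a day: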

select distinct id
from (select o.*,
             lag(order_dt) over (partition by id order by order_dt) as prev_order_dt,
             lag(order_dt, 2) over (partition by id order by order_dt) as prev_order_dt2
      from orders o
      where order_dt >= date '2020-01-01' and 
            order_dt < date '2020-02-01'
     ) o
where prev_order_dt = order_dt - interval '1' day and
      prev_order_dt2 = order_dt - interval '2' day;

EDIT:

If the table has duplicate records, the above is easily adjusted:

select distinct id
from (select o.*,
             lag(order_dt) over (partition by id order by order_dt) as prev_order_dt,
             lag(order_dt, 2) over (partition by id order by order_dt) as prev_order_dt2
      from (select distinct o.id, trunc(order_dt) as order_dt
            from orders o
            where order_dt >= date '2020-01-01' and 
                  order_dt < date '2020-02-01'
           ) o
     ) o
where prev_order_dt = order_dt - interval '1' day and
      prev_order_dt2 = order_dt - interval '2' day;
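
Once the inner query has reduced the data to one row per customer per day, the dates within each customer are strictly increasing whole days. So if the date two rows back is exactly 2 days earlier, the row in between has to be the day in the middle, and the single lag(order_dt, 2) check is enough. A sketch of that shortcut, which also returns the count the question asks for (the aliases d, r and customers are illustrative):

```sql
-- Sketch only: relies on the inner "select distinct ... trunc(order_dt)"
-- so that each customer has at most one row per calendar day.
select count(distinct r.id) as customers
from (select d.id,
             d.order_dt,
             lag(d.order_dt, 2) over (partition by d.id order by d.order_dt) as prev_order_dt2
      from (select distinct o.id, trunc(o.order_dt) as order_dt
            from orders o
            where o.order_dt >= date '2020-01-01' and
                  o.order_dt < date '2020-02-01'
           ) d
     ) r
where r.prev_order_dt2 = r.order_dt - 2;
```

To look for N consecutive days instead of 3, replace both occurrences of 2 with N - 1; no extra lag() columns are needed.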

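【Discussion】:

  • Actually, the table does have duplicate records (edited the question accordingly).
  • @bzflag . . . I adjusted the answer.
  • I tested the second solution and it runs as expected. Tested the solution here
【Solution 3】:

Why not join the table to itself twice, matching the same customer on each of the next two days? As long as you have an index on the customer ID and date, the joins should be optimized. Because each join has to match relative to the same start date, it either finds a row or it does not. If it does not, that start date is left out of the result set.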

select distinct 
      o1.id
   from
      orders o1
         JOIN orders o2
           on o1.id = o2.id
           AND o1.order_dt = o2.order_dt - interval '1' day
         JOIN orders o3
           on o1.id = o3.id
           AND o1.order_dt = o3.order_dt - interval '2' day
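
The index this answer relies on, and a variant of the join that is limited to January 2020 and returns a count, could look roughly like the sketch below. The index name orders_id_order_dt_ix is hypothetical, and the query assumes order_dt holds dates without a time component; whether the optimizer actually uses the index depends on the data and statistics.

```sql
-- Hypothetical index name; columns follow the join predicates (id, then order_dt).
create index orders_id_order_dt_ix on orders (id, order_dt);

-- Sketch only: o1 is the first day of the run, o2 and o3 the next two days.
select count(distinct o1.id) as customers
from orders o1
     join orders o2
       on o2.id = o1.id
      and o2.order_dt = o1.order_dt + 1
     join orders o3
       on o3.id = o1.id
      and o3.order_dt = o1.order_dt + 2
where o1.order_dt >= date '2020-01-01'
  and o1.order_dt <  date '2020-01-30';  -- a run must start by 29 January to end in January
```

Duplicate orders on the same day only multiply the joined rows; count(distinct o1.id) still counts each customer once.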

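【Discussion】:

  • I tested the solution and it is a perfect solution (easy to explain to users), but I chose another approach as the answer because it scales better when the range is widened; your approach needs more joins, which means more computation as it scales. Tested the solution [here] (dbfiddle.uk/…)
【Solution 4】:

You can use two window functions: one to compute the difference between consecutive dates, and a second with a sliding offset window (a RANGE frame in the query below) to count distinct consecutive dates. Example here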

with gen as (
  select 1 as cust_id, (date '2020-01-10') + 1 as q from dual union all
  select 1, (date '2020-01-10') + 2 as q from dual union all
  select 1, (date '2020-01-10') + 3 as q from dual union all
  select 1, (date '2020-01-10') + 3 as q from dual union all
  select 1, (date '2020-01-10') + 5 as q from dual union all
  select 1, (date '2020-01-10') + 7 as q from dual union all
  select 1, (date '2020-01-10') + 8 as q from dual union all
  select 1, (date '2020-01-10') + 9 as q from dual
)
, diff as (
  select gen.*
   , q - lag(q) over(partition by cust_id, trunc(q, 'mm') order by q asc) as datediff
  from gen
)
, window as (
  select diff.*
    , sum(decode(datediff, 1, 1, 0)) over(partition by cust_id, trunc(q, 'mm') order by q asc range between 2 preceding and current row) as cnt
  from diff
)
select sum(count(distinct q)) as cnt
from window
where cnt = 2
group by cust_id
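
The query above runs on generated rows and counts the dates that end a 3-day run. A hedged sketch of the same two-window idea applied to the orders table from the question, counting customers instead; the CTE names daily, diff and win are illustrative, and the duplicate rows are removed first so that every datediff within a run is exactly 1:

```sql
-- Sketch only: same technique as above, adapted to orders (id, order_dt).
with daily as (
  -- one row per customer per calendar day, January 2020 only
  select distinct id, trunc(order_dt) as order_dt
  from orders
  where order_dt >= date '2020-01-01'
    and order_dt <  date '2020-02-01'
)
, diff as (
  select daily.*
    , order_dt - lag(order_dt) over(partition by id order by order_dt asc) as datediff
  from daily
)
, win as (
  -- number of "one day after the previous row" steps in the last 3 calendar days
  select diff.*
    , sum(decode(datediff, 1, 1, 0)) over(partition by id order by order_dt asc range between 2 preceding and current row) as cnt
  from diff
)
select count(distinct id) as customers
from win
where cnt >= 2;
```

A value of cnt >= 2 on some day means that day and the two days before it are all present, so the customer has at least one run of 3 consecutive days.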

【Discussion】:
