【Posted】: 2020-08-25 13:56:56
【Question】:
I am trying to write a SQL statement to find which customers did not attend three events in a row.
Table 1 - Customers: Customer ID, Customer Name
+-------------+---------------+
| Customer ID | Customer Name |
+-------------+---------------+
| 01 | Customer 01 |
| 02 | Customer 02 |
| 03 | Customer 03 |
+-------------+---------------+
Table 2 - Events: Event ID, Event Date, Event Name
+----------+------------+------------+
| Event ID | Event Date | Event Name |
+----------+------------+------------+
| 01 | 01/01/2020 | Event 01 |
| 02 | 01/15/2020 | Event 02 |
| 03 | 02/15/2020 | Event 03 |
| 04 | 03/13/2020 | Event 04 |
| 05 | 05/17/2020 | Event 05 |
| 06 | 06/20/2020 | Event 06 |
+----------+------------+------------+
Table 3 - Event Activity: Event ID, Customer ID
+----------+-------------+
| Event ID | Customer ID |
+----------+-------------+
| 01 | 01 |
| 01 | 02 |
| 01 | 03 |
| 02 | 01 |
| 03 | 01 |
| 03 | 02 |
| 04 | 01 |
| 05 | 01 |
| 06 | 01 |
| 06 | 03 |
+----------+-------------+
What I am looking for are the customers who did not attend 3 events in a row.
In the sample above, that is Customer 2 and Customer 3.
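To see why Customer 2 and Customer 3 are the expected answer, it can help to lay the sample out as an attendance grid. The following is only a sketch: it inlines the three sample tables as CTEs (the names customers, events and activity are made up for this illustration) and assumes SQL Server 2017 or later, because it relies on string_agg.

```sql
-- Sketch only: sample rows inlined as CTEs; CTE names are illustrative.
with customers as (
    select v.CustID, v.CustName
    from (values (1, 'Customer 01'), (2, 'Customer 02'), (3, 'Customer 03'))
         as v(CustID, CustName)
),
events as (
    select v.EventID, cast(v.EventDate as date) as EventDate
    from (values (1, '2020-01-01'), (2, '2020-01-15'), (3, '2020-02-15'),
                 (4, '2020-03-13'), (5, '2020-05-17'), (6, '2020-06-20'))
         as v(EventID, EventDate)
),
activity as (
    select v.EventID, v.CustID
    from (values (1,1), (1,2), (1,3), (2,1), (3,1),
                 (3,2), (4,1), (5,1), (6,1), (6,3))
         as v(EventID, CustID)
)
-- One output row per customer: 'X' = attended, '-' = missed, in event-date order.
select c.CustName,
       string_agg(case when a.CustID is null then '-' else 'X' end, ' ')
           within group (order by e.EventDate) as attendance
from customers c
cross join events e                       -- every customer/event pair
left join activity a
       on a.CustID = c.CustID
      and a.EventID = e.EventID           -- null when the customer was absent
group by c.CustID, c.CustName
order by c.CustID;
```

On this sample the grid should read X X X X X X for Customer 01, X - X - - - for Customer 02 and X - - - - X for Customer 03, so Customers 2 and 3 both have a stretch of at least three missed events and never attend three in a row.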
I followed Steve's suggestion. Here is the updated SQL script:
drop table if exists dbo.customer;
create table dbo.customer(
CustID int not null,
CustName varchar(20) not null);
insert dbo.customer(CustID, CustName) values
(1,'Cust 1'),
(2,'Cust 2'),
(3,'Cust 3'),
(4,'Cust 4'),
(5,'Cust 5')
;
drop table if exists dbo.events;
create table dbo.events(
EventID int not null,
EventDate date not null,
EventName varchar(20) not null);
insert dbo.events(EventId, EventDate, EventName) values
(1,'2020-01-01','Event 1'),
(2,'2020-01-15','Event 2'),
(3,'2020-02-15','Event 3'),
(4,'2020-03-13','Event 4'),
(5,'2020-05-17','Event 5'),
(6,'2020-06-20','Event 6');
drop table if exists dbo.eventactivity;
create table dbo.eventactivity(
EventID int not null,
CustID int not null);
insert dbo.eventactivity(EventID, CustID) values
(1,1),
(1,2),
(1,3),
(1,4),
(1,5),
(2,1),
(2,2),
(2,4),
(2,5),
(3,1),
(3,5),
(4,1),
(4,5),
(5,1),
(5,2),
(5,3),
(5,5),
(6,1),
(6,2),
(6,3),
(6,5);
And here is the query:
;with
-- number the events in date order
events_sorted as (
    select e.*, row_number() over (order by EventDate) seq
    from dbo.events e
),
-- seq_break = 1 on a customer's first attended event or after a skipped event
activity_lag as (
    select
        a.*, e.seq,
        lag(e.seq, 1, 0) over (partition by CustId order by e.seq) lag_seq,
        iif(lag(e.seq, 1, 0) over (partition by CustId order by e.seq)=0, 1,
            iif((e.seq-lag(e.seq, 1, 0) over (partition by CustId order by e.seq))>1, 1, 0)) seq_break
    from dbo.eventactivity a
    join events_sorted e on a.EventID=e.EventID
),
-- running sum of breaks: each run of consecutively attended events gets its own seq_grp
activity_lag_sum as (
    select
        alag.*, sum(seq_break) over (partition by CustId order by alag.seq) seq_grp
    from activity_lag alag
),
-- customers with at least one run of 3 or more attended events
three_in_a_row_cte as (
    select distinct CustId
    from activity_lag_sum
    group by CustID, seq_grp
    having count(*)>=3
)
-- customers with no such run
select *
from customer c
where not exists(select 1
                 from three_in_a_row_cte r
                 where c.CustID=r.CustID);
The problem is that this returns Customers 2, 3 and 4. Customer 2 attended 2 events, skipped 2, then attended 2 more, so Customer 2 should not be in the list.
Any suggestions?
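One possible reading of the requirement, based on the remark that Customer 2 (who skipped only two events in a row) should not be listed, is "customers who missed at least three consecutive events". The sketch below follows that assumption rather than any confirmed answer: it builds every customer/event pair, keeps the missed ones, and groups consecutive misses with the row_number() difference technique. It assumes the dbo.customer, dbo.events and dbo.eventactivity tables from the setup script above exist and that event dates are unique; the CTE names are illustrative.

```sql
-- Sketch only: assumes the requirement means "missed 3 or more events in a row".
with events_sorted as (
    -- number the events in date order (assumes no two events share a date)
    select e.EventID,
           row_number() over (order by e.EventDate) as seq
    from dbo.events e
),
attendance as (
    -- one row per customer per event; attended = 1 when an activity row exists
    select c.CustID, es.seq,
           case when a.CustID is null then 0 else 1 end as attended
    from dbo.customer c
    cross join events_sorted es
    left join dbo.eventactivity a
           on a.CustID = c.CustID
          and a.EventID = es.EventID
),
missed as (
    -- consecutive missed events share the same (seq - row_number) value
    select CustID, seq,
           seq - row_number() over (partition by CustID order by seq) as grp
    from attendance
    where attended = 0
),
three_missed as (
    -- customers with at least one run of 3 or more missed events
    select distinct CustID
    from missed
    group by CustID, grp
    having count(*) >= 3
)
select c.CustID, c.CustName
from dbo.customer c
where exists (select 1
              from three_missed m
              where m.CustID = c.CustID)
order by c.CustID;
```

On the sample rows above this should return Cust 3 (missed events 2, 3 and 4) and Cust 4 (missed events 3 to 6), while Cust 2 drops out because its longest gap is two events. If the intended meaning is instead "never attended three events in a row", the original query already covers that case.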
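【Comments】:
-
Which DBMS product are you using? "SQL" is just a query language used by all relational databases, not the name of a specific database product. Please add a tag for the database product you are using. Why should I tag my DBMS
-
So you are looking for customers who have never attended three consecutive events? Or are you looking for customers who were missing from three consecutive events at least once?
-
Hi Jere, the first part: "customers who have not attended 3 events in a row", or in other words, customers who did not attend 3 consecutive events.
-
Have you tried anything???
Tags: sql sql-server gaps-and-islands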