【Posted】:2021-02-15 18:48:51
【Question】:
Consider the following:
create table submissions (
submission_date date,
submission_id int,
hacker_id int,
score int
);
create table hackers (
hacker_id int,
name varchar(20)
);
insert into submissions values
('2016-03-01', 8494, 20703, 0), ('2016-03-01', 22403, 53473, 15),
('2016-03-01', 23965, 79722, 60), ('2016-03-01', 30173, 36396, 70),
('2016-03-02', 34928, 20703, 0), ('2016-03-02', 38740, 15758, 60),
('2016-03-02', 42769, 79722, 25), ('2016-03-02', 44364, 79722, 60),
('2016-03-03', 45440, 20703, 0), ('2016-03-03', 49050, 36396, 70),
('2016-03-03', 50273, 79722, 5), ('2016-03-04', 50344, 20703, 0),
('2016-03-04', 51360, 44065, 90), ('2016-03-04', 54404, 53473, 65),
('2016-03-04', 61533, 79722, 45), ('2016-03-05', 72852, 20703, 0),
('2016-03-05', 74546, 38289, 0), ('2016-03-05', 76487, 62529, 0),
('2016-03-05', 82439, 36396, 10), ('2016-03-05', 90006, 36396, 40),
('2016-03-06', 90404, 20703, 0);
create table colleges (
college_id int,
contest_id int
);
insert into hackers values
(15758, 'Rose'),(20703, 'Angela'),
(36396,'Frank'),(38289, 'Patrick'),
(44065, 'Lisa'),(53473,'Kimberly'),
(62529, 'Bonnie'),(79722, 'Michael');
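For this HackerRank quiz:
Julia conducted a "15 Days of Learning SQL" contest. The contest started on March 1, 2016 and ended on March 15, 2016.
Write a query to print the total number of unique hackers who made at least 1 submission each day (starting on the first day of the contest), and find the hacker_id and name of the hacker who made the maximum number of submissions each day. If more than one such hacker has the maximum number of submissions, print the lowest hacker_id. The query should print this information for each day of the contest, sorted by date.
This is the solution I am trying to understand: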
SELECT submission_date,
(
SELECT COUNT(DISTINCT hacker_id)
FROM Submissions AS SUB2
WHERE SUB2.submission_date = SUB1.submission_date AND
(SELECT COUNT(DISTINCT submission_date)
FROM Submissions AS SUB3
WHERE (SUB3.hacker_id = SUB2.hacker_id) AND
(SUB3.submission_date < SUB1.submission_date))
= DATEDIFF(SUB1.submission_date, '2016-03-01' )
),
(SELECT hacker_id FROM Submissions SUB4
WHERE SUB4.submission_date = SUB1.submission_date
GROUP BY hacker_id
ORDER BY COUNT(submission_id) DESC, hacker_id LIMIT 1) AS HID,
(SELECT name FROM Hackers
WHERE hacker_id = HID)
FROM
(SELECT DISTINCT(submission_date)
FROM Submissions) AS SUB1
There are two parts I cannot understand:
Part 1
SELECT COUNT(DISTINCT hacker_id)
FROM Submissions AS SUB2
WHERE SUB2.submission_date = SUB1.submission_date AND
(SELECT COUNT(DISTINCT submission_date)
FROM Submissions AS SUB3
WHERE (SUB3.hacker_id = SUB2.hacker_id) AND
(SUB3.submission_date < SUB1.submission_date))
= DATEDIFF(SUB1.submission_date, '2016-03-01' )
)
My question about the code above is how this part:
SELECT COUNT(DISTINCT hacker_id) FROM Submissions AS SUB2 WHERE
SUB2.submission_date = SUB1.submission_date
works together with
(SELECT
COUNT(DISTINCT submission_date) FROM Submissions AS SUB3 WHERE
(SUB3.hacker_id = SUB2.hacker_id) AND (SUB3.submission_date <
SUB1.submission_date)) = DATEDIFF(SUB1.submission_date,
'2016-03-01' ))
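The first part returns all the unique hacker_ids for a given submission date, and the second part checks whether a hacker_id has submitted on every day before that date. But how does SQL ensure that the second part (after the AND) only checks the hacker_ids that are present in the first part (before the AND)?
Could you give an example of how these two queries work together?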
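One way to picture how the two conditions interact is to read the inner SELECT as a correlated subquery: conceptually, it is re-evaluated once for each SUB2 row that passes the date test, using that row's hacker_id. The Python sketch below emulates that row-by-row reading with plain loops. It assumes only the first three days of the sample rows above, and the names `SUBMISSIONS`, `CONTEST_START`, `consistent_hackers`, and `count_per_day` are hypothetical helpers for illustration, not part of the original query. A real engine may choose a different execution plan, but the result should match this reading.

```python
from datetime import date

# First three days of the sample rows from the INSERT above:
# (submission_date, submission_id, hacker_id, score)
SUBMISSIONS = [
    (date(2016, 3, 1), 8494, 20703, 0),
    (date(2016, 3, 1), 22403, 53473, 15),
    (date(2016, 3, 1), 23965, 79722, 60),
    (date(2016, 3, 1), 30173, 36396, 70),
    (date(2016, 3, 2), 34928, 20703, 0),
    (date(2016, 3, 2), 38740, 15758, 60),
    (date(2016, 3, 2), 42769, 79722, 25),
    (date(2016, 3, 2), 44364, 79722, 60),
    (date(2016, 3, 3), 45440, 20703, 0),
    (date(2016, 3, 3), 49050, 36396, 70),
    (date(2016, 3, 3), 50273, 79722, 5),
]
CONTEST_START = date(2016, 3, 1)


def consistent_hackers(day):
    """Emulate the Part 1 subquery for one SUB1.submission_date value."""
    kept = set()  # the set's size plays the role of COUNT(DISTINCT hacker_id)
    for sub2_date, _, sub2_hacker, _ in SUBMISSIONS:  # FROM Submissions AS SUB2
        if sub2_date != day:  # SUB2.submission_date = SUB1.submission_date
            continue
        # Inner subquery, re-evaluated for THIS SUB2 row's hacker_id:
        # COUNT(DISTINCT submission_date) over earlier days of the same hacker.
        earlier_days = {
            sub3_date
            for sub3_date, _, sub3_hacker, _ in SUBMISSIONS
            if sub3_hacker == sub2_hacker and sub3_date < day
        }
        # ... = DATEDIFF(SUB1.submission_date, '2016-03-01')
        if len(earlier_days) == (day - CONTEST_START).days:
            kept.add(sub2_hacker)
    return kept


def count_per_day():
    """One result per distinct submission date, like the outer SUB1 loop."""
    days = sorted({row[0] for row in SUBMISSIONS})
    return {d.isoformat(): len(consistent_hackers(d)) for d in days}


if __name__ == "__main__":
    print(count_per_day())
```

With these rows, 2016-03-02 keeps hackers 20703 and 79722. Hacker 15758 also submitted that day but has no earlier submission date, so its inner count (0) does not equal the DATEDIFF value (1) and the row is filtered out.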
Part 2
For this part:
(SELECT hacker_id FROM Submissions SUB4 WHERE
SUB4.submission_date = SUB1.submission_date GROUP BY hacker_id
ORDER BY COUNT(submission_id) DESC, hacker_id LIMIT 1) AS HID
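How does it check only the hacker_ids that have submitted on every date up to the current date, then group those hacker_ids' submissions, and then select the lowest hacker_id with the most submissions?
An MRE for this question.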
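For reference, the Part 2 subquery as written contains only the date condition, a GROUP BY, and an ORDER BY with LIMIT 1; it has no condition on earlier dates. A minimal Python sketch of what it computes is shown below, again assuming only the first three days of the sample rows. The names `SUBMISSIONS` and `top_hacker` are hypothetical and used only for illustration.

```python
from collections import Counter
from datetime import date

# First three days of the sample rows from the INSERT above:
# (submission_date, submission_id, hacker_id, score)
SUBMISSIONS = [
    (date(2016, 3, 1), 8494, 20703, 0),
    (date(2016, 3, 1), 22403, 53473, 15),
    (date(2016, 3, 1), 23965, 79722, 60),
    (date(2016, 3, 1), 30173, 36396, 70),
    (date(2016, 3, 2), 34928, 20703, 0),
    (date(2016, 3, 2), 38740, 15758, 60),
    (date(2016, 3, 2), 42769, 79722, 25),
    (date(2016, 3, 2), 44364, 79722, 60),
    (date(2016, 3, 3), 45440, 20703, 0),
    (date(2016, 3, 3), 49050, 36396, 70),
    (date(2016, 3, 3), 50273, 79722, 5),
]


def top_hacker(day):
    """Emulate the HID subquery for one SUB1.submission_date value."""
    # WHERE SUB4.submission_date = SUB1.submission_date ... GROUP BY hacker_id
    counts = Counter(h for d, _, h, _ in SUBMISSIONS if d == day)
    # ORDER BY COUNT(submission_id) DESC, hacker_id LIMIT 1:
    # highest count first, ties broken by the lowest hacker_id.
    return min(counts, key=lambda h: (-counts[h], h))


if __name__ == "__main__":
    for d in sorted({row[0] for row in SUBMISSIONS}):
        print(d.isoformat(), top_hacker(d))
```

On 2016-03-02, hacker 15758 is still a candidate even though it skipped 2016-03-01; hacker 79722 is selected because it has two submissions that day. On the other two days every hacker has one submission, so the lowest hacker_id (20703) is selected.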
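【Comments】:
- DISTINCT is not a function, so SELECT DISTINCT(submission_date) is a bit silly.
- Here we are selecting the distinct/unique values from the "submission_date" column.
- Then that is just SELECT DISTINCT submission_date
- Now I get it! Since it is not a function, the column name does not need to be in parentheses. Thanks @Strawberry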