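【Question title】: Find the total number of unique hackers who made at least one submission every day, and the hacker_id who made the most submissions each day
【Posted】: 2019-12-17 08:46:37
【Question】:

Find the total number of unique hackers who made at least one submission on every day (starting from the first day of the contest), and find the hacker_id and name of the hacker who made the most submissions each day. If more than one hacker has the maximum number of submissions, print the lowest hacker_id. The query should print this information for each day of the contest, sorted by date.

Here is the sample data.

Hackers table:
hacker_id   name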
15758   Rose
20703   Angela
36396   Frank
38289   Patrick
44065   Lisa
53473   Kimberly
62529   Bonnie
79722   Michael

Submissions table:
Submission_date submission_id hacker_id score
3/1/2016    8494    20703   0
3/1/2016    22403   53473   15
3/1/2016    23965   79722   60
3/1/2016    30173   36396   70
3/2/2016    34928   20703   0
3/2/2016    38740   15758   60
3/2/2016    42769   79722   25
3/2/2016    44364   79722   60
3/3/2016    45440   20703   0
3/3/2016    49050   36396   70
3/3/2016    50273   79722   5
3/4/2016    50344   20703   0
3/4/2016    51360   44065   90
3/4/2016    54404   53473   65
3/4/2016    61533   79722   45
3/5/2016    72852   20703   0
3/5/2016    74546   38289   0
3/5/2016    76487   62529   0
3/5/2016    82439   36396   10
3/5/2016    90006   36396   40
3/6/2016    90404   20703   0 

For the above data, the expected result is:
2016-03-01 4 20703 Angela
2016-03-02 2 79722 Michael
2016-03-03 2 20703 Angela
2016-03-04 2 20703 Angela
2016-03-05 1 36396 Frank
2016-03-06 1 20703 Angela
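
To make the definition behind these numbers concrete, here is a minimal Python sketch (not part of the original question) that derives the expected result from the sample data. The names NAMES, SUBMISSIONS_BY_DAY, and daily_report are made up for illustration, and the dates are rewritten as ISO strings so that they sort chronologically. The second column counts hackers who have submitted on every day so far, not just hackers who submitted on that day.

```python
from collections import Counter

# Hacker names and, per day, one hacker_id per submission (from the sample data).
NAMES = {15758: "Rose", 20703: "Angela", 36396: "Frank", 38289: "Patrick",
         44065: "Lisa", 53473: "Kimberly", 62529: "Bonnie", 79722: "Michael"}
SUBMISSIONS_BY_DAY = {
    "2016-03-01": [20703, 53473, 79722, 36396],
    "2016-03-02": [20703, 15758, 79722, 79722],
    "2016-03-03": [20703, 36396, 79722],
    "2016-03-04": [20703, 44065, 53473, 79722],
    "2016-03-05": [20703, 38289, 62529, 36396, 36396],
    "2016-03-06": [20703],
}

def daily_report(submissions_by_day, names):
    """Return (date, streak_count, top_hacker_id, top_name) per day, in date order."""
    report = []
    streak = None  # hackers who have submitted on every day seen so far
    for day in sorted(submissions_by_day):
        ids = submissions_by_day[day]
        todays = set(ids)
        streak = todays if streak is None else streak & todays
        counts = Counter(ids)
        # Most submissions first; ties go to the lowest hacker_id.
        top = min(counts, key=lambda h: (-counts[h], h))
        report.append((day, len(streak), top, names[top]))
    return report

for row in daily_report(SUBMISSIONS_BY_DAY, NAMES):
    print(*row)
```

On March 3, for example, 36396 is excluded from the count because that hacker skipped March 2, which leaves only 20703 and 79722.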

My query below does not give me the unique hacker_ids:

select submission_date, cnt, hacker_id, name from 
(select s.submission_date
, count(s.hacker_id) over(partition by s.submission_date) cnt
, row_number() over(partition by s.submission_date order by s.hacker_id asc) rn
, s.hacker_id, h.name from submissions s
inner join hackers h on h.hacker_id = s.hacker_id) as tble
where tble.rn = 1;

How can I get the unique hacker_ids in the result above?

【Comments】:

  • What about: select distinct s.hacker_id from submissions s inner join hackers h on h.hacker_id = s.hacker_id) as tble where tble.rn = 1;

Tags: sql sql-server window-functions


【Solution 1】:

You can use two levels of aggregation:

select s.submission_date, count(*) as num_hackers, sum(s.cnt) as num_hacks,
       max(case when s.seqnum = 1 then h.hacker_id end) as hacker_id,
       max(case when s.seqnum = 1 then h.name end) as name
from (select s.submission_date, s.hacker_id, count(*) as cnt,
             -- rank each day's hackers by submission count; ties go to the lowest hacker_id
             row_number() over (partition by s.submission_date order by count(*) desc, s.hacker_id) as seqnum
      from submissions s
      group by s.submission_date, s.hacker_id
     ) s join
     hackers h
     on h.hacker_id = s.hacker_id
group by s.submission_date
order by s.submission_date;

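Note that the subquery aggregates by date and hacker_id, so there is one row per hacker_id per date. The count(*) in the outer query counts these rows, which is the number of hackers who submitted on that day. I also included the number of hacks (submissions).

EDIT:

I realized that you can compute additional analytic functions in the subquery, which simplifies the logic: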

select s.submission_date, s.num_hackers, s.num_hacks,
       h.hacker_id, h.name
from (select s.submission_date, s.hacker_id, count(*) as cnt,
             sum(count(*)) over (partition by s.submission_date) as num_hacks,
             count(*) over (partition by s.submission_date) as num_hackers,
             -- ties go to the lowest hacker_id, as the question requires
             row_number() over (partition by s.submission_date order by count(*) desc, s.hacker_id) as seqnum
      from submissions s
      group by s.submission_date, s.hacker_id
     ) s join
     hackers h
     on h.hacker_id = s.hacker_id
where s.seqnum = 1
order by s.submission_date;

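【Comments】:

  • The total number of unique hackers for March 2nd through 5th is 2, because only 20703 & 79722 were consistent from day 1 to March 5th. Your query gives me a total of 3 for March 2nd & 3rd, and 4 for March 4th & 5th.
  • @gabs . . . Please set up a dbfiddle. But I see 3 on March 2nd (20703, 79722, 15758), so "3" looks correct.
  • I tried to set up a db fiddle. In this case, though, unique hackers means the unique hackers who have submitted on every day since day 1. Even though (20703, 79722, 15758) all submitted on March 2nd, only 20703 and 79722 have been submitting since March 1st.
  • I'm new to this and not sure how to format the date type. Please advise: dbfiddle.uk/…
  • @gabs . . . I am lost as to what you are trying to explain. I think this answers the question you actually asked (unique hackers on each day). It does not seem to do what you want. Can you ask another question with a clearer explanation?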
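
The disagreement in these comments comes down to two readings of "unique hackers". The following Python sketch (an illustration only, with made-up names HACKERS_BY_DAY and compare_counts) computes both counts side by side from the sample data: the number of distinct hackers who submitted on each day, and the number who have submitted on every day since the first.

```python
# Distinct hacker_ids that submitted on each day, copied from the sample data.
# Dates are written as ISO strings so that sorting them is chronological.
HACKERS_BY_DAY = {
    "2016-03-01": {20703, 53473, 79722, 36396},
    "2016-03-02": {20703, 15758, 79722},
    "2016-03-03": {20703, 36396, 79722},
    "2016-03-04": {20703, 44065, 53473, 79722},
    "2016-03-05": {20703, 38289, 62529, 36396},
    "2016-03-06": {20703},
}

def compare_counts(hackers_by_day):
    """Return (date, distinct_that_day, submitted_every_day_so_far) per day."""
    rows, streak = [], None
    for day in sorted(hackers_by_day):
        todays = hackers_by_day[day]
        # Keep only hackers who were also present on every earlier day.
        streak = set(todays) if streak is None else streak & todays
        rows.append((day, len(todays), len(streak)))
    return rows

for row in compare_counts(HACKERS_BY_DAY):
    print(*row)
```

The middle column (4, 3, 3, 4, 4, 1) matches the totals the asker reports getting from the query above, while the last column (4, 2, 2, 2, 1, 1) matches the expected result in the question.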
【Solution 2】:
select big_1.submission_date, big_1.hkr_cnt, big_2.hacker_id, h.name
from (select submission_date, count(distinct hacker_id) as hkr_cnt
      from (select s.*,
                   dense_rank() over (order by submission_date) as date_rank,
                   --row_number() over (order by submission_date) as rn_date_rank,
                   dense_rank() over (partition by hacker_id order by submission_date) as hacker_rank
                   --,row_number() over (partition by hacker_id order by submission_date) as rn_hacker_rank
            from submissions s
           ) a
      where a.date_rank = a.hacker_rank
      group by submission_date
     ) big_1
join (select submission_date, hacker_id,
             rank() over (partition by submission_date order by sub_cnt desc, hacker_id) as max_rank
      from (select submission_date, hacker_id, count(*) as sub_cnt
            from submissions
            group by submission_date, hacker_id
           ) b
     ) big_2
  on big_1.submission_date = big_2.submission_date and big_2.max_rank = 1
join hackers h on h.hacker_id = big_2.hacker_id
order by big_1.submission_date;
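
This answer was posted without an explanation. As far as I can tell, it rests on the comparison date_rank = hacker_rank: date_rank numbers the contest days, hacker_rank numbers the distinct days on which a given hacker submitted, and the two agree on a given day only if the hacker has not missed any day so far. The sketch below is one way to check the query against the sample data. It assumes Python's built-in sqlite3 module backed by SQLite 3.25 or later (for window functions) rather than SQL Server, stores dates as ISO strings, and uses a made-up helper name, run_solution_2.

```python
import sqlite3

HACKERS = [(15758, "Rose"), (20703, "Angela"), (36396, "Frank"), (38289, "Patrick"),
           (44065, "Lisa"), (53473, "Kimberly"), (62529, "Bonnie"), (79722, "Michael")]
SUBMISSIONS = [  # (submission_date, submission_id, hacker_id, score)
    ("2016-03-01", 8494, 20703, 0), ("2016-03-01", 22403, 53473, 15),
    ("2016-03-01", 23965, 79722, 60), ("2016-03-01", 30173, 36396, 70),
    ("2016-03-02", 34928, 20703, 0), ("2016-03-02", 38740, 15758, 60),
    ("2016-03-02", 42769, 79722, 25), ("2016-03-02", 44364, 79722, 60),
    ("2016-03-03", 45440, 20703, 0), ("2016-03-03", 49050, 36396, 70),
    ("2016-03-03", 50273, 79722, 5), ("2016-03-04", 50344, 20703, 0),
    ("2016-03-04", 51360, 44065, 90), ("2016-03-04", 54404, 53473, 65),
    ("2016-03-04", 61533, 79722, 45), ("2016-03-05", 72852, 20703, 0),
    ("2016-03-05", 74546, 38289, 0), ("2016-03-05", 76487, 62529, 0),
    ("2016-03-05", 82439, 36396, 10), ("2016-03-05", 90006, 36396, 40),
    ("2016-03-06", 90404, 20703, 0),
]

QUERY = """
select big_1.submission_date, big_1.hkr_cnt, big_2.hacker_id, h.name
from (select submission_date, count(distinct hacker_id) as hkr_cnt
      from (select s.*,
                   dense_rank() over (order by submission_date) as date_rank,
                   dense_rank() over (partition by hacker_id
                                      order by submission_date) as hacker_rank
            from submissions s) a
      where a.date_rank = a.hacker_rank  -- hacker has not skipped a day so far
      group by submission_date) big_1
join (select submission_date, hacker_id,
             rank() over (partition by submission_date
                          order by sub_cnt desc, hacker_id) as max_rank
      from (select submission_date, hacker_id, count(*) as sub_cnt
            from submissions
            group by submission_date, hacker_id) b) big_2
  on big_1.submission_date = big_2.submission_date and big_2.max_rank = 1
join hackers h on h.hacker_id = big_2.hacker_id
order by big_1.submission_date
"""

def run_solution_2():
    """Load the sample data into an in-memory SQLite database and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table hackers (hacker_id integer primary key, name text)")
    conn.execute("create table submissions (submission_date text, submission_id integer, "
                 "hacker_id integer, score integer)")
    conn.executemany("insert into hackers values (?, ?)", HACKERS)
    conn.executemany("insert into submissions values (?, ?, ?, ?)", SUBMISSIONS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows

for row in run_solution_2():
    print(*row)
```

Using dense_rank rather than row_number matters here: a hacker with two submissions on the same day (79722 on March 2) gets the same rank for both rows, so the comparison with date_rank still holds.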
