【Question title】: SQL count distinct per group divided by count distinct of total
【Posted】: 2021-05-09 13:40:08
【Question description】:

I have:

id value
1 123
1 124
1 125
2 126
2 127
2 127
3 128
3 128
3 128

I want an aggregation like this:

id distinct_count total_distinct percentage
1 3 6 0.5
2 2 6 0.33
3 1 6 0.167

I tried applying a window OVER clause like this:

SELECT id,
       COUNT(DISTINCT value) AS distinct_count,
       COUNT(DISTINCT value) OVER () AS total_distinct,
       COUNT(DISTINCT value) / COUNT(DISTINCT value) OVER () AS percentage
FROM have
GROUP BY id

But DISTINCT inside a window function does not seem to be implemented yet.

Is there a way to achieve this without a join?

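【Comments】:

  • What result did you get?
  • I got an error: DISTINCT in window function parameters is not yet supported
  • See this SO question for a working example with a SUM aggregate: stackoverflow.com/questions/46909494/…
  • Why ask this question if you can find the answer yourself within 10 minutes?
  • Not sure what you mean, @Luuk. I could not find the answer, hence the question.

Tags: sql distinct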


【Solution 1】:

You can compute the total with a scalar subquery:

SELECT id,
       COUNT(DISTINCT value) AS distinct_count,
       (SELECT COUNT(DISTINCT value) FROM have) AS total_distinct,
       (0.0+COUNT(DISTINCT value)) / (SELECT COUNT(DISTINCT value) FROM have) AS percentage
FROM have
GROUP BY id

Or compute the total once in a CTE:

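-- The CTE column is named total_distinct so that it does not clash with have.value.
WITH cte AS (SELECT COUNT(DISTINCT value) AS total_distinct FROM have)
SELECT
       id,
       COUNT(DISTINCT value) AS distinct_count,
       cte.total_distinct AS total_distinct,
       (0.0+COUNT(DISTINCT value)) / cte.total_distinct AS percentage
FROM have
-- CROSS APPLY is not available in every engine; CROSS JOIN cte is equivalent
-- here because cte is not correlated with have.
CROSS APPLY cte
GROUP BY cte.total_distinct, id;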
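
The scalar-subquery form can be checked against the sample data from the question. The following is a minimal sketch that assumes SQLite through Python's standard `sqlite3` module, purely as a convenient test bed; the thread does not say which engine the asker uses, and the `ORDER BY id` is added only to make the output order predictable.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE have (id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO have VALUES (?, ?)",
    [(1, 123), (1, 124), (1, 125),
     (2, 126), (2, 127), (2, 127),
     (3, 128), (3, 128), (3, 128)],
)

# The total is computed by an uncorrelated scalar subquery over the whole
# table; multiplying by 1.0 forces floating-point division.
query = """
SELECT id,
       COUNT(DISTINCT value) AS distinct_count,
       (SELECT COUNT(DISTINCT value) FROM have) AS total_distinct,
       1.0 * COUNT(DISTINCT value)
           / (SELECT COUNT(DISTINCT value) FROM have) AS percentage
FROM have
GROUP BY id
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On the sample data this gives distinct counts of 3, 2 and 1 against a total of 6, which matches the table the question asks for.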

【Comments】:

  • Thanks! For some reason I was too focused on the OVER clause.
【Solution 2】:

Another approach is to enumerate the values with ROW_NUMBER() and use conditional aggregation:

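-- seqnum_iv = 1 marks the first row of each (id, value) pair;
-- seqnum_v = 1 marks the first row of each value across the whole table.
-- The per-group sums of seqnum_v = 1 are added up with SUM(...) OVER ()
-- so that every row shows the overall total, not a per-id count.
SELECT id,
       SUM(CASE WHEN seqnum_iv = 1 THEN 1 ELSE 0 END) as distinct_count,
       SUM(SUM(CASE WHEN seqnum_v = 1 THEN 1 ELSE 0 END)) OVER () as total_distinct_count,
       (SUM(CASE WHEN seqnum_iv = 1 THEN 1.0 ELSE 0 END) /
        SUM(SUM(CASE WHEN seqnum_v = 1 THEN 1.0 ELSE 0 END)) OVER ()
       ) as ratio
FROM (SELECT h.*,
             ROW_NUMBER() OVER (PARTITION BY id, value ORDER BY value) as seqnum_iv,
             ROW_NUMBER() OVER (PARTITION BY value ORDER BY value) as seqnum_v
      FROM have h
     ) h
GROUP BY id;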

This may be faster than the approaches that use subqueries, because the table is read only once.
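
The row-number approach can be tried on the question's sample data in the same way. This is a rough sketch that assumes SQLite 3.25 or later (the first release with window functions) through Python's `sqlite3` module. It computes the overall total by summing the per-group first-occurrence counts with a window `SUM(...) OVER ()` applied to the grouped result; the column alias `total_distinct` and the `ORDER BY id` are choices made for this illustration.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE have (id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO have VALUES (?, ?)",
    [(1, 123), (1, 124), (1, 125),
     (2, 126), (2, 127), (2, 127),
     (3, 128), (3, 128), (3, 128)],
)

# seqnum_iv = 1 flags the first row of each (id, value) pair, so summing the
# flags per id counts distinct values within the group.
# seqnum_v = 1 flags the first row of each value in the whole table; the
# per-group sums of that flag are totalled across groups by SUM(...) OVER ().
query = """
SELECT id,
       SUM(CASE WHEN seqnum_iv = 1 THEN 1 ELSE 0 END) AS distinct_count,
       SUM(SUM(CASE WHEN seqnum_v = 1 THEN 1 ELSE 0 END)) OVER () AS total_distinct,
       SUM(CASE WHEN seqnum_iv = 1 THEN 1.0 ELSE 0 END)
           / SUM(SUM(CASE WHEN seqnum_v = 1 THEN 1.0 ELSE 0 END)) OVER () AS ratio
FROM (SELECT h.*,
             ROW_NUMBER() OVER (PARTITION BY id, value ORDER BY value) AS seqnum_iv,
             ROW_NUMBER() OVER (PARTITION BY value ORDER BY value) AS seqnum_v
      FROM have h) h
GROUP BY id
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Whether this actually beats the subquery versions depends on the engine and the data, so it is worth comparing query plans and runtimes on the real table.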

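【Comments】:

  • Thanks, I will compare the runtimes, since this runs on large transactional data.
  • @Grizzly2501 I missed something the first time I answered. I have fixed the answer.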