Rank based on counts (HiveQL)

【Question title】: rank based on counts (hiveql)
【Posted】: 2021-01-21 18:10:52
【Question】:

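I want to rank conversation ids by how often they occur, so the most frequent one gets rank 1, the second most frequent rank 2, the third rank 3, and so on.

I'm getting an error, so something is probably wrong with my query: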

select 
    conversationid, 
    rank() over (partition by conversationid order by count(*) desc) as rnk
  from my_table
  group by conversationid

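Error while compiling statement: FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies. Underlying error: org.apache.hadoop.hive.ql.parse.SemanticException: line 7:54 Not yet supported place for UDAF 'count'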

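【Comments on the question】:

Which error exactly are you getting? Please edit your question to include this important information. Sample data and expected results would also help us understand what you are trying to achieve.
Just updated, thanks.

Tags: hive count hql hiveql window-functions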

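【Solution 1】:

If you want to rank conversations by their count, you do not want a partition by clause in the window function. After group by conversationid there is exactly one row per conversation, so partitioning by conversationid would put every row in its own partition and every rank would be 1. Order the whole result by the count instead: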

select conversationid, rank() over(order by count(*) desc) rnk
from my_table
group by conversationid

This assigns rank 1 to the most frequent conversation.
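
If your Hive version still rejects an aggregate inside the over clause (as far as I know, the Hive documentation lists aggregates in the over clause as supported from Hive 2.1.0 onward), one workaround is to compute the counts in a subquery and rank them in the outer query. A minimal sketch, assuming the table is called my_table as in the question; the aliases cnt and t are made up for illustration:

```sql
-- Inner query: one row per conversation with its number of occurrences.
-- Outer query: rank those rows by the count, highest count first.
select
    t.conversationid,
    t.cnt,
    rank() over (order by t.cnt desc) as rnk
from (
    select conversationid, count(*) as cnt
    from my_table
    group by conversationid
) t
order by rnk;
```

Note that rank() gives tied counts the same rank and then skips ahead (1, 1, 3). If you would rather have consecutive ranks (1, 1, 2), dense_rank() can be swapped in without other changes.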

【Comments】: