【Question Title】: Hive Explode / Lateral View Table
【Posted】: 2019-12-27 09:47:24
【Question】:

Table: jd1 (comparison table)

Table: data1 (new values table)

I wrote this query in SQL Server and it works there, but Hive reports an error.

select * from data1;

1   siva    hadoop
1   siva    hive
1   siva    spark
1   siva    hbase
1   siva    mapreduce
1   siva    hdfs
2   kumar   hadoop
2   kumar   hive
2   kumar   python
2   kumar   spark
3   naveen  hive
3   naveen  hadoop
3   naveen  flume
3   naveen  kafka

select * from jd1;

1   hadoop
1   hive
1   spark
1   hbase
1   mapreduce
1   hdfs
1   python
1   java  
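
To reproduce the problem, the two tables can be recreated roughly as follows. The column names (id, name, skills) come from the query and the error messages quoted further down; the column types are assumptions, and INSERT ... VALUES needs a Hive version that supports it (0.14 or later).

```sql
-- Column types are assumptions; the post only shows names and values.
CREATE TABLE data1 (id INT, name STRING, skills STRING);
CREATE TABLE jd1   (id INT, skills STRING);

INSERT INTO TABLE data1 VALUES
  (1, 'siva',   'hadoop'), (1, 'siva',   'hive'),   (1, 'siva',   'spark'),
  (1, 'siva',   'hbase'),  (1, 'siva',   'mapreduce'), (1, 'siva', 'hdfs'),
  (2, 'kumar',  'hadoop'), (2, 'kumar',  'hive'),   (2, 'kumar',  'python'),
  (2, 'kumar',  'spark'),
  (3, 'naveen', 'hive'),   (3, 'naveen', 'hadoop'), (3, 'naveen', 'flume'),
  (3, 'naveen', 'kafka');

INSERT INTO TABLE jd1 VALUES
  (1, 'hadoop'), (1, 'hive'), (1, 'spark'), (1, 'hbase'),
  (1, 'mapreduce'), (1, 'hdfs'), (1, 'python'), (1, 'java');
```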

Expected output

1   siva    6   85.71428571428571
2   kumar   4   57.142857142857146
3   naveen  2   28.571428571428573

My query

select id, name, count(*), ((count(*)*100)/(select count(skills)from jd1))avg
from (select n.id, n.name, n.skills
      from data1 n join jd1 t on (n.skills=t.skills))a
group by id,name;

Error

FAILED: ParseException line 1:44 cannot recognize input near 'select' 'count' '(' in expression specification

【Question Discussion】:

    Tags: hive hiveql


    【Solution 1】:
    -- skill_cnt is the same on every row, but Hive still requires it in GROUP BY
    select id, name, count(*) cnt, count(*)*100/skill_cnt cnt_pct
    from (select n.id, n.name, n.skills, t.skill_cnt
          from data1 n
               inner join (select skills, count(*) over() skill_cnt from jd1) t
                          on n.skills=t.skills
         ) a
    group by id, name, skill_cnt;
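
The query above relies on a window function with an empty OVER() clause: the window is then the whole result set, so every row of jd1 carries the total row count, and the join passes that total on to each matching row of data1. The sketch below first runs the inner subquery on its own and then shows one possible variant of the outer query. Hive generally rejects a selected column that is neither aggregated nor listed in GROUP BY, so this variant wraps the constant skill_cnt in max() in order to group by id and name only. It is an illustration under the table layout shown in the question, not the answerer's exact wording.

```sql
-- Inner subquery alone: each jd1 row gets the same total,
-- because the empty OVER() makes the window the whole result set.
SELECT skills, count(*) OVER () AS skill_cnt
FROM jd1;

-- Variant of the outer query: max() turns the per-row constant
-- into an aggregate, so GROUP BY can stay at id and name.
SELECT a.id,
       a.name,
       count(*)                          AS cnt,
       count(*) * 100 / max(a.skill_cnt) AS cnt_pct
FROM (
    SELECT n.id, n.name, n.skills, t.skill_cnt
    FROM data1 n
    INNER JOIN (SELECT skills, count(*) OVER () AS skill_cnt FROM jd1) t
            ON n.skills = t.skills
) a
GROUP BY a.id, a.name;
```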
    

    【Discussion】:

      【Solution 2】:

      You can try the following query:

      SELECT n.id, n.name, COUNT(n.skills), COUNT(n.skills)*100/skill_cnt.total_skill
      FROM data1 n
      JOIN jd1 t ON n.skills=t.skills
      CROSS JOIN (SELECT COUNT(*) total_skill FROM jd1) skill_cnt
      GROUP BY n.id, n.name, total_skill
      

      【Discussion】:

      • This shows the error: FAILED: SemanticException [Error 10025]: Line 3:32 Expression not in GROUP BY key 'TOTAL_SKILLS'
      • @Siva, try it now.
      • @Ankit Bajpai. FAILED: SemanticException [Error 10004]: Line 6:23 Invalid table alias or column reference 'TOTAL_SKILLS': (possible column names are: n.id, n.name, n.skills, t.id, t.skills)
      • @Ankit Bajpai. FAILED: ParseException line 1:42 cannot recognize input near 'SELECT' 'COUNT' '(' in expression specification
      • Can you try it now?
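
The two SemanticExceptions quoted in the comments are consistent with two separate mistakes: a selected column that is neither aggregated nor listed in GROUP BY (Error 10025), and a reference to a name that does not match the alias defined in the subquery (Error 10004). The ParseException matches the one in the question, where a subquery appears inside the SELECT list. The sketch below is one way to avoid all three: the total is computed once in a single-row subquery in the FROM clause, the same alias is used everywhere, and that alias is included in GROUP BY. The output aliases matched_skills and matched_pct are illustrative names that do not appear in the original answer, and some Hive configurations reject Cartesian products in strict mode, in which case the CROSS JOIN would need to be allowed first.

```sql
SELECT n.id,
       n.name,
       COUNT(n.skills)                               AS matched_skills,
       -- scale to a percentage, as in the expected output of the question
       COUNT(n.skills) * 100 / skill_cnt.total_skill AS matched_pct
FROM data1 n
JOIN jd1 t ON n.skills = t.skills
-- single-row subquery: attaches the total number of jd1 rows to every joined row
CROSS JOIN (SELECT COUNT(*) AS total_skill FROM jd1) skill_cnt
-- total_skill is constant, but it must still be grouped (or aggregated)
GROUP BY n.id, n.name, skill_cnt.total_skill;
```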
      【Solution 3】:

      Create one more table for the JD skills, named skill_count, and join it with these tables.

      SELECT n.Id, n.Job_Id, n.Name, n.Email, n.Mobile_Number, n.Education, n.Total_Experiance,((count(n.skills)*100)/s.skill_count) Average
      FROM new_resume n
      JOIN new_jd t ON n.skills=t.skills
      JOIN skill_count s ON n.job_id = s.job_id
      GROUP BY n.Id, n.Job_Id, n.Name, n.Email, n.Mobile_Number, n.Education, n.Total_Experiance,s.skill_count;
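
The answer does not show how the skill_count table is built. A minimal sketch, assuming new_jd has a job_id column next to skills (the answer only implies this through the join on s.job_id), could use CREATE TABLE ... AS SELECT:

```sql
-- Hypothetical: assumes new_jd(job_id, skills); adjust to the real schema.
-- Produces one row per job with the number of skills listed for it.
CREATE TABLE skill_count AS
SELECT job_id,
       COUNT(skills) AS skill_count
FROM new_jd
GROUP BY job_id;
```

Because the table is a snapshot, it would have to be rebuilt whenever new_jd changes.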
      

      【Discussion】:
