[Posted]: 2019-09-05 23:28:50
[Question]:
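I have created a temporary table into which I pulled fields that hold multiple values representing attributes. Now I want to build logic that compares these attributes and creates a new field summarizing the ref_type and post_campaign fields.
I am trying to create a new column (x) based on the following logic/conditions: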
> if post_campaign starts with KNC-% and ref_type = 3, then create a new column (x) with value PS
> if post_campaign is null and ref_type = 3, then create a new column (x) with value OS
> if post_campaign starts with SNP-%, then create a new column (x) with value Pso
> if post_campaign starts with SNO-% and ref_type = 9, then create a new column (x) with value OPso
> if ref_type = 6, then create a new column (x) with value Dir
I have already written the code for the temporary table, but I need help with how to work the logic above into the SQL query.
create table temp.Register as
select date(date_time) as date,
       post_evar10,
       count(page_event) as Pageviews,
       concat(post_visid_high, post_visid_low) as UniqueVisitors,
       ref_type as Source_Traffic,
       paid_search,
       post_campaign
from a_hits
where ref_type in (3,6,7,9)
  -- 'registration-' has no wildcard, so it matches only that exact string
  and (post_evar10 like '%event-summary%' or post_evar10 like 'registration-' or post_evar10 like '%InformationPage%' or post_evar10 like '%GuestRegInfo%' or post_evar10 like '%GuestReg%' or post_evar10 like '%MyRegistration%')
  and page_event like '0'
  and exclude_hit like '0'
  and hit_source not in (5,7,8,9)
-- post_campaign is selected without aggregation, so it is grouped as well
group by date, post_evar10, UniqueVisitors, Source_Traffic, paid_search, post_campaign;
The expected result is that I would see the new column:
Date      Post_evar10    Pageviews  UniqueVisitors  Source_Traffic  post_campaign  Column X
2/2/2019  event-summary  540        200             3               KNC-%          PS
2/2/2019  event-summary  300        150             3               Null           OS
2/3/2019  event-summary  230        100             9               SNO-%          OPso
2/4/2019  event-summary  290        150             9               SNP-%          Pso
2/5/2019  event-summary  100        300             6               Misc           Dir
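The conditions listed in the question map fairly directly onto a SQL CASE expression, evaluated top to bottom in the order the rules are given. The following is only a minimal sketch, not a tested Databricks query: it runs the CASE expression against an in-memory SQLite table through Python's sqlite3 module so that it is self-contained. The table name hits, the sample campaign codes, and the helper classify are hypothetical; rows that match no rule are assumed to get NULL.

```python
import sqlite3

# Hypothetical sample rows (ref_type, post_campaign); the campaign codes are made up.
rows = [
    (3, "KNC-123"),
    (3, None),
    (9, "SNO-456"),
    (9, "SNP-789"),
    (6, "Misc"),
    (7, "Other"),
]

# The rules from the question, in the order they were listed.
# Assumption: rows matching none of the rules get NULL.
# Note: SQLite's LIKE is case-insensitive for ASCII, while Spark SQL's LIKE is case-sensitive.
CASE_SQL = """
SELECT ref_type,
       post_campaign,
       CASE
           WHEN post_campaign LIKE 'KNC-%' AND ref_type = 3 THEN 'PS'
           WHEN post_campaign IS NULL AND ref_type = 3 THEN 'OS'
           WHEN post_campaign LIKE 'SNP-%' THEN 'Pso'
           WHEN post_campaign LIKE 'SNO-%' AND ref_type = 9 THEN 'OPso'
           WHEN ref_type = 6 THEN 'Dir'
           ELSE NULL
       END AS x
FROM hits
ORDER BY rowid
"""


def classify(sample_rows):
    """Load the sample rows into an in-memory table and apply the CASE expression."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hits (ref_type INTEGER, post_campaign TEXT)")
    conn.executemany("INSERT INTO hits VALUES (?, ?)", sample_rows)
    result = conn.fetchall() if False else conn.execute(CASE_SQL).fetchall()
    conn.close()
    return result


labels = [x for _, _, x in classify(rows)]
print(labels)
```

In the Spark SQL query from the question, the same CASE ... END AS x expression would presumably go into the SELECT list, and because it is not an aggregate, x (or the columns it depends on) would also need to appear in the GROUP BY. The order of the WHEN branches matters wherever a row could satisfy more than one rule.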
[Comments]:
-
Does this query work? You have two FROM clauses.
-
Is this MySQL or Spark?
-
@JerryM.: That was a copy-paste error.
-
@PatrickSmith: I am writing the SQL for Databricks, so it is Spark SQL.
-
Which version of Spark SQL? 1.2.0?
Tags: sql apache-spark apache-spark-sql