【Question title】: How to add logic statement in the sql code?
【Posted】: 2019-09-05 23:28:50
【Question description】:

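I have created a temp table in which I extracted fields (with multiple values that represent attributes). Now I want to create logic that compares these attributes and creates a new field summarizing the ref_type and post_campaign fields.

I am trying to create a new column (x) based on the following logic/conditions: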

> > if post_campaign starts with KNC-% and ref_type = 3, then create a new column (x) with field PS
> > if post_campaign is null and ref_type = 3, then create a new column (x) with field OS
> > if post_campaign starts with SNP-%, then create a new column (x) with field Pso
> > if post_campaign starts with SNO-% and ref_type = 9, then create a new column (x) with field OPso
> > if ref_type = 6, then create a new column (x) with field Dir

I have already written the temp table code, but I need help with how to insert the logic above into the SQL query:

create table temp.Register
Select date(date_time) as date, post_evar10, count(page_event) as Pageviews, concat(post_visid_high, post_visid_low) as UniqueVisitors, ref_type as Source_Traffic, paid_search, post_campaign
from a_hits
where ref_type in (3,6,7,9)
and ((post_evar10 like '%event-summary%') or (post_evar10 like 'registration-') or (post_evar10 like '%InformationPage%') or (post_evar10 like '%GuestRegInfo%') or (post_evar10 like '%GuestReg%') or (post_evar10 like '%MyRegistration%'))
and page_event like '0'
and exclude_hit like '0'
and hit_source not in (5,7,8,9)
group by Date, post_evar10, UniqueVisitors, Source_Traffic, paid_search;

The expected result is that I would see the new column:

Date      Post_evar10    Pageviews  UniqueVisitors  Source_Traffic  post_campaign  Column X
2/2/2019  event-summary  540        200             3               KNC-%          PS
2/2/2019  event-summary  300        150             3               Null           OS
2/3/2019  event-summary  230        100             9               SNO-%          Opso
2/4/2019  event-summary  290        150             9               SNP-%          Pso
2/5/2019  event-summary  100        300             6               Misc           Dir

【Discussion】:

  • Does that query work? You have two FROM clauses
  • Is this MySQL or Spark?
  • @JerryM.: That was a copy-paste error.
  • @PatrickSmith: I am writing the SQL code for Databricks, so it is Spark SQL.
  • Which version of Spark SQL? 1.2.0?

Tags: sql apache-spark apache-spark-sql


【Solution 1】:

Assuming you are using the newest version of Spark SQL, you can use a CASE...WHEN expression.

The Spark SQL documentation has more on CASE...WHEN.

create table temp.Register as

Select
    date(date_time) as the_date,
    post_evar10,
    count(page_event) as Pageviews,
    concat(post_visid_high, post_visid_low) as UniqueVisitors,
    ref_type as Source_Traffic,
    paid_search,
    post_campaign,
    -- WHEN branches are tested top to bottom; the first match wins
    CASE
        WHEN post_campaign LIKE 'KNC-%' AND ref_type = 3 THEN 'PS'
        WHEN post_campaign IS NULL AND ref_type = 3 THEN 'OS'
        WHEN post_campaign LIKE 'SNP-%' THEN 'Pso'
        WHEN post_campaign LIKE 'SNO-%' AND ref_type = 9 THEN 'Opso'
        WHEN ref_type = 6 THEN 'Dir'
    ELSE NULL END AS Column_X
from
    a_hits

where
    ref_type in (3,6,7,9)
    and ((post_evar10 like '%event-summary%') or (post_evar10 like 'registration-') or (post_evar10 like '%InformationPage%') or (post_evar10 like '%GuestRegInfo%') or (post_evar10 like '%GuestReg%') or (post_evar10 like '%MyRegistration%'))
    and page_event like '0'
    and exclude_hit like '0'
    and hit_source not in (5,7,8,9)

-- every non-aggregated column in the select list must be grouped,
-- so post_campaign (also used by the CASE) is added here
group by
    the_date,
    post_evar10,
    UniqueVisitors,
    Source_Traffic,
    paid_search,
    post_campaign
;
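
To check the branch order without touching a_hits, the same CASE expression can be run against a few inline rows. This is only a sketch: the table alias sample_hits and the campaign codes after each prefix are made up for illustration, the labels are spelled as in the question, and it assumes a Spark SQL version that supports inline VALUES tables.

```sql
-- Hypothetical sample rows; only the prefixes and ref_type values come from the question.
SELECT
    ref_type,
    post_campaign,
    CASE
        WHEN post_campaign LIKE 'KNC-%' AND ref_type = 3 THEN 'PS'
        WHEN post_campaign IS NULL AND ref_type = 3 THEN 'OS'
        WHEN post_campaign LIKE 'SNP-%' THEN 'Pso'
        WHEN post_campaign LIKE 'SNO-%' AND ref_type = 9 THEN 'Opso'
        WHEN ref_type = 6 THEN 'Dir'
        ELSE NULL
    END AS Column_X
FROM VALUES
    (3, 'KNC-001'),             -- Column_X = PS
    (3, CAST(NULL AS STRING)),  -- Column_X = OS
    (9, 'SNP-001'),             -- Column_X = Pso
    (9, 'SNO-001'),             -- Column_X = Opso
    (6, 'Misc'),                -- Column_X = Dir
    (7, 'Misc')                 -- no branch matches, so Column_X is NULL
AS sample_hits(ref_type, post_campaign);
```

Because the first matching WHEN wins, a row with ref_type = 6 and a post_campaign starting with SNP- would come out as Pso rather than Dir. If that is not what you want, move the ref_type = 6 branch higher up.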

【Discussion】:
