【Question title】: impala find_in_set vs in performance
【Posted】: 2018-11-02 12:58:57
【Question】:

Can anyone tell me which performs better, find_in_set() or in()?

-- Version 1: classify with find_in_set()
SELECT a.data_date,
       lower(substr(a.cookie_id, -3, 1)) cookie_type,
       CASE WHEN find_in_set(lower(substr(a.cookie_id, -3, 1)), '2,3,5,6,8,b,c,d') > 0 THEN 'A' ELSE 'B' END 'AB',
       COUNT(a.cookie_id)
FROM dw.dw_cookie_dau_visit a
WHERE a.data_date = '20181102'
  AND a.site_id = 600
  AND lower(substr(a.cookie_id, -1, 1)) NOT IN ('e', 'f')
  AND lower(substr(a.cookie_id, -3, 1)) IN ('0','1','2','3','4','5','6','7','8','9','a','b','c','d','e','f')
GROUP BY a.data_date, cookie_type, AB;

-- Version 2: classify with IN
SELECT a.data_date,
       lower(substr(a.cookie_id, -3, 1)) cookie_type,
       CASE WHEN lower(substr(a.cookie_id, -3, 1)) IN ('2', '3', '5', '6', '8', 'b', 'c', 'd') THEN 'A' ELSE 'B' END 'AB',
       COUNT(a.cookie_id)
FROM dw.dw_cookie_dau_visit a
WHERE a.data_date = '20181102'
  AND a.site_id = 600
  AND lower(substr(a.cookie_id, -1, 1)) NOT IN ('e', 'f')
  AND lower(substr(a.cookie_id, -3, 1)) IN ('0','1','2','3','4','5','6','7','8','9','a','b','c','d','e','f')
GROUP BY a.data_date, cookie_type, AB;

Which one should I choose?

【Comments】:

    Tags: mysql sql hive impala


    【Solution 1】:

    They do not do the same thing. The second version should be:

     (CASE WHEN lower(substr(a.cookie_id, -3, 1)) in ('2', '3', '5', '6', '8', 'b', 'c', 'd') THEN 'A' ELSE 'B' END) as AB,
    

    In my opinion, this is the better way to write the logic, because it uses the SQL operator designed for this purpose.
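To check that the two forms classify this query's values identically, here is a minimal Python sketch. It is not Impala code: `find_in_set` below is a hand-written model based on the function's documented behaviour (1-based position in a comma-separated list, 0 when the value is absent or contains a comma, NULL when an argument is NULL). The helper names `label_with_find_in_set` and `label_with_in` are hypothetical.

```python
from typing import Optional

# Values that map to group 'A' in both queries from the question.
GROUP_A = ("2", "3", "5", "6", "8", "b", "c", "d")


def find_in_set(needle: Optional[str], str_list: Optional[str]) -> Optional[int]:
    """Assumed model of SQL find_in_set(); None stands in for SQL NULL."""
    if needle is None or str_list is None:
        return None  # NULL in, NULL out
    if "," in needle:
        return 0  # a search string containing a comma never matches
    for pos, item in enumerate(str_list.split(","), start=1):
        if item == needle:
            return pos  # 1-based position of the first match
    return 0


def label_with_find_in_set(ch: Optional[str]) -> str:
    """Mirrors: CASE WHEN find_in_set(ch, '2,3,...') > 0 THEN 'A' ELSE 'B' END."""
    pos = find_in_set(ch, ",".join(GROUP_A))
    # A NULL comparison is not true, so CASE falls through to ELSE.
    return "A" if pos is not None and pos > 0 else "B"


def label_with_in(ch: Optional[str]) -> str:
    """Mirrors: CASE WHEN ch IN ('2', '3', ...) THEN 'A' ELSE 'B' END."""
    return "A" if ch is not None and ch in GROUP_A else "B"


if __name__ == "__main__":
    # The WHERE clause restricts the character to a single hex digit.
    for ch in "0123456789abcdef":
        assert label_with_find_in_set(ch) == label_with_in(ch)
        print(ch, label_with_in(ch))
```

Under this model both expressions give the same label for every hex digit the WHERE clause lets through, so the choice between them is about readability rather than results.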

    As for performance, it does not matter. Query performance depends far more on the from and group by clauses than on a case expression in the select.

    【Comments】:
