SAP HANA | WITH clause performance

【Question title】: SAP HANA | With Clause performance
【Posted】: 2018-05-18 06:23:42
【Question description】:

We are using SAP HANA 1.0 SPS12.

We have a daily table like the following:

select trans_date,article,measure1,measure2 from table_1

Table volume: ~5 million rows

We need to see the data like this:

select 'day-1',sum(measure1),sum(measure2) from table1 where trans_date=add_days(current_date,-1) group by 'day-1'
union all
select 'day-2',sum(measure1),sum(measure2) from table1 where trans_date>=add_days(current_date,-2) group by 'day-2'
union all
select 'WTD',sum(measure1),sum(measure2) from table1 where trans_date>=add_days(current_date,-7) group by 'WTD'
union all
select 'WTD-1',sum(measure1),sum(measure2) from table1 where trans_date>=add_days(current_date,-15) and trans_date <= add_days(current_date,-7) group by 'WTD-1'

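And so on for MTD, MTD-1, MTD-2 and YTD.

Performance-wise, would it be better to use a WITH clause that holds one year of data and then split it by time range? Or is it better to run a separate aggregation for each time range, as shown above?

As I understand it, in RDBMSs such as Oracle a WITH clause materializes the result and then reads it from memory. SAP HANA is itself an in-memory database. Does using a WITH clause in SAP HANA give any distinct performance advantage?

The query using the WITH clause: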

WITH t1 as
(
-- pre-aggregate one year of data per day; the aliases and group by are needed so the outer selects can resolve measure1/measure2
select trans_date,sum(measure1) as measure1,sum(measure2) as measure2 from table1 where trans_date>=add_days(current_date,-365) group by trans_date
)
select 'day-1',sum(measure1),sum(measure2) from t1 where trans_date=add_days(current_date,-1) group by 'day-1'
union all
select 'day-2',sum(measure1),sum(measure2) from t1 where trans_date>=add_days(current_date,-2) group by 'day-2'
union all
select 'WTD',sum(measure1),sum(measure2) from t1 where trans_date>=add_days(current_date,-7) group by 'WTD'
union all
select 'WTD-1',sum(measure1),sum(measure2) from t1 where trans_date>=add_days(current_date,-15)
                                                  and trans_date <= add_days(current_date,-7)
                                                  group by 'WTD-1'

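【Question comments】:

The equality selections look quite wrong when you want the data ranges implied by the group names. YTD: year to date, MTD: month to date. With the point selections you have there, you only get the transactions that happened on that single day, e.g. one month ago.
Thanks! ... corrected. Any comments on the performance of the WITH clause?
How would you write the query with the WITH clause? Can you add it to the question?
I have added the query with the WITH clause.
For this query, the HANA query optimizer rewrites the statement so that it is equal to the UNION ALL case. In this case, the result of the common table expression is not materialized.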
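
If you want to check the comment above on your own system, looking at the plan is probably the quickest route. A minimal sketch, assuming HANA's `EXPLAIN PLAN` statement and the `EXPLAIN_PLAN_TABLE` view. The statement name `with_clause_test` and the alias `time_range` are made up for illustration, the query is cut down to two branches, and the CTE is written with aliases and a `group by` so that it runs:

```sql
-- 'with_clause_test' is a hypothetical label, pick any name you like
EXPLAIN PLAN SET STATEMENT_NAME = 'with_clause_test' FOR
WITH t1 AS (
  SELECT trans_date, SUM(measure1) AS measure1, SUM(measure2) AS measure2
  FROM table1
  WHERE trans_date >= ADD_DAYS(CURRENT_DATE, -365)
  GROUP BY trans_date
)
SELECT 'day-1' AS time_range, SUM(measure1), SUM(measure2)
FROM t1 WHERE trans_date = ADD_DAYS(CURRENT_DATE, -1)
UNION ALL
SELECT 'WTD', SUM(measure1), SUM(measure2)
FROM t1 WHERE trans_date >= ADD_DAYS(CURRENT_DATE, -7);

-- read the plan back; column names as I recall them, check them on your revision
SELECT operator_name, operator_details, table_name, output_size
FROM explain_plan_table
WHERE statement_name = 'with_clause_test'
ORDER BY operator_id;

-- clean up the plan rows afterwards
DELETE FROM explain_plan_table WHERE statement_name = 'with_clause_test';
```

If `table1` shows up once per UNION ALL branch in the plan, the CTE was inlined rather than materialized.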

Tags: sql sap hana

【Solution 1】:

If you care about performance, putting the data in a single row should be better:

select sum(case when trans_date = add_days(current_date, -1) then measure1 end) as measure1_day1,
       sum(case when trans_date = add_days(current_date, -1) then measure2 end) as measure2_day1,
       sum(case when trans_date = add_days(current_date, -2) then measure1 end) as measure1_day2,
       sum(case when trans_date = add_days(current_date, -2) then measure2 end) as measure2_day2,
       . . .       
from table1
where trans_date >= add_days(current_date, -15);

If you really need the values in separate rows, you can unpivot the result afterwards.

Alternatively, you can do this:
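
The unpivot step could look roughly like this. It is only a sketch, not part of the original answer: it is shortened to the two day buckets, the alias `time_range` and the table aliases `a` and `p` are made up, and it assumes `trans_date` is a DATE column. The single-row aggregate is cross joined with a small label list built from `dummy`, and a CASE picks the matching column for each label, so `table1` is still scanned only once:

```sql
SELECT p.time_range,
       CASE p.time_range WHEN 'day-1' THEN a.measure1_day1
                         WHEN 'day-2' THEN a.measure1_day2 END AS measure1,
       CASE p.time_range WHEN 'day-1' THEN a.measure2_day1
                         WHEN 'day-2' THEN a.measure2_day2 END AS measure2
FROM (
       -- single-row conditional aggregation, one pass over table1
       SELECT SUM(CASE WHEN trans_date = ADD_DAYS(CURRENT_DATE, -1) THEN measure1 END) AS measure1_day1,
              SUM(CASE WHEN trans_date = ADD_DAYS(CURRENT_DATE, -1) THEN measure2 END) AS measure2_day1,
              SUM(CASE WHEN trans_date = ADD_DAYS(CURRENT_DATE, -2) THEN measure1 END) AS measure1_day2,
              SUM(CASE WHEN trans_date = ADD_DAYS(CURRENT_DATE, -2) THEN measure2 END) AS measure2_day2
       FROM table1
       WHERE trans_date >= ADD_DAYS(CURRENT_DATE, -2)
     ) a
CROSS JOIN (
       -- one row per output label
       SELECT 'day-1' AS time_range FROM dummy UNION ALL
       SELECT 'day-2' FROM dummy
     ) p
ORDER BY p.time_range;
```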

select days, sum(measure1), sum(measure2)
from (select 1 as days from dummy union all
      select 2 from dummy union all
      select 7 from dummy union all
      select 15 from dummy
     ) d left join
     table1 t
     on t.trans_date = add_days(current_date, - d.days)
group by days
order by days;
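
Note that the join above matches single days only, while the question asks for ranges (WTD, WTD-1, ...). A range-based variant might look like the sketch below. It is an untested adaptation, not something from the original answer: the names `time_range`, `from_days` and `to_days` are made up, and `to_days = 0` only reproduces the open-ended `>=` filters if the table holds no rows dated after `current_date`. No claim that this is faster than the other versions, so measure it on your data.

```sql
SELECT r.time_range,
       SUM(t.measure1) AS measure1,
       SUM(t.measure2) AS measure2
FROM (
       -- one row per reporting range: label, oldest day offset, newest day offset
       SELECT 'day-1' AS time_range, 1 AS from_days, 1 AS to_days FROM dummy UNION ALL
       SELECT 'day-2', 2, 0 FROM dummy UNION ALL
       SELECT 'WTD',   7, 0 FROM dummy UNION ALL
       SELECT 'WTD-1', 15, 7 FROM dummy
     ) r
LEFT JOIN table1 t
       ON t.trans_date >= ADD_DAYS(CURRENT_DATE, -r.from_days)
      AND t.trans_date <= ADD_DAYS(CURRENT_DATE, -r.to_days)
GROUP BY r.time_range
ORDER BY r.time_range;
```

The left join keeps a range in the output (with NULL sums) even when no transactions fall into it.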

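【Comments】:

Out of interest: why would putting the data into a single row (via CASE expressions) be faster?
@LarsBr. . . . The number of rows in the aggregation has a big impact on performance. More rows are going to be more expensive than another simple calculation.
Thanks, @gordon-linoff! I just tried this on a test data set on HANA and could not reach the same conclusion. As an example: the UNION ALL version averaged 15ms, the SUM(CASE) version 31ms (2x) and the left join statement 165ms (101x). That is not to say that one way of running this query is always better or worse than another, but when performance matters, measuring the alternatives is crucial.
@LarsBr. . . . Try the version I just added with the where clause. That might be faster than the union all.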