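【Posted】: 2021-12-30 09:59:51
【Question】:
I tried two ways of writing a join condition in Redshift: first with a WHERE clause after the JOIN ... ON, and second with AND inside the ON clause. I assumed that WHERE is applied after the join, so in that case the join would have to scan many rows.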
explain
select
*
from
table1 t
left join table2 t2 on t.key = t2.key
where
t.snapshot_day = to_date('2021-12-18', 'YYYY-MM-DD');
XN Hash Right Join DS_DIST_INNER (cost=43055.58..114637511640937.91 rows=2906695 width=3169)
Inner Dist Key: t.key
Hash Cond: (("outer".asin)::text = ("inner".asin)::text)
-> XN Seq Scan on table2 t2 (cost=0.00..39874539.52 rows=3987453952 width=3038)
-> XN Hash (cost=35879.65..35879.65 rows=2870373 width=131)
-> XN Seq Scan on table1 t (cost=0.00..35879.65 rows=2870373 width=131)
Filter: (snapshot_day = '2021-12-18 00:00:00'::timestamp without time zone)
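On the other hand, with AND as shown below, I assumed the condition is applied before the join, so fewer rows would be scanned during the join. However, the plan estimates far more rows and a higher cost than the WHERE version.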
explain
select
*
from
table1 t
left join table2 t2 on t.key= t2.key
and
t.snapshot_day = to_date('2021-12-18', 'YYYY-MM-DD');
XN Hash Right Join DS_DIST_INNER (cost=40860915.20..380935317239623.75 rows=3268873216 width=3169)
Inner Dist Key: t.key
Hash Cond: (("outer".key)::text = ("inner".key)::text)
Join Filter: ("inner".snapshot_day = '2021-12-18 00:00:00'::timestamp without time zone)
-> XN Seq Scan on table2 t2 (cost=0.00..39874539.52 rows=3987453952 width=3038)
-> XN Hash (cost=32688732.16..32688732.16 rows=3268873216 width=131)
-> XN Seq Scan on table1 t (cost=0.00..32688732.16 rows=3268873216 width=131)
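What is the difference between the two? What am I misunderstanding here? If anyone has an opinion or reference material, please let me know.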
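A minimal sketch of how the two forms differ semantically, assuming SQLite (through Python's sqlite3 module) as a stand-in for Redshift. The `val` column and all rows are made up for illustration; only the table and column names `table1`, `table2`, `key`, and `snapshot_day` come from the queries above. In a LEFT JOIN, a predicate on the left table placed in ON only decides which rows match, so every table1 row is still returned, whereas the same predicate in WHERE removes rows. That is consistent with the second plan listing the predicate as a Join Filter and estimating the full row count for the scan of table1.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Redshift (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (key TEXT, snapshot_day TEXT);
    CREATE TABLE table2 (key TEXT, val TEXT);
    -- Hypothetical sample rows, not taken from the real tables.
    INSERT INTO table1 VALUES
        ('a', '2021-12-18'), ('b', '2021-12-17'), ('c', '2021-12-18');
    INSERT INTO table2 VALUES ('a', 'x'), ('b', 'y');
""")

# Predicate in WHERE: table1 rows from other days are removed.
where_rows = conn.execute("""
    SELECT t.key, t.snapshot_day, t2.val
    FROM table1 t
    LEFT JOIN table2 t2 ON t.key = t2.key
    WHERE t.snapshot_day = '2021-12-18'
    ORDER BY t.key
""").fetchall()

# Predicate in ON: every table1 row is kept; rows from other days
# simply get no match, so their table2 columns are NULL.
on_rows = conn.execute("""
    SELECT t.key, t.snapshot_day, t2.val
    FROM table1 t
    LEFT JOIN table2 t2 ON t.key = t2.key
        AND t.snapshot_day = '2021-12-18'
    ORDER BY t.key
""").fetchall()

print("WHERE:", where_rows)
print("ON:   ", on_rows)
```

In this sketch the WHERE version returns two rows and the ON version returns three, so the two queries are not equivalent and their plans are not estimates for the same result.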
Thanks
【Discussion】:
Tags: amazon-web-services amazon-redshift