Create a table in Redshift by adding the columns of two other tables

[Question title]: Create a table in Redshift by adding the columns of two other tables
[Posted]: 2021-01-01 00:14:28
[Question description]:

I want to create a table in Redshift by adding the columns of two other tables.

Table 1

Table 2

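I want to create the new table under the following conditions:

If table1.sid = table2.sid,
then sum the values: t1.totalcorrect + t2.totalcorrect and t1.totalquestions + t2.totalquestions (this applies to s4 through s7).
All other data from both tables should be kept as is.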

Expected output

When I use a join, the resulting table contains only s4 through s7 and none of the other required data. Please help.

[Question discussion]:

Tags: sql join sum amazon-redshift full-outer-join

[Solution 1]:

That is a full join:

select 
    coalesce(t1.sid, t2.sid) sid, 
    coalesce(t1.totalcorrect,   0) + coalesce(t2.totalcorrect,   0) totalcorrect,
    coalesce(t1.totalquestions, 0) + coalesce(t2.totalquestions, 0) totalquestions
from t1 
full join t2 on t2.sid = t1.sid
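
The query above only returns the combined rows. Since the question asks for a new table, one option is to wrap the same query in CREATE TABLE ... AS. This is a minimal sketch: the table name combined is a hypothetical placeholder, and t1 and t2 stand for the two source tables as in the query above.

```sql
-- "combined" is a hypothetical name for the new table; choose your own.
create table combined as
select
    coalesce(t1.sid, t2.sid) as sid,
    -- coalesce(..., 0) keeps rows that exist in only one table from summing to NULL
    coalesce(t1.totalcorrect,   0) + coalesce(t2.totalcorrect,   0) as totalcorrect,
    coalesce(t1.totalquestions, 0) + coalesce(t2.totalquestions, 0) as totalquestions
from t1
full join t2 on t2.sid = t1.sid;
```

Rows whose sid appears in only one table are kept unchanged, because the missing side contributes 0 to each sum.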

[Discussion]:

[Solution 2]:

There are two ways to do this, and I am not sure which is faster in Redshift. One is union all with group by:

select sid, sum(totalcorrect) as totalcorrect, sum(totalquestions) as totalquestions
from ((select sid, totalcorrect, totalquestions
       from t1
      ) union all
      (select sid, totalcorrect, totalquestions
       from t2
      )
     ) t
group by sid;

The second uses full join, for which I recommend the using clause:

select sid,
       coalesce(t1.totalcorrect, 0) + coalesce(t2.totalcorrect, 0) as totalcorrect,
       coalesce(t1.totalquestions, 0) + coalesce(t2.totalquestions, 0) as totalquestions
from t1 full join
     t2
     using (sid);

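The two approaches are not equivalent. The first guarantees exactly one row per sid in the result set, even if one of the tables contains duplicates. The first also combines rows whose sid is NULL into a single row.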
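
To make the duplicate case concrete, here is a small sketch with made-up data. The tables demo_t1 and demo_t2 and their values are hypothetical and exist only for illustration; the column types are assumptions as well.

```sql
-- Hypothetical sample data: sid 's4' appears twice in demo_t1.
create temp table demo_t1 (sid varchar(10), totalcorrect int, totalquestions int);
create temp table demo_t2 (sid varchar(10), totalcorrect int, totalquestions int);

insert into demo_t1 values ('s4', 1, 2), ('s4', 3, 4);
insert into demo_t2 values ('s4', 10, 20);

-- Approach 1: union all + group by.
-- Returns a single row for s4: totalcorrect = 1 + 3 + 10, totalquestions = 2 + 4 + 20.
select sid, sum(totalcorrect) as totalcorrect, sum(totalquestions) as totalquestions
from (select sid, totalcorrect, totalquestions from demo_t1
      union all
      select sid, totalcorrect, totalquestions from demo_t2
     ) t
group by sid;

-- Approach 2: full join.
-- Returns two rows for s4, because each demo_t1 row matches the one demo_t2 row,
-- so the demo_t2 values are added once per duplicate.
select sid,
       coalesce(demo_t1.totalcorrect, 0) + coalesce(demo_t2.totalcorrect, 0) as totalcorrect,
       coalesce(demo_t1.totalquestions, 0) + coalesce(demo_t2.totalquestions, 0) as totalquestions
from demo_t1 full join
     demo_t2
     using (sid);
```

If sid is unique in each table and never NULL, both approaches should return the same rows, so the choice matters mainly when duplicates or NULL keys are possible.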

[Discussion]: