【Question title】: FULL OUTER JOIN to merge tables with PostgreSQL
【Posted】: 2017-11-17 18:44:00
【Question】:

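Following on from this post, I still have a problem when I apply the answer given by @Vao Tsun to a larger dataset, this time made up of 4 tables instead of the 2 in that related post.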

Here is my dataset:

-- Table 'brcht' (empty)

insee  | annee  | nb
-------+--------+-----


-- Table 'cana'

insee  | annee  | nb
-------+--------+-----
036223 |   2017 |   1
086001 |   2016 |   2


-- Table 'font' (empty)

insee  | annee  | nb
-------+--------+-----


-- Table 'nr'

insee  | annee  | nb
-------+--------+-----
036223 |   2013 |   1
036223 |   2014 |   1
086001 |   2013 |   1
086001 |   2014 |   2
086001 |   2015 |   4
086001 |   2016 |   2

Here is the query:

SELECT
 COALESCE(brcht.insee, cana.insee, font.insee, nr.insee) AS insee,
 COALESCE(brcht.annee, cana.annee, font.annee, nr.annee) AS annee,
 COALESCE(brcht.nb,0) AS brcht,  
 COALESCE(cana.nb,0) AS cana,
 COALESCE(font.nb,0) AS font,
 COALESCE(nr.nb,0) AS nr,
 COALESCE(brcht.nb,0) + COALESCE(cana.nb,0) + COALESCE(font.nb,0) + COALESCE(nr.nb,0) AS total

FROM public.brcht
  FULL OUTER JOIN public.cana ON brcht.insee = cana.insee AND brcht.annee = cana.annee
  FULL OUTER JOIN public.font ON cana.insee = font.insee AND cana.annee = font.annee
  FULL OUTER JOIN public.nr   ON font.insee = nr.insee AND font.annee = nr.annee

ORDER BY COALESCE(brcht.insee, cana.insee, font.insee, nr.insee), COALESCE(brcht.annee, cana.annee, font.annee, nr.annee);

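In the result there are still two rows for insee='086001' (annee 2016) instead of one (see below). I need a single row per insee and annee: in this example the two values of 2 should sit on the same row, with the total column showing 4.

Thanks again for your help!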


Here is a SQL script to easily create the tables above:

CREATE TABLE public.brcht (insee CHARACTER VARYING(10), annee INTEGER, nb INTEGER);
CREATE TABLE public.cana (insee CHARACTER VARYING(10), annee INTEGER, nb INTEGER);
CREATE TABLE public.font (insee CHARACTER VARYING(10), annee INTEGER, nb INTEGER);
CREATE TABLE public.nr (insee CHARACTER VARYING(10), annee INTEGER, nb INTEGER);

INSERT INTO public.cana (insee, annee, nb) VALUES ('036223', 2017, 1), ('086001', 2016, 2);
INSERT INTO public.nr(insee, annee, nb) VALUES ('036223', 2013, 1), ('036223', 2014, 1), ('086001', 2013, 1), ('086001', 2014, 2), ('086001', 2015, 4), ('086001', 2016, 2);

【Comments】:

    Tags: sql postgresql merge outer-join postgresql-9.4


    【Solution 1】:

    Inspired by the other answers, but perhaps better organized:

    SELECT *, 
           brcht + cana + font + nr AS total 
    FROM   (SELECT insee, 
                   annee, 
                   SUM(Coalesce(brcht.nb, 0)) brcht, 
                   SUM(Coalesce(cana.nb, 0))  cana, 
                   SUM(Coalesce(font.nb, 0))  font, 
                   SUM(Coalesce(nr.nb, 0))    nr 
            FROM   brcht 
                   full outer join cana USING (insee, annee) 
                   full outer join font USING (insee, annee) 
                   full outer join nr USING (insee, annee) 
            GROUP  BY insee, 
                      annee) t 
    ORDER  BY insee, 
              annee; 
    

    which gives:

     insee  | annee | brcht | cana | font | nr | total 
    --------+-------+-------+------+------+----+-------
     036223 |  2013 |     0 |    0 |    0 |  1 |     1
     036223 |  2014 |     0 |    0 |    0 |  1 |     1
     036223 |  2017 |     0 |    1 |    0 |  0 |     1
     086001 |  2013 |     0 |    0 |    0 |  1 |     1
     086001 |  2014 |     0 |    0 |    0 |  2 |     2
     086001 |  2015 |     0 |    0 |    0 |  4 |     4
     086001 |  2016 |     0 |    2 |    0 |  2 |     4
    (7 rows)
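
A note on why USING helps here, as a minimal sketch against two of the question's tables: with FULL OUTER JOIN ... USING (insee, annee), the join output carries a single insee and a single annee column, and each behaves like the COALESCE of the two sides. The next join in the chain therefore compares against a key that is never NULL merely because one table lacked the row. The column aliases below are illustrative only.

```sql
-- Two of the question's tables are enough to see the merged key columns.
SELECT insee,            -- behaves like COALESCE(cana.insee, nr.insee)
       annee,            -- behaves like COALESCE(cana.annee, nr.annee)
       cana.nb AS cana,  -- NULL where cana has no row for this key
       nr.nb   AS nr     -- NULL where nr has no row for this key
FROM   public.cana
       FULL OUTER JOIN public.nr USING (insee, annee)
ORDER  BY insee, annee;
```

With the sample data this should return one row per (insee, annee) pair, and the 086001 / 2016 row carries a value from both tables.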
    

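    【Comments】:

    • Very clear, thank you! I didn't know about the USING clause for joins.
    【Solution 2】:

    You need to apply GROUP BY and SUM() to the nb-derived columns of the query you are using now (they are declared INTEGER here; SUM() returns bigint for them).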

    select
        insee, annee
        , sum(brcht) brcht
        , sum(cana) cana
        , sum(font) font
        , sum(nr) nr
        , sum(total) total
    from (
        SELECT
         COALESCE(brcht.insee, cana.insee, font.insee, nr.insee) AS insee,
         COALESCE(brcht.annee, cana.annee, font.annee, nr.annee) AS annee,
         COALESCE(brcht.nb,0) AS brcht,  
         COALESCE(cana.nb,0) AS cana,
         COALESCE(font.nb,0) AS font,
         COALESCE(nr.nb,0) AS nr,
         COALESCE(brcht.nb,0) + COALESCE(cana.nb,0) + COALESCE(font.nb,0) + COALESCE(nr.nb,0) AS total
    
        FROM public.brcht
          FULL OUTER JOIN public.cana ON brcht.insee = cana.insee AND brcht.annee = cana.annee
          FULL OUTER JOIN public.font ON cana.insee = font.insee AND cana.annee = font.annee
          FULL OUTER JOIN public.nr   ON font.insee = nr.insee AND font.annee = nr.annee
          ) d
    group by
        insee, annee
    

    【Comments】:

      【Solution 3】:

      Try:

      t=# SELECT
       COALESCE(brcht.insee, cana.insee, font.insee, nr.insee) AS insee,
       COALESCE(brcht.annee, cana.annee, font.annee, nr.annee) AS annee,
       COALESCE(brcht.nb,0) AS brcht,
       COALESCE(cana.nb,0) AS cana,
       COALESCE(font.nb,0) AS font,
       COALESCE(nr.nb,0) AS nr,
       COALESCE(brcht.nb,0) + COALESCE(cana.nb,0) + COALESCE(font.nb,0) + COALESCE(nr.nb,0) AS total
      FROM public.brcht
        FULL OUTER JOIN public.cana ON brcht.insee = cana.insee AND brcht.annee = cana.annee
        FULL OUTER JOIN public.font ON cana.insee = font.insee AND cana.annee = font.annee
        FULL OUTER JOIN public.nr   ON cana.insee = nr.insee AND cana.annee = nr.annee
      ORDER BY COALESCE(brcht.insee, cana.insee, font.insee, nr.insee), COALESCE(brcht.annee, cana.annee, font.annee, nr.annee);
       insee  | annee | brcht | cana | font | nr | total
      --------+-------+-------+------+------+----+-------
       036223 |  2013 |     0 |    0 |    0 |  1 |     1
       036223 |  2014 |     0 |    0 |    0 |  1 |     1
       036223 |  2017 |     0 |    1 |    0 |  0 |     1
       086001 |  2013 |     0 |    0 |    0 |  1 |     1
       086001 |  2014 |     0 |    0 |    0 |  2 |     2
       086001 |  2015 |     0 |    0 |    0 |  4 |     4
       086001 |  2016 |     0 |    2 |    0 |  2 |     4
      (7 rows)
      

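      In your example you join nr against font, whereas you probably want to join it against cana?..

      Also see here: https://www.postgresql.org/docs/current/static/queries-table-expressions.html#QUERIES-JOIN

      In the absence of parentheses, JOIN clauses nest left-to-right.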
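
To make the left-to-right nesting visible, here is a sketch of the question's FROM clause with the implied parentheses written out. It is meant to be the same query, only with the grouping spelled out; count(*) is used just to keep the output short.

```sql
-- Same joins as in the question, with the implicit nesting made explicit:
-- ((brcht JOIN cana) JOIN font) JOIN nr
SELECT count(*) AS row_count
FROM (
       (public.brcht
          FULL OUTER JOIN public.cana
            ON brcht.insee = cana.insee AND brcht.annee = cana.annee)
       FULL OUTER JOIN public.font
         ON cana.insee = font.insee AND cana.annee = font.annee
     )
     FULL OUTER JOIN public.nr
       ON font.insee = nr.insee AND font.annee = nr.annee;
```

The last join therefore sees the result of the first three tables as its left input and matches it only through the font columns.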

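      Update

      To explain the logic: start with select * from public.brcht and add the other tables one at a time. Each join appends the columns of the table further to the right, so with all four tables joined you get: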

      t=# select * 
      FROM public.brcht 
      FULL OUTER JOIN public.cana ON brcht.insee = cana.insee AND brcht.annee = cana.annee
      FULL OUTER JOIN public.font ON cana.insee = font.insee AND cana.annee = font.annee
      FULL OUTER JOIN public.nr   ON font.insee = nr.insee AND font.annee = nr.annee
      t-# ;
       insee | annee | nb | insee  | annee | nb | insee | annee | nb | insee  | annee | nb
      -------+-------+----+--------+-------+----+-------+-------+----+--------+-------+----
             |       |    | 036223 |  2017 |  1 |       |       |    |        |       |
             |       |    | 086001 |  2016 |  2 |       |       |    |        |       |
             |       |    |        |       |    |       |       |    | 036223 |  2013 |  1
             |       |    |        |       |    |       |       |    | 036223 |  2014 |  1
             |       |    |        |       |    |       |       |    | 086001 |  2013 |  1
             |       |    |        |       |    |       |       |    | 086001 |  2014 |  2
             |       |    |        |       |    |       |       |    | 086001 |  2015 |  4
             |       |    |        |       |    |       |       |    | 086001 |  2016 |  2
      (8 rows)
      

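      So columns 7 and 8 are font.insee and font.annee (note that they are NULL everywhere). You join them against nr.insee and nr.annee, and nothing matches, so the full join keeps all rows from the join of the first three tables plus all rows from the nr table, and you get 8 rows.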
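
The same point can be checked with a quick query, sketched here against the question's sample data: count the rows produced by the first three tables and how many of them carry a non-NULL font key. The aliases rows_before_nr and non_null_font_keys are made up for this illustration.

```sql
-- Left input of the last join: the first three tables joined as in the question.
SELECT count(*)          AS rows_before_nr,      -- rows coming out of brcht/cana/font
       count(font.insee) AS non_null_font_keys   -- count(col) ignores NULLs
FROM public.brcht
  FULL OUTER JOIN public.cana ON brcht.insee = cana.insee AND brcht.annee = cana.annee
  FULL OUTER JOIN public.font ON cana.insee = font.insee AND cana.annee = font.annee;
```

With the sample data this should report 2 rows and 0 non-NULL font keys. Since no row of nr can match a NULL key, the final full join returns those 2 rows plus the 6 rows of nr.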

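      【Comments】:

      • Why do you join nr against cana? I don't understand how the 4 tables should be joined... In my example I first join brcht to cana, then cana to font, then font to nr. Proceeding that way seemed logical to me. Is there a logical way to join the tables together?
      • @wiltomap I tried to explain. Note that if you don't use (), the joins happen left to right, so the last join joins the whole preceding set on NULL columns: you get everything from (brcht, cana, font) and everything from nr (everything, because they have no values in common on the columns used for the join). I hope that makes sense; explaining is not my best skill.
      • OK, I get it, thanks! The problem is that the contents of the 4 tables change regularly, so I can't keep adjusting the joins accordingly... I need a way to join the tables together that works whatever each table contains.
      • Then nest the joins with parentheses, so that each next join is made on the "coalesced" values.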
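
One way to read this last suggestion, as a sketch rather than the commenter's exact query: keep the ON syntax from the question, but have each join compare the new table against the COALESCE of every key seen so far, so the match no longer depends on which tables happen to be empty. It assumes that (insee, annee) is unique within each table; with duplicates the joined rows would multiply and an outer GROUP BY would still be needed.

```sql
SELECT
  COALESCE(brcht.insee, cana.insee, font.insee, nr.insee) AS insee,
  COALESCE(brcht.annee, cana.annee, font.annee, nr.annee) AS annee,
  COALESCE(brcht.nb, 0) AS brcht,
  COALESCE(cana.nb, 0)  AS cana,
  COALESCE(font.nb, 0)  AS font,
  COALESCE(nr.nb, 0)    AS nr,
  COALESCE(brcht.nb, 0) + COALESCE(cana.nb, 0)
    + COALESCE(font.nb, 0) + COALESCE(nr.nb, 0) AS total
FROM public.brcht
  FULL OUTER JOIN public.cana
    ON cana.insee = brcht.insee
   AND cana.annee = brcht.annee
  -- From here on, join against the merged key of all tables to the left.
  FULL OUTER JOIN public.font
    ON font.insee = COALESCE(brcht.insee, cana.insee)
   AND font.annee = COALESCE(brcht.annee, cana.annee)
  FULL OUTER JOIN public.nr
    ON nr.insee = COALESCE(brcht.insee, cana.insee, font.insee)
   AND nr.annee = COALESCE(brcht.annee, cana.annee, font.annee)
ORDER BY 1, 2;  -- order by the merged insee and annee output columns
```

This is essentially what USING (insee, annee) does implicitly, so the USING form is the shorter choice when the key columns share the same names in every table.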