【Question title】: Why is BigQuery giving me two rows where I expect one row?
【Posted】: 2019-09-12 01:10:16
【Question description】:

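I am using a full outer join to check which monthly tables contain a given SKU. The SKU in this example is 560715760. In this particular case, the SKU is present in every month from January to September except August. I expected the query to return one row in which the SKU column for August (H.SKU SKU_H) is null. Instead it returns two rows: one in which every month is null except Sep, and another that has the SKU in every column except those for the August and September tables. Please help me understand how the full outer join behaves in this case.

I observed that when I remove August (where the SKU does not exist) from the query, the output is a single row. I suspect this is related to the null value coming from the August table.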

SELECT A.SKU SKU_A,B.SKU SKU_B,C.SKU SKU_C,D.SKU SKU_D,E.SKU SKU_E,F.SKU SKU_F,G.SKU SKU_G,H.SKU SKU_H,I.SKU SKU_I
  FROM  (SELECT * FROM `microstrategy-test-env.ZZ_ROCHIT_MARCUS_SANDPIT.MTH1_JAN` WHERE SKU = 560715760) A --NOT NULL
       FULL OUTER JOIN (SELECT * FROM `microstrategy-test-env.ZZ_ROCHIT_MARCUS_SANDPIT.MTH2_FEB` WHERE SKU = 560715760) B ON (A.SKU = B.SKU) --NOT NULL
       FULL OUTER JOIN (SELECT * FROM `microstrategy-test-env.ZZ_ROCHIT_MARCUS_SANDPIT.MTH3_MAR` WHERE SKU = 560715760) C ON (A.SKU = C.SKU AND B.SKU = C.SKU) --NOT NULL
       FULL OUTER JOIN (SELECT * FROM `microstrategy-test-env.ZZ_ROCHIT_MARCUS_SANDPIT.MTH4_APR` WHERE SKU = 560715760) D ON (A.SKU = D.SKU AND B.SKU = D.SKU AND C.SKU = D.SKU)  --NOT NULL
       FULL OUTER JOIN (SELECT * FROM `microstrategy-test-env.ZZ_ROCHIT_MARCUS_SANDPIT.MTH5_MAY` WHERE SKU = 560715760) E ON (A.SKU = E.SKU AND B.SKU = E.SKU AND C.SKU = E.SKU AND D.SKU = E.SKU) --NOT NULL
       FULL OUTER JOIN (SELECT * FROM `microstrategy-test-env.ZZ_ROCHIT_MARCUS_SANDPIT.MTH6_JUN` WHERE SKU = 560715760) F ON (A.SKU = F.SKU AND B.SKU = F.SKU AND C.SKU = F.SKU AND D.SKU = F.SKU AND E.SKU = F.SKU)  --NOT NULL
       FULL OUTER JOIN (SELECT * FROM `microstrategy-test-env.ZZ_ROCHIT_MARCUS_SANDPIT.MTH7_JUL` WHERE SKU = 560715760) G ON (A.SKU = G.SKU AND B.SKU = G.SKU AND C.SKU = G.SKU AND D.SKU = G.SKU AND E.SKU = G.SKU AND F.SKU = G.SKU)  --NOT NULL
       FULL OUTER JOIN (SELECT * FROM `microstrategy-test-env.ZZ_ROCHIT_MARCUS_SANDPIT.MTH8_AUG` WHERE SKU = 560715760) H ON (A.SKU = H.SKU AND B.SKU = H.SKU AND C.SKU = H.SKU AND D.SKU = H.SKU AND E.SKU = H.SKU AND F.SKU = H.SKU AND G.SKU = H.SKU)  --NULL
       FULL OUTER JOIN (SELECT * FROM `microstrategy-test-env.ZZ_ROCHIT_MARCUS_SANDPIT.MTH9_SEP` WHERE SKU = 560715760) I ON (A.SKU = I.SKU AND B.SKU = I.SKU AND C.SKU = I.SKU AND D.SKU = I.SKU AND E.SKU = I.SKU AND F.SKU = I.SKU AND G.SKU = I.SKU AND H.SKU = I.SKU) --NOT NULL

Expected result:
SKU_A,SKU_B,SKU_C,SKU_D,SKU_E,SKU_F,SKU_G,SKU_H,SKU_I
560715760,560715760,560715760,560715760,560715760,560715760,560715760,,560715760

Actual result:
SKU_A,SKU_B,SKU_C,SKU_D,SKU_E,SKU_F,SKU_G,SKU_H,SKU_I
,,,,,,,,560715760
560715760,560715760,560715760,560715760,560715760,560715760,560715760,,

【Question comments】:

  • I think the join condition is causing this problem; please help me point out what I need to change.

Tags: google-bigquery full-outer-join


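【Solution 1】:

As I understand it, you want to see in which of the given months your SKU appears. Given your table structure, you can do this in BigQuery with a simpler wildcard-table approach (using its special _TABLE_SUFFIX pseudo-column):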

select 
   cast(_TABLE_SUFFIX as string) as month, 
   if(sum(if(SKU = 560715760, 1, 0)) > 0, true, false) as sku_present
from `microstrategy-test-env.ZZ_ROCHIT_MARCUS_SANDPIT.MTH*` 
group by 1
order by 1
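
As for why the original query returns two rows: MTH8_AUG contributes no row, so H.SKU is NULL in the joined row, and the last ON clause includes H.SKU = I.SKU. A comparison with NULL never evaluates to TRUE, so the September row finds no match and the full outer join emits it as a separate row. If you want to keep the join form, one option is to compare each new table against the COALESCE of the SKUs joined so far. The following is a minimal sketch under stated assumptions, not a tested rewrite of your query: the CTEs jul, aug and sep are hypothetical inline stand-ins for three of your monthly tables.

```sql
-- Hypothetical inline data standing in for MTH7_JUL, MTH8_AUG and MTH9_SEP.
WITH jul AS (SELECT 560715760 AS SKU),
     aug AS (SELECT * FROM (SELECT 560715760 AS SKU) WHERE FALSE),  -- no rows: SKU absent in August
     sep AS (SELECT 560715760 AS SKU)
SELECT G.SKU AS SKU_G, H.SKU AS SKU_H, I.SKU AS SKU_I
FROM jul G
FULL OUTER JOIN aug H
  ON G.SKU = H.SKU
FULL OUTER JOIN sep I
  -- COALESCE skips the NULL H.SKU, so the September row is compared with G.SKU
  -- instead of failing on H.SKU = I.SKU.
  ON COALESCE(G.SKU, H.SKU) = I.SKU
```

Applied to the full query, each ON clause would become COALESCE(A.SKU, B.SKU, ...) = X.SKU, where X is the table being joined; that should give the single row you expect, with only SKU_H null.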

Hope this helps.

【Comments】:
