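【Posted】: 2018-07-10 19:29:14
【Question】:
I am trying to combine two datasets, one of sales targets and one of actual sales, by day and market (US/UK).
To do this, I use a third table that builds a master list of the dates to report on with GENERATE_DATE_ARRAY, so that days with no target set and no sales reported do not leave gaps.
I found that my sales were being counted twice, so I have reduced my data and query to a reproducible case: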
#standardSQL
WITH dates AS (
  SELECT day FROM UNNEST(GENERATE_DATE_ARRAY(DATE '2018-07-05', '2018-07-09', INTERVAL 1 DAY)) AS day
),
targets AS (
  SELECT DATE '2018-07-06' AS day, 'UK' AS Market, NUMERIC '2.4' AS quantity
  UNION ALL SELECT '2018-07-06', "US", 8.4
  UNION ALL SELECT '2018-07-06', "US", 1.2
  UNION ALL SELECT '2018-07-08', "UK", 3.0
  UNION ALL SELECT '2018-07-08', "US", 10.9
),
sales AS (
  SELECT DATE '2018-07-08' AS day, 'UK' AS Market, 4 AS quantity
  UNION ALL SELECT '2018-07-06', 'US', 15
)
SELECT
  dates.day AS day,
  targets.market AS market,
  SUM(targets.quantity) AS targetQuantity,
  SUM(sales.quantity) AS quantity
FROM dates
LEFT JOIN targets
  ON dates.day = CAST(targets.day AS DATE)
LEFT JOIN sales
  ON dates.day = CAST(sales.day AS DATE) AND targets.market = sales.market
GROUP BY day, market
ORDER BY day, market
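This gives the following result:
The result shows a sales quantity of 30 reported for July 6 (row 3), even though the data contains 15.
This happens when the targets data has two rows for that date and market, but I don't know how to code around it.
Thanks for your help!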
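The duplication described above can be illustrated with a minimal sketch of one possible fix, assuming that targets and sales should each be summed to one row per day and market before they are joined. The CTE names targets_by_day and sales_by_day and the aliases t and s are hypothetical, and the literals are written with explicit DATE and NUMERIC types for clarity:

```sql
#standardSQL
WITH dates AS (
  SELECT day
  FROM UNNEST(GENERATE_DATE_ARRAY(DATE '2018-07-05', DATE '2018-07-09', INTERVAL 1 DAY)) AS day
),
targets AS (
  SELECT DATE '2018-07-06' AS day, 'UK' AS market, NUMERIC '2.4' AS quantity
  UNION ALL SELECT DATE '2018-07-06', 'US', NUMERIC '8.4'
  UNION ALL SELECT DATE '2018-07-06', 'US', NUMERIC '1.2'
  UNION ALL SELECT DATE '2018-07-08', 'UK', NUMERIC '3.0'
  UNION ALL SELECT DATE '2018-07-08', 'US', NUMERIC '10.9'
),
sales AS (
  SELECT DATE '2018-07-08' AS day, 'UK' AS market, 4 AS quantity
  UNION ALL SELECT DATE '2018-07-06', 'US', 15
),
-- Hypothetical helper CTE: collapse targets to one row per day and market,
-- so the later join cannot repeat a sales row once per target row.
targets_by_day AS (
  SELECT day, market, SUM(quantity) AS targetQuantity
  FROM targets
  GROUP BY day, market
),
-- Hypothetical helper CTE: do the same for sales.
sales_by_day AS (
  SELECT day, market, SUM(quantity) AS quantity
  FROM sales
  GROUP BY day, market
)
SELECT
  dates.day AS day,
  t.market AS market,
  t.targetQuantity AS targetQuantity,
  s.quantity AS quantity
FROM dates
LEFT JOIN targets_by_day AS t
  ON dates.day = t.day
LEFT JOIN sales_by_day AS s
  ON dates.day = s.day AND t.market = s.market
ORDER BY day, market
```

Because each side now has at most one row per day and market, the US sale of 15 on 2018-07-06 matches a single target row and is no longer summed twice. Like the original query, this sketch still takes the market from targets, so a sale with no target for that day and market would not appear.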
【Discussion】:
Tags: sql database join google-bigquery