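This is hard to answer at the current level of detail, but if the lesser value is always subtracted from the greater one (and neither is ever NULL), you could handle it with GROUP BY like this: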
    SELECT
        id,
        MAX(value) - MIN(value) AS new_value
    FROM
        `your-project.your_dataset.your_table`
    GROUP BY
        id
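From here, you could save these results as a new table, or save this query as a view definition (which is similar to having the result calculated on the fly as the underlying data changes).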
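As a rough sketch of both options, the statements below wrap the GROUP BY query in a view and, alternatively, materialize it as a table. The names `your_view` and `your_new_table` are placeholders introduced here for illustration; they do not come from the question.

```sql
-- Option 1: a view (hypothetical name `your_view`).
CREATE OR REPLACE VIEW `your-project.your_dataset.your_view` AS
SELECT
    id,
    MAX(value) - MIN(value) AS new_value
FROM
    `your-project.your_dataset.your_table`
GROUP BY
    id;

-- Option 2: a new table (hypothetical name `your_new_table`).
CREATE OR REPLACE TABLE `your-project.your_dataset.your_new_table` AS
SELECT
    id,
    MAX(value) - MIN(value) AS new_value
FROM
    `your-project.your_dataset.your_table`
GROUP BY
    id;
```

The view is evaluated each time it is queried, whereas the table is a snapshot that has to be rebuilt when the source data changes.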
Another option is to add a column to the table schema and then run an UPDATE query to populate it.
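A minimal, untested sketch of that approach, reusing the MAX minus MIN calculation per id. The column name `new_value` and its `INT64` type are assumptions; choose the type that matches `value` in your table.

```sql
-- Add the new column (name and type are assumptions).
ALTER TABLE `your-project.your_dataset.your_table`
ADD COLUMN new_value INT64;

-- Populate it by joining each row to the per-id result.
UPDATE `your-project.your_dataset.your_table` AS t
SET new_value = d.new_value
FROM (
    SELECT
        id,
        MAX(value) - MIN(value) AS new_value
    FROM
        `your-project.your_dataset.your_table`
    GROUP BY
        id
) AS d
WHERE t.id = d.id;
```

Every row for a given id receives the same `new_value`, and the UPDATE has to be rerun whenever the underlying values change.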
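If the lesser value isn't always the one being subtracted, and what matters instead is which value has the earlier date (with always exactly two rows per id), another approach is to use analytic (or window) functions to subtract the value with the earliest date from the value with the latest date: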
    SELECT
        DISTINCT
        id,
        (
            FIRST_VALUE(value) OVER(PARTITION BY id ORDER BY yearmonth DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
            -
            LAST_VALUE(value) OVER(PARTITION BY id ORDER BY yearmonth DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
        ) AS new_value
    FROM
        `your-project.your_dataset.your_table`
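Because analytic functions return a result for every source row rather than collapsing rows the way GROUP BY does, DISTINCT is needed to eliminate the duplicate rows.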
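To see the difference from the MAX minus MIN approach, here is a small self-contained sketch that runs the window-function query over inline sample data. The `sample` rows and the integer `yearmonth` encoding (for example `202001`) are made up for illustration.

```sql
WITH sample AS (
    -- Hypothetical data: two rows per id.
    SELECT 1 AS id, 202001 AS yearmonth, 10 AS value UNION ALL
    SELECT 1, 202002, 25 UNION ALL
    SELECT 2, 202001, 40 UNION ALL
    SELECT 2, 202002, 30
)
SELECT
    DISTINCT
    id,
    (
        -- Value with the latest yearmonth ...
        FIRST_VALUE(value) OVER(PARTITION BY id ORDER BY yearmonth DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
        -
        -- ... minus the value with the earliest yearmonth.
        LAST_VALUE(value) OVER(PARTITION BY id ORDER BY yearmonth DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
    ) AS new_value
FROM
    sample
-- id 1: 25 - 10 = 15
-- id 2: 30 - 40 = -10 (MAX - MIN would give 10 instead)
```

Without DISTINCT the query would return four rows, two identical ones per id.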
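If there could be more than two rows and you need to subtract all of the previous values from the latest value, you could handle it this way (this is also safe for NULLs or when there is only one row):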
    SELECT
        DISTINCT
        id,
        (
            FIRST_VALUE(value) OVER(PARTITION BY id ORDER BY yearmonth DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
            -
            (
                SUM(value) OVER(PARTITION BY id ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
                -
                FIRST_VALUE(value) OVER(PARTITION BY id ORDER BY yearmonth DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
            )
        ) AS new_value
    FROM
        `your-project.your_dataset.your_table`
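As a worked sketch of the "latest value minus all previous values" calculation, the query below uses made-up inline data with one id that has three rows and one id that has a single row. The `sample` rows and the integer `yearmonth` encoding are assumptions for illustration.

```sql
WITH sample AS (
    -- Hypothetical data: id 1 has three rows, id 2 has one.
    SELECT 1 AS id, 202001 AS yearmonth, 10 AS value UNION ALL
    SELECT 1, 202002, 20 UNION ALL
    SELECT 1, 202003, 50 UNION ALL
    SELECT 2, 202001, 7
)
SELECT
    DISTINCT
    id,
    (
        -- Latest value ...
        FIRST_VALUE(value) OVER(PARTITION BY id ORDER BY yearmonth DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
        -
        (
            -- ... minus the sum of everything except the latest value.
            -- With no ORDER BY, the window covers the whole partition.
            SUM(value) OVER(PARTITION BY id)
            -
            FIRST_VALUE(value) OVER(PARTITION BY id ORDER BY yearmonth DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
        )
    ) AS new_value
FROM
    sample
-- id 1: 50 - (80 - 50) = 20
-- id 2: 7 - (7 - 7) = 7
```

For the single-row id, the sum of previous values is zero, so the result is just that row's value.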
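Technically, you could do the same thing with grouping and by indexing into an ARRAY_AGG result, although this method is noticeably slower on larger datasets: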
    SELECT
        id,
        (
            ARRAY_AGG(value ORDER BY yearmonth DESC)[OFFSET(0)]
            -
            (
                SUM(value)
                -
                ARRAY_AGG(value ORDER BY yearmonth DESC)[OFFSET(0)]
            )
        ) AS new_value
    FROM
        `your-project.your_dataset.your_table`
    GROUP BY
        id