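【Question title】: Google Big Query - Concatenate Year Ranges
【Posted】: 2026-01-14 22:00:01
【Question】:

I'm trying to consolidate a large data set related to vehicle fitment by grouping the year data. For example, a particular SKU in our database may fit a 2012 Hyundai Elantra GLS. The same SKU may also fit the same vehicle for 2013, 2014, and 2015. With a very small data set, the following query achieves what I'm looking for: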

SELECT
sku,
CASE
  WHEN MIN(YEAR) = MAX(YEAR) THEN MIN(YEAR)
  ELSE CONCAT(MIN(YEAR), '-', MAX(YEAR))
 END AS YEAR,
 make, model, submodel, notes
FROM
(SELECT @ldfnr:= IF((@old_make = tab.make
  AND @old_model = tab.model
  AND @old_submodel = tab.submodel
  AND @old_notes = tab.notes
  AND (@old_year = tab.`year`
  OR @old_year = tab.`year`-1)) , @ldfnr, @ldfnr+1) AS nr, tab.* ,
  @old_make := tab.make , @old_model := tab.model ,
  @old_submodel := tab.submodel , @old_notes := tab.notes ,
  @old_year := tab.`year`
FROM tableName AS tab,
  (SELECT @ldfnr:=0, @old_model:='', @old_submodel:='', @old_notes:='', @old_year:='', @old_make:=''  ) AS tmp
ORDER BY make, model, submodel, notes, `YEAR` ASC) AS mytab
GROUP BY nr
ORDER BY nr;

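However, our data set is very large. For that reason, I tried loading the data into Google BigQuery and running the same query there. Perhaps this is a limitation of Google BigQuery, but it keeps returning an error pointing at line 9, column 2, which is where the inner SELECT subquery begins.

I have some of our sample data on SQLFiddle for reference.

I was considering using AWS for this, but I wanted to try here first. I appreciate your time. :-)

Edited below:

Here is what the data looks like now: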

+------+------+-----------+-------+----------+------------------------------------------+
| SKU  | Year |   Make    | Model | Submodel |                  Notes                   |
+------+------+-----------+-------+----------+------------------------------------------+
| 0001 | 1995 | Chevrolet | Astro | Base     | Clear Lens; Chrome Housing; Pair; 1 pc.; |
| 0001 | 1995 | Chevrolet | Astro | CL       | Clear Lens; Chrome Housing; Pair; 1 pc.; |
| 0001 | 1995 | Chevrolet | Astro | LS       | Clear Lens; Chrome Housing; Pair; 1 pc.; |
| 0001 | 1996 | Chevrolet | Astro | Base     | Clear Lens; Chrome Housing; Pair; 1 pc.; |
| 0001 | 1996 | Chevrolet | Astro | CL       | Clear Lens; Chrome Housing; Pair; 1 pc.; |
| 0001 | 1996 | Chevrolet | Astro | LS       | Clear Lens; Chrome Housing; Pair; 1 pc.; |
| 0001 | 1997 | Chevrolet | Astro | Base     | Clear Lens; Chrome Housing; Pair; 1 pc.; |
| 0001 | 1997 | Chevrolet | Astro | LT       | Clear Lens; Chrome Housing; Pair; 1 pc.; |
| 0001 | 2001 | Chevrolet | Astro | Base     | Clear Lens; Chrome Housing; Pair; 1 pc.; |
+------+------+-----------+-------+----------+------------------------------------------+

Here is the desired result:

+------+-------------+-----------+-------+----------+------------------------------------------+
| SKU  |    Year     |   Make    | Model | Submodel |                  Notes                   |
+------+-------------+-----------+-------+----------+------------------------------------------+
| 0001 | 1995 - 1997 | Chevrolet | Astro | Base     | Clear Lens; Chrome Housing; Pair; 1 pc.; |
| 0001 | 1995 - 1996 | Chevrolet | Astro | CL       | Clear Lens; Chrome Housing; Pair; 1 pc.; |
| 0001 | 1995 - 1996 | Chevrolet | Astro | LS       | Clear Lens; Chrome Housing; Pair; 1 pc.; |
| 0001 | 1997        | Chevrolet | Astro | LT       | Clear Lens; Chrome Housing; Pair; 1 pc.; |
| 0001 | 2001        | Chevrolet | Astro | Base     | Clear Lens; Chrome Housing; Pair; 1 pc.; |
+------+-------------+-----------+-------+----------+------------------------------------------+

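I apologize for not including this earlier! :-)

【Comments】:

  • Edit your question and provide sample data and the desired results. There is no reason to think that MySQL-specific code would work in another database, especially code that uses user-defined variables.
  • @GordonLinoff Thanks for the tip! I certainly meant to include it, but it completely slipped my mind. Thank you very much for your help. :)

Tags: mysql sql google-bigquery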


【Solution 1】:

If you just want to concatenate year ranges, there is a simpler (and more portable) approach using window functions:

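select sku, make, model, submodel, notes,
       -- cast both branches to string so the case expression has a single type
       -- (assumes year is numeric, as the join condition below already does)
       (case when min(year) = max(year) then cast(min(year) as string)
             else cast(min(year) as string) || '-' || cast(max(year) as string)
        end) as year
from (select qt.*,
             -- a row with no match for the previous year starts a new range;
             -- the running sum numbers the ranges within each partition
             sum(case when qtprev.make is null then 1 else 0 end) over (partition by qt.make, qt.model, qt.notes, qt.submodel, qt.sku order by qt.year) as grp
      from `tint-world-aces-processing.aces_table.queryTest` qt left join
           `tint-world-aces-processing.aces_table.queryTest` qtprev
           on qt.make = qtprev.make and qt.model = qtprev.model and
              qt.notes = qtprev.notes and qt.submodel = qtprev.submodel and
              qt.sku = qtprev.sku and qt.year = qtprev.year + 1
     ) qt
-- grp must be part of the grouping key, otherwise separate ranges are merged
group by sku, make, model, submodel, notes, grp;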

(Note the subtle changes for Standard SQL.)
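
To make the grouping idea easier to follow outside SQL, here is a minimal Python sketch of the same logic: sort the rows within each (sku, make, model, submodel, notes) combination by year, start a new range whenever the gap to the previous year is larger than one, and label each range. The function name `collapse_year_ranges` and the in-memory `ROWS` list are illustrative assumptions, not part of the original answer; the rows themselves are the sample data from the question.

```python
from itertools import groupby

# Sample rows from the question: (sku, year, make, model, submodel, notes).
NOTES = "Clear Lens; Chrome Housing; Pair; 1 pc.;"
ROWS = [
    ("0001", 1995, "Chevrolet", "Astro", "Base", NOTES),
    ("0001", 1995, "Chevrolet", "Astro", "CL", NOTES),
    ("0001", 1995, "Chevrolet", "Astro", "LS", NOTES),
    ("0001", 1996, "Chevrolet", "Astro", "Base", NOTES),
    ("0001", 1996, "Chevrolet", "Astro", "CL", NOTES),
    ("0001", 1996, "Chevrolet", "Astro", "LS", NOTES),
    ("0001", 1997, "Chevrolet", "Astro", "Base", NOTES),
    ("0001", 1997, "Chevrolet", "Astro", "LT", NOTES),
    ("0001", 2001, "Chevrolet", "Astro", "Base", NOTES),
]


def collapse_year_ranges(rows):
    """Collapse consecutive years into ranges (hypothetical helper)."""

    def key(row):
        # Everything except the year: (sku, make, model, submodel, notes).
        return (row[0],) + row[2:]

    ranges = []
    ordered = sorted(rows, key=lambda r: (key(r), r[1]))
    for k, group in groupby(ordered, key=key):
        years = [r[1] for r in group]
        start = prev = years[0]
        for y in years[1:]:
            if y - prev > 1:  # gap found: close the current range
                ranges.append((k, start, prev))
                start = y
            prev = y
        ranges.append((k, start, prev))

    # Single-year ranges print as one year, others as "min - max".
    return [(k, str(a) if a == b else f"{a} - {b}") for k, a, b in ranges]


for k, label in collapse_year_ranges(ROWS):
    print(k[3], label)
```

On the sample data this prints one line per range: Base 1995 - 1997, Base 2001, CL 1995 - 1996, LS 1995 - 1996, and LT 1997, which matches the desired result in the question (in a different row order).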

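【Comments】:

  • That is certainly cleaner, and I really appreciate it! I tried running it right away but ended up with an error in BigQuery. The error I received was: Encountered "max" at line 3, column 36. Was expecting: "END". I don't have the strongest SQL skills, so I'm not quite sure what I'm missing here.
  • @BrianSchroeter . . . It was missing the concatenation operator.
【Solution 2】:

The following is for BigQuery Standard SQL and uses no JOIN: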

#standardSQL
WITH yourTable AS (
  SELECT 
    '0001' AS SKU, 1995 AS Year, 'Chevrolet' AS Make, 'Astro' AS Model, 'Base' AS Submodel, 
    'Clear Lens; Chrome Housing; Pair; 1 pc.;' AS Notes UNION ALL
  SELECT '0001', 1995, 'Chevrolet', 'Astro', 'CL', 'Clear Lens; Chrome Housing; Pair; 1 pc.;' UNION ALL
  SELECT '0001', 1995, 'Chevrolet', 'Astro', 'LS', 'Clear Lens; Chrome Housing; Pair; 1 pc.;' UNION ALL
  SELECT '0001', 1996, 'Chevrolet', 'Astro', 'Base', 'Clear Lens; Chrome Housing; Pair; 1 pc.;' UNION ALL
  SELECT '0001', 1996, 'Chevrolet', 'Astro', 'CL', 'Clear Lens; Chrome Housing; Pair; 1 pc.;' UNION ALL
  SELECT '0001', 1996, 'Chevrolet', 'Astro', 'LS', 'Clear Lens; Chrome Housing; Pair; 1 pc.;' UNION ALL
  SELECT '0001', 1997, 'Chevrolet', 'Astro', 'Base', 'Clear Lens; Chrome Housing; Pair; 1 pc.;' UNION ALL
  SELECT '0001', 1997, 'Chevrolet', 'Astro', 'LT', 'Clear Lens; Chrome Housing; Pair; 1 pc.;' UNION ALL
  SELECT '0001', 2001, 'Chevrolet', 'Astro', 'Base', 'Clear Lens; Chrome Housing; Pair; 1 pc.;'
)
SELECT SKU,
  IF(MIN(Year) = MAX(Year), 
    CAST(MIN(Year) AS STRING), 
    CONCAT(CAST(MIN(Year) AS STRING), ' - ', CAST(MAX(Year) AS STRING))
  ) AS Year, 
  Make, Model, Submodel, Notes
FROM (
  SELECT SKU, Year, Make, Model, Submodel, Notes, 
    SUM(Step) OVER(PARTITION BY SKU, Make, Model, Submodel, Notes ORDER BY Year) AS grp
  FROM (
    SELECT SKU, Year, Make, Model, Submodel, Notes, 
      IFNULL(SIGN(Year - 1 - LAG(Year) OVER(PARTITION BY SKU, Make, Model, Submodel, Notes ORDER BY Year)), 1) AS Step
    FROM yourTable  
  )
)
GROUP BY SKU, Make, Model, Submodel, Notes, grp
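
To see what the two nested subqueries compute, the sketch below traces the `Step` and `grp` values for a single partition in plain Python. It mimics `IFNULL(SIGN(Year - 1 - LAG(Year)), 1)` followed by the running `SUM(Step)`; the function name `steps_and_groups` is a hypothetical helper, and the input is assumed to be the sorted, distinct years of one (SKU, Make, Model, Submodel, Notes) combination.

```python
from itertools import accumulate


def steps_and_groups(years):
    """Trace Step and grp for one partition of sorted years (hypothetical helper)."""
    steps = []
    prev = None
    for y in years:
        if prev is None:
            step = 1  # IFNULL(..., 1): the first row has no LAG value
        else:
            diff = y - 1 - prev
            step = (diff > 0) - (diff < 0)  # SIGN: 0 for consecutive years, 1 after a gap
        steps.append(step)
        prev = y
    groups = list(accumulate(steps))  # SUM(Step) OVER (... ORDER BY Year)
    return steps, groups


# Years of the "Base" submodel from the sample data.
steps, groups = steps_and_groups([1995, 1996, 1997, 2001])
print(steps)   # [1, 0, 0, 1]
print(groups)  # [1, 1, 1, 2]
```

Rows that share a `grp` value end up in the same range, so 1995 to 1997 collapse into one row and 2001 stays on its own. One thing to watch: if the same year appeared twice in a partition, the difference would be -1 and the step would be negative, so this approach seems to rely on years being distinct within each partition.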

【Comments】: