Oracle SQL group by cube with distinct ID

【Question title】: Oracle SQL group by cube with distinct ID
【Posted】: 2025-12-06 22:05:01
【Question description】:

Sample data (the full table has more columns and millions of rows):

invoice_number |year            |department      |euros
-------------------------------------------------------------
1234           |2010            |1               | 200
1234           |2011            |1               | 200
1234           |2011            |2               | 200           
4567           |2010            |1               | 450
4567           |2010            |2               | 450
4567           |2010            |3               | 450
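
The sample rows can be loaded with a setup script along these lines. The column types are assumptions, since the real table definition is not shown; department is declared as text only because it appears left-aligned in the SQL*Plus output further down.

```sql
-- Assumed column types; the original DDL is not part of the post.
create table table1
( invoice_number number
, year           number(4)
, department     varchar2(20)
, euros          number
);

-- The six sample rows from the table above.
insert into table1 (invoice_number, year, department, euros) values (1234, 2010, '1', 200);
insert into table1 (invoice_number, year, department, euros) values (1234, 2011, '1', 200);
insert into table1 (invoice_number, year, department, euros) values (1234, 2011, '2', 200);
insert into table1 (invoice_number, year, department, euros) values (4567, 2010, '1', 450);
insert into table1 (invoice_number, year, department, euros) values (4567, 2010, '2', 450);
insert into table1 (invoice_number, year, department, euros) values (4567, 2010, '3', 450);

commit;
```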

My goal:

I want to sum the euros per year and per department, in every possible combination.

What the result should look like:

year             |department         |euros
--------------------------------------------
2010             |1                  |650
2010             |2                  |450
2010             |3                  |450
2010             |(null)             |650
2011             |1                  |200
2011             |2                  |200
2011             |(null)             |200
(null)           |1                  |650
(null)           |2                  |650
(null)           |3                  |450
(null)           |(null)             |650

My query:

select      year
,           department
,           sum(euros)
from        table1
group by    cube    (
                    year
            ,       department
                    )

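The problem:

An invoice number can appear in several categories. For example, one invoice can contain items from both 2010 and 2011. That is fine when I show the data per year. But when I want the total over all years, the euros are counted twice, once for each year. I want the functionality of "group by cube", but each aggregate should only sum distinct invoice numbers.

What the query currently returns: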

year             |department         |euros
--------------------------------------------
2010             |1                  |650
2010             |2                  |450
2010             |3                  |450
2010             |(null)             |1550
2011             |1                  |200
2011             |2                  |200
2011             |(null)             |400
(null)           |1                  |850
(null)           |2                  |650
(null)           |3                  |450
(null)           |(null)             |1950
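
The double counting can be made visible by comparing the number of rows with the number of distinct invoices in each cube cell. The following is a minimal sketch against the sample table; the column aliases are illustrative.

```sql
-- Diagnostic only: shows where the plain sum counts an invoice more than once.
select year
     , department
     , count(*)                       as row_count
     , count(distinct invoice_number) as invoice_count
     , sum(euros)                     as naive_sum
  from table1
 group by cube(year, department)
 order by year
     , department;
```

Wherever row_count exceeds invoice_count, sum(euros) has counted at least one invoice more than once. For the sample data the grand-total row covers six rows but only two invoices.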

Is it possible to do what I want? My searches so far have turned up nothing. I created an SQL Fiddle, and I hope it works.

【Question discussion】:

Tags: oracle group-by cube

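【Solution 1】:

[Previous "solution" removed]

New attempt: this is a rather ugly solution, but it seems to work, even when two invoices have the same amount. Because it accesses the table twice, you should check whether the performance is acceptable.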

with table1_cubed as
( -- every (year, department) combination produced by the cube
  select year
       , department
       , grouping_id(year,department) gid
    from table1
   group by cube(year,department)
)
, join_distinct_invoices as
( -- attach each matching invoice to its combination; distinct keeps it once
  select distinct x.*
       , r.invoice_number
       , r.euros
    from table1_cubed x
         inner join table1 r on (nvl(x.year,r.year) = r.year and nvl(x.department,r.department) = r.department)
)
select year
     , department
     , sum(euros)
  from join_distinct_invoices
 group by year
     , department
     , gid
 order by year
     , department
/

      YEAR DEPARTMENT           SUM(EUROS)
---------- -------------------- ----------
      2010 1                           650
      2010 2                           450
      2010 3                           450
      2010                             650
      2011 1                           200
      2011 2                           200
      2011                             200
           1                           650
           2                           650
           3                           450
                                       650

11 rows selected.

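【Discussion】:

When I add more columns with more data, performance degrades (obviously). With all columns included, the group by cube produces 287,912 rows. With a smaller set of columns I get "only" 4,488 rows. The second has acceptable performance, the first does not. But the solution does seem to work.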
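
One possible alternative, sketched here rather than taken from the thread, reads table1 only once: aggregate per invoice inside every cube cell first, then sum those per-invoice amounts. It assumes, as in the sample data, that euros is identical on every row of an invoice, so max(euros) returns the invoice amount. The names per_invoice and gid are illustrative.

```sql
with per_invoice as
( -- one row per invoice within each cube cell
  -- assumption: euros is the same on every row of an invoice
  select year
       , department
       , invoice_number
       , grouping_id(year, department) as gid
       , max(euros)                    as euros
    from table1
   group by invoice_number
       , cube(year, department)
)
-- gid keeps rolled-up cells apart from rows whose year or department is really null
select year
     , department
     , sum(euros) as euros
  from per_invoice
 group by gid
     , year
     , department
 order by year
     , department;
```

On the sample data this should return the same 11 rows as the query above. Further columns can be added to the cube(...) and grouping_id(...) lists, although the number of grouping sets still doubles with each added column.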

【Solution 2】:

select year
     , department
     , case when grouping_id(year,department) in (3) then sum(dist_euro) else sum(euros) end sums
     , decode(grouping_id(year,department),0,'NO GROUP',1,'DEPARTMENT IS NULL',2,'YEAR IS NULL',3,'TOTAL OVER ALL YEARS') info
  from ( select year
              , department
              , euros
              , case when row_number() over(partition by year order by year) = 1 then euros else 0 end dist_euro
           from table1
       )
 group by cube(year,department)
 order by grouping_id(year,department)
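
The labels in the decode follow from how GROUPING_ID packs its arguments into a bit vector: the first argument is the high bit, so GROUPING_ID(year, department) equals 2 * GROUPING(year) + GROUPING(department). A minimal sketch that lists these values for each cube cell (the column aliases are illustrative):

```sql
-- grouping(col) is 1 when col is rolled up in that row, otherwise 0
select year
     , department
     , grouping(year)                as year_rolled_up
     , grouping(department)          as department_rolled_up
     , grouping_id(year, department) as gid
  from table1
 group by cube(year, department)
 order by gid
     , year
     , department;
```

Rows with gid 0 are plain (year, department) groups, gid 1 marks the per-year subtotals, gid 2 the per-department subtotals, and gid 3 the grand total.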

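【Discussion】:

This would work if I only wanted the total for the "year" column. However, I need it to work for every column (so in this example for department as well), and for the 7 other columns in my real dataset. With your query, the sum for (null) in department is still incorrect.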