SQL Pivot with unbalanced data: answers

【Question title】: SQL Pivot with unbalanced data
【Posted】: 2016-11-06 05:38:18
【Question】:

I am considering the PIVOT function, provided it can "fill in" data that is not contained in the main table. My table holds this data:

create table tmpData (objID INT, colID varchar(5), value varchar(50));
insert into tmpData (objId, colId, value) values(21, 'col1', 'a value');
insert into tmpData (objId, colId, value) values(21, 'col2', 'col2_1');
insert into tmpData (objId, colId, value) values(21, 'col2', 'col2_2_x'); -- a second 'value' for col_2
insert into tmpData (objId, colId, value) values(21, 'col3', 'col3_1');
insert into tmpData (objId, colId, value) values(22, 'col1', 'another value');
insert into tmpData (objId, colId, value) values(22, 'col2', 'col2_2');
insert into tmpData (objId, colId, value) values(22, 'col3', 'col3_2');

Using the PIVOT function

select *
from (
    select objID, colID, value
    from tmpData
) t
PIVOT (MAX(value) for colID in ([col1], [col2], [col3])) pivottable;

I get only one value (the maximum) in col2 for objID=21:

objID  col1           col2      col3
21     a value        col2_2_x  col3_1
22     another value  col2_2    col3_2
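
The single row for objID=21 follows from how PIVOT groups: every column of the derived table that is neither the aggregated column (value) nor the pivot column (colID) becomes an implicit grouping column, so this query groups by objID alone. As a sketch, a roughly equivalent conditional-aggregation form of the same query would be:

```sql
-- Roughly what the PIVOT above computes: one output row per objID.
select objID,
       max(case when colID = 'col1' then value end) as col1,
       max(case when colID = 'col2' then value end) as col2,
       max(case when colID = 'col3' then value end) as col3
from tmpData
group by objID;
```

With one group per objID, MAX can return only one of 'col2_1' and 'col2_2_x', which is why the second col2 value disappears.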

What I would like to get is all values, with the col1 and col3 values that are not given a second time for objID=21 filled in:

objID  col1           col2      col3
21     a value        col2_1    col3_1
21     a value        col2_2_x  col3_1
22     another value  col2_2    col3_2

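Is this possible with the PIVOT function, or in some other way? Thanks in advance, Jörg

【Comments on the question】:

Which DBMS are you using? The syntax looks like SQL Server.
In the end it has to work on both SQL Server and Oracle.

Tags: sql sql-server tsql pivot aggregate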

【Solution 1】:

You seem to (sort of) want lists in the columns. If you can live with this result:

objID  col1           col2      col3
21     a value        col2_1    col3_1
21     NULL           col2_2_x  NULL
22     another value  col2_2    col3_2

then you can do this by enumerating the values:

select objId,
       max(case when colId = 'col1' then value end) as col1,
       max(case when colId = 'col2' then value end) as col2,
       max(case when colId = 'col3' then value end) as col3
from (select d.*,
             -- Number the values within each (objId, colId). row_number() is used
             -- because dense_rank() with a constant ORDER BY ties every row at 1.
             row_number() over (partition by objId, colId order by (select NULL)) as seqnum
      from tmpData d
     ) t
group by objId, seqnum;

In SQL Server 2012+, you can use a cumulative max() to do what you want:

select objId,
       -- Inner max(): pivot within one (objId, seqnum) group.
       -- Outer max() over (...): running maximum across the groups of one objId.
       max(max(case when colId = 'col1' then value end)) over (partition by objId order by seqnum) as col1,
       max(max(case when colId = 'col2' then value end)) over (partition by objId order by seqnum) as col2,
       max(max(case when colId = 'col3' then value end)) over (partition by objId order by seqnum) as col3
from (select d.*,
             dense_rank() over (partition by objId, colId order by value) as seqnum
      from tmpData d
     ) t
group by objId, seqnum;

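Note that this query orders dense_rank() explicitly by value. That matters: within each column the values then ascend with seqnum, so the running max() returns the row's own value where one exists and otherwise carries the last earlier value forward.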
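
For versions where aggregate window functions do not accept ORDER BY (SQL Server before 2012), the same running maximum can be expressed with correlated subqueries. The following is only a sketch under that assumption and has not been run on SQL Server 2008; the CTE names `ranked` and `slots` are illustrative and not part of the original answer.

```sql
with ranked as (
    -- Same numbering as above: values ascend with seqnum within each column.
    select objId, colId, value,
           dense_rank() over (partition by objId, colId order by value) as seqnum
    from tmpData
),
slots as (
    -- One output row per (objId, seqnum).
    select distinct objId, seqnum from ranked
)
select s.objId,
       -- Largest value seen so far in each column, i.e. a running max.
       (select max(r.value) from ranked r
         where r.objId = s.objId and r.colId = 'col1' and r.seqnum <= s.seqnum) as col1,
       (select max(r.value) from ranked r
         where r.objId = s.objId and r.colId = 'col2' and r.seqnum <= s.seqnum) as col2,
       (select max(r.value) from ranked r
         where r.objId = s.objId and r.colId = 'col3' and r.seqnum <= s.seqnum) as col3
from slots s
order by s.objId, s.seqnum;
```

Each output column adds one correlated subquery, so this form may become slow on wide or large tables.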

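【Comments】:

Very quick answer. Thanks. The second SQL also works on Oracle :-). Do you know of a solution for SQL Server 2008?
I would like to follow up on this solution. The cumulative max() also works on Oracle. But real life is more complex than the example above: I have to handle more than 100 columns, not just 3. In that case Oracle fails with ORA-01467: sort key too long :-( Still, as long as you don't have that many columns, it works fine.
@DickerXXL . . . Having to sort on 100 columns seems suspicious.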

【Solution 2】:

Use CTEs and left joins to repeat the col1 and col3 values for every col2 value

;with
t  as (select distinct objID      from tmpData),
t1 as (select objID, value col1   from tmpData where colID = 'col1'),
t2 as (select objID, value col2   from tmpData where colID = 'col2'),
t3 as (select objID, value col3   from tmpData where colID = 'col3')
-- One CTE per output column; the joins match on objID only.
select t.objID, col1, col2, col3
from t
left join t1 on t.objID = t1.objID
left join t2 on t.objID = t2.objID
left join t3 on t.objID = t3.objID
order by t.objID;

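This also works if col1 or col3 has more than one value: because the joins match on objID only, each objID then returns every combination of its col1, col2, and col3 values.

【Comments】:

I would like to follow up on this solution. The "self join" also works on Oracle. But real life is more complex than the example above: I have to handle more than 100 columns, not just 3. In that case Oracle consumes > 100 GB of temp tablespace and returns no result even after several hours :-( Still, as long as you don't have that many columns, it works fine.
@DickerXXL Can you tell us the order of magnitude of the records, columns, and distinct objIDs?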