SQL Server pivot query: alternatives or optimization

【Question title】: SQL Server pivot query alternative or optimize
【Posted】: 2017-01-21 14:59:10
【Question description】:

I have these tables:

-- tbl_obs
id  lat  lon   created
-------------------------
1   1.2  -2.1  2002-08-03
2   1.9  -5.5  2002-08-03
3   1.5  -4.1  2002-08-03

-- tbl_obsdata
id  name         value     obs_id
---------------------------------
1   gender       Male       1
2   type         Type I     1
3   description  Some desc  1
4   gender       Female     2
5   type         Type II    2
6   description  Some desc  2
7   gender       Female     3
8   type         Type II    3
9   description  Some desc  3

I want a query that combines the data from both tables, like this:

lat  lon   created     gender  type     description
---------------------------------------------------
1.2  -2.1  2002-08-03  Male    Type I   Some desc
1.9  -5.5  2002-08-03  Female  Type II  Some desc
1.5  -4.1  2002-08-03  Female  Type II  Some desc

I know I can do this with a pivot like the following:

with cte as (
 select obsdata.name, obsdata.value, obs.lat, obs.lon, obs.created
 from tbl_obsdata as obsdata
 left join tbl_obs as obs on obs.id = obsdata.obs_id
)
select lat, lon, created, gender, type, description
from cte
pivot(
 max(value)
 for [name] in (gender, type, description)
) as pvt

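So far this returns the result (I think), but I have about a million rows and it runs really slowly. Is there an alternative way to achieve this that would be faster? I am using SQL Server 2012.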

【Question comments】:

Tags: sql sql-server tsql sql-server-2012 pivot

【Solution 1】:

Another option is conditional aggregation:

Select A.lat
      ,A.lon
      ,A.created
      ,gender      = max(IIF(B.name='gender',B.value,null))
      ,type        = max(IIF(B.name='type',B.value,null))
      ,description = max(IIF(B.name='description',B.value,null))
 From  tbl_obs A
 Join  tbl_obsdata B on (A.id=B.obs_id)
 Group By A.lat
      ,A.lon
      ,A.created

lat lon     created     gender  type    description
1.2 -2.1    2002-08-03  Male    Type I  Some desc
1.5 -4.1    2002-08-03  Female  Type II Some desc
1.9 -5.5    2002-08-03  Female  Type II Some desc
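
One caveat with the query above: grouping only by lat, lon, and created would merge two observations that happen to share the same coordinates and date. The following is a minimal sketch that also groups by the observation key. It assumes tbl_obs.id is unique, which the sample data suggests but the question does not state.

```sql
-- Conditional aggregation, one output row per observation.
-- Assumption: tbl_obs.id is the primary key of tbl_obs.
SELECT A.lat,
       A.lon,
       A.created,
       MAX(CASE WHEN B.name = 'gender'      THEN B.value END) AS gender,
       MAX(CASE WHEN B.name = 'type'        THEN B.value END) AS type,
       MAX(CASE WHEN B.name = 'description' THEN B.value END) AS description
FROM tbl_obs AS A
JOIN tbl_obsdata AS B
  ON A.id = B.obs_id
GROUP BY A.id, A.lat, A.lon, A.created;
```

CASE is used here instead of IIF only as a matter of style; both are available in SQL Server 2012.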

【Comments】:

【Solution 2】:

First optimize the pivot, then the join. I think SQL Server does a reasonable job with pivots, so start with:

select obs_id, gender, type, description
from (select obs_id, name, value from tbl_obsdata) as src  -- leave out id so PIVOT groups by obs_id only
pivot (max(value) for [name] in (gender, type, description)
      ) as pvt;

Then create an index on tbl_obsdata(obs_id, name, value). This should be fairly fast.
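
The index described above could be created roughly as follows. The index names are made up for illustration, and the sketch assumes value is a reasonably short type such as varchar(100); the question does not give the column types.

```sql
-- Hypothetical index name; key columns as suggested in the answer.
-- Assumption: value is narrow enough to be an index key column.
CREATE NONCLUSTERED INDEX IX_tbl_obsdata_obs_id_name_value
    ON tbl_obsdata (obs_id, name, value);

-- Alternative if value is too wide for a key (e.g. varchar(max)):
-- carry it at the leaf level only.
-- CREATE NONCLUSTERED INDEX IX_tbl_obsdata_obs_id_name
--     ON tbl_obsdata (obs_id, name) INCLUDE (value);
```

Either form lets the pivot or the conditional aggregation read all three columns from the index, ordered by obs_id, without touching the base table.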

If it is, join in the rest:

with cte as (
      select obs_id, gender, type, description
      from (select obs_id, name, value from tbl_obsdata) as src
      pivot (max(value) for [name] in (gender, type, description)
            ) as pvt
    )
select obs.lat, obs.lon, obs.created,
       cte.gender, cte.type, cte.description
from cte join
     tbl_obs obs
     on obs.id = cte.obs_id;

Edit:

I also wonder how this would perform:

select obs.lat, obs.lon, obs.created, od.gender, od.type, od.description
from tbl_obs obs cross apply
     (select max(case when name = 'gender' then value end) as gender,
             max(case when name = 'type' then value end) as type,
             max(case when name = 'description' then value end) as description
      from tbl_obsdata od
      where od.obs_id = obs.id
     ) od;

This also needs the index on tbl_obsdata(obs_id, name, value).

【Comments】:

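Looks good now. Looking at the question, I think the CTE might be the cause of the slowness. Would you prefer a temp table (with an index) over the CTE?
@BhatiaAshish . . . I prefer a single query to temp tables unless the latter are absolutely necessary.
Pivots have performance problems and are usually slower than the solution John Capalletti gave here. Unpivots are fine, but since I work in a high-volume big-data environment, I never use pivots.
@btberry . . . Interesting. I basically never use pivot and prefer conditional aggregation. I thought that was just an old habit and that the pivot syntax had nearly the same performance. Do you have a reference on pivot performance?
@GordonLinoff Same here. I like to call it my "poor man's pivot", but it seems to be better after all: sqlsunday.com/2016/01/29/pivot-unpivot-and-performance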