Apply sorting in STUFF with DISTINCT and ORDER BY

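【Question title】: apply sorting in stuff, stuff distinct and order by
【Posted】: 2019-06-17 07:31:24
【Question】:

I need to use STUFF to merge rows that share the same value into one. I got the row concatenation working, but the values come out in the wrong order.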

http://www.sqlfiddle.com/#!18/2f420/1/0

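As you can see in the results window, TypeB comes after TypeA, but the Amount values are not in the same sequence as the Type values: 3k should come first, followed by 0.09k.

How can I concatenate distinct values with STUFF while preserving the sequence (ordered by rn)?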

create table SomeTable (Code int, Type nvarchar(50), Amount nvarchar(50), Date datetime);

insert into SomeTable VALUES(20, 'TypeA', '12k', cast('01/01/2019' as datetime));
insert into SomeTable VALUES(20, 'TypeA', '11k', cast('01/01/2018' as datetime));
insert into SomeTable VALUES(22, 'TypeA', '17k', cast('01/02/2017' as datetime));
insert into SomeTable VALUES(22, 'TypeA', '17k', cast('01/01/2017' as datetime));
insert into SomeTable VALUES(25, 'TypeB', '0.09k', cast('01/02/2019' as datetime));
insert into SomeTable VALUES(25, 'TypeA', '3k', cast('01/01/2019' as datetime));

with t as (

  select 
      row_number() over(partition by st.Code order by st.Date) as rn,
      st.Code, 
      st.Type, 
      st.Amount, 
      st.Date 
  from SomeTable st
)

select 
  t1.Code,
  stuff((select distinct ',' + t.Type from t 
         where t.Code = t1.Code
         for XML path('')), 1,1, '') as Type,
  stuff((select distinct ',' + t.Amount from t 
         where t.Code = t1.Code
         for XML path('')), 1,1, '') as Amount,
  t1.Date
from t as t1
where t1.rn = 1
order by t1.Date

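【Comments on the question】:

Please show your SQL.
Please follow the link: sqlfiddle.com/#!18/2f420/1/0
Once again I see some misunderstanding of this concatenation pattern... The concatenation is done by the SELECT ... FOR XML subquery. STUFF is just a string function that removes the first separator from the result. ;)
Using DISTINCT does not seem right. If you have data such as typea, 10k; typea, 11k; typeb, 11k, the result will contain typea,typeb | 10k,11k, but clearly 10k and 11k do not line up with the types.
@Z.R.T. Are you sure you want to use STUFF for this? If you are on SQL Server 2017 or later, you can use the STRING_AGG function.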
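
As a rough sketch of the STRING_AGG suggestion in the last comment: the function is available from SQL Server 2017 onward, so it would not run on the SQL Server 2008 version this question is tagged with. STRING_AGG has no DISTINCT option, so this version assumes duplicates are removed first in two extra CTEs. The names type_rows and amount_rows, and the choice of max(rn) as the sort key, are assumptions made for illustration and are not part of the original post.

```sql
-- Sketch only: requires SQL Server 2017+ and the SomeTable from the question.
with t as (
  select
      row_number() over(partition by st.Code order by st.Date) as rn,
      st.Code, st.Type, st.Amount, st.Date
  from SomeTable st
),
-- One row per distinct Type within each Code, with the rn used for ordering.
type_rows as (
  select Code, Type, max(rn) as rn from t group by Code, Type
),
-- One row per distinct Amount within each Code, with the rn used for ordering.
amount_rows as (
  select Code, Amount, max(rn) as rn from t group by Code, Amount
)
select
  t1.Code,
  (select string_agg(tr.Type, ',') within group (order by tr.rn)
   from type_rows tr
   where tr.Code = t1.Code) as Type,
  (select string_agg(ar.Amount, ',') within group (order by ar.rn)
   from amount_rows ar
   where ar.Code = t1.Code) as Amount,
  t1.Date
from t as t1
where t1.rn = 1
order by t1.Date;
```

WITHIN GROUP (ORDER BY ...) makes the ordering part of the aggregate itself, so no STUFF call is needed to strip a leading separator.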

Tags: sql sql-server tsql sql-server-2008

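【Solution 1】:

Ordering is applied to any result in SQL with an order by clause.
It is not the stuff function that is giving you trouble; it is that you did not specify an order by for the subqueries.
Without an order by clause, a subquery returns its records in arbitrary order, which is why you get the results you see now.
Note, however, that because the order is arbitrary, you may get different results the next time you run the query.

So you must specify an order by clause in the subqueries that generate the CSV columns.

Now, I am not quite sure what order you expect, but it is probably either order by rn or order by rn desc (based on the image, I think it is the first).
There is a catch, though: since you want distinct values of type and amount, you cannot simply use rn in the order by clause. SQL Server would raise the following error:

ORDER BY items must appear in the select list if SELECT DISTINCT is specified.

So the trick is to use group by instead of distinct, and max(rn) instead of rn in the order by clause. That way you get TypeA,TypeB and 3k,0.09k, and the result stays consistent.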
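
To make the difference concrete, here is a minimal, self-contained sketch. The table variable @t and its three rows are made up for illustration and are not the question's data. The failing statement is left commented out so the batch runs as a whole.

```sql
-- Hypothetical sample data: two distinct types, three rows.
declare @t table (rn int, Type nvarchar(50));
insert into @t values (1, 'TypeB'), (2, 'TypeA'), (3, 'TypeB');

-- Fails: rn is not in the select list of a SELECT DISTINCT.
-- select distinct ',' + Type from @t order by rn for XML path('');

-- Works: group by yields one row per Type, and the aggregate max(rn)
-- gives each group a single value to sort on.
-- TypeA has max(rn) = 2 and TypeB has max(rn) = 3, so this returns ,TypeA,TypeB
select ',' + Type
from @t
group by Type
order by max(rn)
for XML path('');
```

Whether max(rn) or min(rn) is the right sort key depends on whether a repeated value should be placed at its last or its first occurrence.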

Having said all that, here is a revised version of your code (the cte stays the same):

select  t1.Code,
  stuff((
            select ','+ t.Type
            from t 
            where t.Code = t1.Code
            group by t.Type
            order by max(rn) 
            for XML path('')
        ), 1,1, '') as Type,
  stuff((select ',' + t.Amount 
         from t 
         where t.Code = t1.Code
         group by  t.Amount 
         order by max(rn) 
         for XML path('')), 1,1, '') as Amount,
  t1.Date
from t as t1
where t1.rn = 1
order by t1.Date

Result:

Code    Type            Amount      Date
22      TypeA           17k         2017-01-01
20      TypeA           11k,12k     2018-01-01
25      TypeA,TypeB     3k,0.09k    2019-01-01

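Note that Salman's comment on the question (which I only noticed now) makes a very valid point: distinct may not be a good choice here at all.
Suppose you have TypeA, TypeB, TypeA with the corresponding amounts 10K, 11K, 12K.
After removing duplicates, the result will be TypeA, TypeB and amounts 10K, 11K, 12K, and there is no way to tell which amount belongs to which type.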
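
One possible way around that ambiguity, sketched here as an assumption rather than something the asker requested, is to concatenate each type together with its amount so the pairing survives. The ':' separator and the TypeAmount column name are invented for this example; the query assumes the SomeTable from the question.

```sql
with t as (
  select
      row_number() over(partition by st.Code order by st.Date) as rn,
      st.Code, st.Type, st.Amount, st.Date
  from SomeTable st
)
select
  t1.Code,
  -- Group by both columns so each distinct (Type, Amount) pair appears once,
  -- ordered by the latest rn at which that pair occurs.
  stuff((select ',' + t.Type + ':' + t.Amount
         from t
         where t.Code = t1.Code
         group by t.Type, t.Amount
         order by max(t.rn)
         for XML path('')), 1, 1, '') as TypeAmount,
  t1.Date
from t as t1
where t1.rn = 1
order by t1.Date;
```

With the sample data from the question, Code 25 should come out as TypeA:3k,TypeB:0.09k, so each amount stays attached to its type.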

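【Discussion】:

Please note the last paragraph of my answer as well as Salman's comment. It is an important point. Glad I could help.

【Solution 2】:

Put the order by inside the stuff subquery: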

 select 
  t1.Code,
  stuff((select distinct top 100 percent  ',' + t.Type from t 
         where t.Code = t1.Code order by ',' + t.Type 
         for XML path('')), 1,1, '') as Type,
  stuff((select distinct ',' + t.Amount from t 
         where t.Code = t1.Code
         for XML path('')), 1,1, '') as Amount,
  t1.Date
from t as t1
where t1.rn = 1
order by t1.Date

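【Discussion】:

Try it now, I changed it a bit.
TypeA - 3k, TypeB - 0.09k; the result should be TypeA,TypeB and 3k,0.09k.
Applying that logic is up to you.
@Z.R.T. In SQL, matching up linked CSVs across different columns the way you need can be very hard. Even if you eventually succeed, the result can become hard to read and match up, especially when a column holds many CSV values (more than five or so). Personally, I would consider switching to another data retrieval/presentation strategy here, such as splitting the values into separate columns, or creating some kind of grouped rows in the client application or report (if the CSVs can be very dynamic).

【Solution 3】:

How about trying it this way?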

with t as (
  select 
      row_number() over(partition by st.Code order by st.Date) as rn,
      st.Code, 
      st.Type, 
      st.Amount, 
      st.Date 
  from SomeTable st
),
t2 as (
  select distinct top 100 percent
      st.code,
      st.type,
      st.amount
  from SomeTable st
  order by st.code, st.type, st.amount
)

select 
  t1.Code,
  stuff((select distinct ',' + t2.Type from t2 
         where t2.Code = t1.Code
         for XML path('')), 1,1, '') as Type,
  stuff((select ',' + t2.Amount from t2 
         where t2.Code = t1.Code
         for XML path('')), 1,1, '') as Amount,
  t1.Date
from t as t1
where t1.rn = 1
order by t1.Date

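【Discussion】:

Thank you very much for your answer. It works as I need, but the group by solution looks more appropriate.
This answer is wrong. top 100 percent ... order by does not affect the order of the results returned from a query on the common table expression, because database tables are unordered by nature. That is why SQL Server raises an error when you try to use order by without top (or for xml / for json) in common table expressions, views, and derived tables. Unfortunately, the T-SQL parser is not strict enough to reject queries that use the top 100 percent "trick", but that does not mean it will do what you expect it to do.
For more information, read SQL: SELECT TOP 100 PERCENT is a code smell for SQL Server, as well as Aaron Bertrand's answer on the topic and the Microsoft query optimizer team's blog post TOP 100 Percent ORDER BY Considered Harmful.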
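
To illustrate the comments above: an order by is only guaranteed to take effect in the query that actually produces the rows, which here is the FOR XML subquery itself. The following is a sketch of Solution 3 rewritten that way, assuming its sort keys (alphabetical by type and by amount) are kept as written. Note that this sorts by value rather than by rn, so for Code 25 the amounts would come out as 0.09k,3k, which is not the sequence the question asks for.

```sql
with t as (
  select
      row_number() over(partition by st.Code order by st.Date) as rn,
      st.Code, st.Type, st.Amount, st.Date
  from SomeTable st
)
select
  t1.Code,
  -- The order by sits inside the subquery that FOR XML concatenates,
  -- so it determines the order of the values in the string.
  stuff((select ',' + st.Type
         from SomeTable st
         where st.Code = t1.Code
         group by st.Type
         order by st.Type
         for XML path('')), 1, 1, '') as Type,
  stuff((select ',' + st.Amount
         from SomeTable st
         where st.Code = t1.Code
         group by st.Amount
         order by st.Amount
         for XML path('')), 1, 1, '') as Amount,
  t1.Date
from t as t1
where t1.rn = 1
order by t1.Date;
```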