SQL Server: selecting multiple rows based on minimum dates and two other columns

【Question title】: SQL Server Selecting Multiple Rows based on minimum dates and two other columns
【Posted】: 2018-06-27 15:24:51
【Question】:

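UPDATE: Here is my actual query. If I set an employee ID in the inner select (the commented-out where clause), this query works perfectly. However, if I take that employee ID out to run the query for all employees, it selects (almost) every date for the employee.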

SELECT * FROM (
    SELECT id, shop_id, local_id, employee_id, invoice_date,
           LAG(shop_id)  OVER (ORDER BY invoice_date) AS new_Col1,
           LAG(local_id) OVER (ORDER BY invoice_date) AS new_Col2
    FROM tmpMemPayment
) AS A
WHERE NOT ( (shop_id = ISNULL(new_Col1, ''))
            AND (local_id = ISNULL(new_Col2, '')) )

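---- END UPDATE

I am attempting a complex query and can get close, but not close enough. I have tried partitioning, but still no joy.

What I am trying to do:

My table:

Col1     Col2   Date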

x        y      1/1/2018
x        y      1/1/2017
x        z      1/1/2016
x        y      1/1/2015
a        b      1/1/2014
a        b      1/1/2013
x        y      1/1/2012

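I need the row with the minimum date within each partition, where a partition is a run of consecutive dates over which col1 and col2 stay the same. That is, the result set I need is: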

x    y    1/1/2017
x    z    1/1/2016
x    y    1/1/2015
a    b    1/1/2013
x    y    1/1/2012
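
For anyone who wants to try the queries in this thread, the sample rows could be loaded roughly as follows. This is only a sketch: the table name MyTable comes from the query further down, but the column name DT (borrowed from Solution 1) and the column types are assumptions, since the question does not state them.

```sql
-- Assumed schema: the question gives only the column headings, not the types.
CREATE TABLE MyTable (
    Col1 varchar(10) NOT NULL,
    Col2 varchar(10) NOT NULL,
    DT   date        NOT NULL   -- the "Date" column; named DT as in Solution 1
);

-- The seven sample rows from the question (all dates are 1 January).
INSERT INTO MyTable (Col1, Col2, DT) VALUES
    ('x', 'y', '2018-01-01'),
    ('x', 'y', '2017-01-01'),
    ('x', 'z', '2016-01-01'),
    ('x', 'y', '2015-01-01'),
    ('a', 'b', '2014-01-01'),
    ('a', 'b', '2013-01-01'),
    ('x', 'y', '2012-01-01');
```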

The closest I have got is:

select t1.col1, t1.col2, min(t1.date)
from MyTable t1
where t1.Date <
(
    select max(t2.Date) from MyTable t2
    where (t2.col1 != t1.col1 or t2.col2 != t1.col2)
)
group by t1.col1, t1.col2

union

select t1.col1, t1.col2, min(t1.date)
from MyTable t1
where t1.Date >
(
    select min(t2.Date) from MyTable t2
    where (t2.col1 != t1.col1 or t2.col2 != t1.col2)
)
group by t1.col1, t1.col2

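【Comments】:

What is the logic you want? Why are rows 1 and 5 not in your result set?
Ah, thanks for asking, and sorry that was unclear. I need the row with the minimum date in each partition, where a partition is a run of consecutive dates over which col1 and col2 stay the same.
What are you trying to do? Posting a broken query without telling us what you want won't help us help you!
Sorry, GuidoG rightly called me out on this too. I have edited the question.
@user621713 Just curious whether you checked my answer. Were there any problems with it?

Tags: sql sql-server

【Solution 1】:

You can use the LEAD() function to compare the current row's values with those of the next row: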

;with cte as (select 
                    *
                  , LEAD(Col1, 1) over (ORDER BY DT desc) as Col1_Next 
                  , LEAD(Col2, 1) over (ORDER BY DT desc) as Col2_Next 
              from MyTable)

select Col1, Col2, DT from cte
where Col1 <> ISNULL(Col1_Next, '') OR Col2 <> ISNULL(Col2_Next, '')

Working example here: SQLFiddle

More information on the LEAD() function

【Discussion】:

【Solution 2】:

Try this:

WITH CTE
AS
(
SELECT 'x' as Col1, 'y' as Col2, '1/1/2018' as Date
UNION ALL
SELECT 'x' as Col1, 'y' as Col2, '1/1/2017' as Date
UNION ALL
SELECT 'x' as Col1, 'z' as Col2, '1/1/2016' as Date
UNION ALL
SELECT 'x' as Col1, 'y' as Col2, '1/1/2015' as Date
UNION ALL
SELECT 'a' as Col1, 'b' as Col2, '1/1/2014' as Date
UNION ALL
SELECT 'a' as Col1, 'b' as Col2, '1/1/2013' as Date
UNION ALL
SELECT 'x' as Col1, 'y' as Col2, '1/1/2012' as Date

)
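-- Actual query: keep a row when (col1, col2) differs from the previous row by date.
-- Note: Date is a string here; it sorts correctly only because every value is '1/1/YYYY'.
SELECT col1, col2, [Date]
FROM
(
   SELECT col1, col2, [Date],
          LAG(Col1) OVER (ORDER BY [Date]) AS new_Col1,
          LAG(Col2) OVER (ORDER BY [Date]) AS new_Col2
   FROM CTE
) AS A
WHERE NOT (
          (col1 = ISNULL(new_Col1, '')) AND (col2 = ISNULL(new_Col2, ''))
         )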

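【Discussion】:

So in the real case, I have a foreign key to an employee table, with many of these combinations and dates per employee. If I pass an employee ID into the inner select, it works fine. If I run it across the whole table, it returns almost every combination/date. I tried running it in a cursor, but over a table of 52 million records that would take days.
Here is my actual query. If I set an employee ID in the inner select (the commented-out where clause), this query works perfectly. However, if I take that employee ID out to run the query for all employees, it selects (almost) every date for the employee. SELECT * FROM (SELECT id, shop_id, local_id, employee_id, invoice_date, LAG(shop_id) OVER (ORDER BY invoice_date) as new_Col1, LAG(local_id) OVER (ORDER BY invoice_date) as new_Col2 FROM tmpMemPayment) as A WHERE NOT( (shop_id = ISNULL(new_Col1,'')) and (local_id = ISNULL(new_Col2,'')) )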
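
A likely cause of the behaviour described in this comment: without a PARTITION BY, LAG() compares each row with the previous invoice in the whole table, which will often belong to a different employee. The sketch below partitions the window by employee_id so each row is compared only with that employee's own previous invoice. It reuses the table and column names from the comment. Treating id as a unique tiebreaker for equal invoice dates, and shop_id and local_id as non-NULL, are assumptions, and the query has not been run against the real table.

```sql
-- Sketch, not a tested solution: first row of each run of identical
-- (shop_id, local_id) values, evaluated separately per employee.
SELECT id, shop_id, local_id, employee_id, invoice_date
FROM (
    SELECT id, shop_id, local_id, employee_id, invoice_date,
           LAG(shop_id)  OVER (PARTITION BY employee_id ORDER BY invoice_date, id) AS prev_shop_id,
           LAG(local_id) OVER (PARTITION BY employee_id ORDER BY invoice_date, id) AS prev_local_id,
           ROW_NUMBER()  OVER (PARTITION BY employee_id ORDER BY invoice_date, id) AS rn
    FROM tmpMemPayment
) AS A
WHERE rn = 1                        -- an employee's first invoice always starts a run
   OR shop_id  <> prev_shop_id      -- assumes shop_id is never NULL
   OR local_id <> prev_local_id;    -- assumes local_id is never NULL
```

Using rn = 1 for the first row avoids comparing against ISNULL(..., ''), which depends on the column types. On a table of this size, an index on (employee_id, invoice_date, id) may help the window sort, though that is a guess rather than something measured here.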

【Solution 3】:

I would just use lag(). Here is one method:

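select col1, col2, date
from (select t.*,
             lag(date) over (order by date) as prev_date,
             lag(date) over (partition by col1, col2 order by date) as prev_date_2
      from t
     ) t
-- prev_date_2 is null: first row for this (col1, col2) pair, so it starts a run.
-- prev_date <> prev_date_2: the previous row overall belongs to a different pair.
-- (Checking prev_date is null instead would drop rows such as x/z, whose
--  prev_date_2 is NULL. Assumes dates are unique.)
where prev_date_2 is null or prev_date <> prev_date_2;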

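The advantage of this approach over using lag() on the columns themselves is that it extends easily to any number of columns: they just go in the partition by clause rather than in the where clause.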
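
As a self-contained illustration of the date-comparison approach, the sketch below runs it over the question's seven sample rows. The inline VALUES list, the ISO date literals, and the alias names v, s and dt are assumptions added for the example. Note that the filter tests prev_date_2 is null first: a row that is the first of its (col1, col2) pair has a NULL prev_date_2, and a bare prev_date <> prev_date_2 comparison would not evaluate to true for it.

```sql
WITH t AS (
    -- Sample rows from the question, with the dates cast to a real date type.
    SELECT col1, col2, CAST(dt AS date) AS [date]
    FROM (VALUES ('x', 'y', '2018-01-01'),
                 ('x', 'y', '2017-01-01'),
                 ('x', 'z', '2016-01-01'),
                 ('x', 'y', '2015-01-01'),
                 ('a', 'b', '2014-01-01'),
                 ('a', 'b', '2013-01-01'),
                 ('x', 'y', '2012-01-01')) AS v(col1, col2, dt)
)
SELECT col1, col2, [date]
FROM (SELECT t.*,
             -- date of the previous row overall
             LAG([date]) OVER (ORDER BY [date]) AS prev_date,
             -- date of the previous row with the same (col1, col2)
             LAG([date]) OVER (PARTITION BY col1, col2 ORDER BY [date]) AS prev_date_2
      FROM t
     ) AS s
WHERE prev_date_2 IS NULL          -- first row ever seen for this pair
   OR prev_date <> prev_date_2     -- previous row overall was a different pair
ORDER BY [date] DESC;
```

Tracing the sample data by hand, this keeps the 2017, 2016, 2015, 2013 and 2012 rows, which matches the result set requested in the question. To group on more columns, they would be added to the PARTITION BY list only.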

【Discussion】: