【Question title】: Building an Employee Date Matrix
【Posted】: 2020-03-09 19:18:38
【Question】:

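I have a list of employees and a list of the cities they have worked in. I need to build a matrix of start/end dates by city (in SQL Server) to determine where each employee was located during any given period.

The "end date" is simply the day before they show up in a new location.

I have included a sample of the source table, what the output table should show, and code to build a temp table.

Any suggestions on how to design this query and which functions to use?

Source: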

EMP     DATE        LOCATION
-----------------------------------
Pinal   2020-01-01  Bangalore
Pinal   2020-01-02  Bangalore
Pinal   2020-01-04  Uttar Pradesh
Pinal   2020-01-06  Uttar Pradesh
Pinal   2020-01-20  Mumbai
Pinal   2020-01-22  Bangalore

Desired query output:

EMP     DATE_FROM   DATE_TO     LOCATION
----------------------------------------------
Pinal   2020-01-01  2020-01-03  Bangalore
Pinal   2020-01-04  2020-01-19  Uttar Pradesh
Pinal   2020-01-20  2020-01-21  Mumbai
Pinal   2020-01-22  2099-01-01  Bangalore


CREATE TABLE #EMP 
(
    EMP VARCHAR(30) NOT NULL  ,
    DATE_WORKED DATE NOT NULL ,
    CITY VARCHAR(30) NOT NULL
);

INSERT INTO #EMP (EMP, DATE_WORKED, CITY) 
VALUES 
('Pinal','2020-01-01','Bangalore'),
('Pinal','2020-01-02','Bangalore'),
('Pinal','2020-01-04','Uttar Pradesh'),
('Pinal','2020-01-06','Uttar Pradesh'),
('Pinal','2020-01-20','Mumbai'),
('Pinal','2020-01-22','Bangalore')

【Question comments】:

    Tags: sql sql-server tsql sql-server-2008 matrix


    【Solution 1】:

    This is a classic gaps-and-islands problem.

    Here we use CROSS APPLY B to get the date range, and then CROSS APPLY C as an ad-hoc tally table.

    Example

     Select Emp
           ,FromDate = min(D)
           ,ToDate   = max(D)
           ,City
     From (
            -- Grp is constant within each unbroken run of days in one city
            Select *
                  ,Grp = datediff(day,'1900-01-01',D) - row_number() over (partition by Emp,City Order By D)
             From  #Emp A
             -- B: the day before this employee's next recorded date (or a far-future default)
             Cross Apply (
                           Select NextDate = IsNull(min(DateAdd(DAY,-1,Date_Worked)),'2025-01-01')
                            From  #Emp 
                            Where Emp=A.Emp and Date_Worked>A.Date_Worked
                         ) B
             -- C: ad-hoc tally table, one row per day from Date_Worked through NextDate
             Cross Apply (
                           Select Top (DateDiff(DAY,Date_Worked,NextDate)+1) 
                                  D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),Date_Worked) 
                            From  master..spt_values n1,master..spt_values n2
                         ) C
          ) A
     Group By Emp,City,Grp
     Order by Emp,FromDate
    

    Returns

    Emp     FromDate    ToDate      City
    Pinal   2020-01-01  2020-01-03  Bangalore
    Pinal   2020-01-04  2020-01-19  Uttar Pradesh
    Pinal   2020-01-20  2020-01-21  Mumbai
    Pinal   2020-01-22  2025-01-01  Banga
    Victor  2020-01-01  2020-01-18  NYC
    Victor  2020-01-19  2020-01-24  San Fran
    Victor  2020-01-25  2025-01-01  NYC
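
The Grp expression in the query above is what separates the islands. As a minimal sketch on a hypothetical five-day data set (the inline values and the alias T are illustrative and not part of the original answer), the day number minus the per-city row number stays constant only while the days in one city are consecutive:

```sql
-- Hypothetical mini data set: five calendar days for one employee,
-- with a two-day stay in a second city in the middle.
SELECT D,
       City,
       -- Day number minus per-city row number: unchanged within a run of
       -- consecutive days in the same city, different once the run is broken.
       Grp = DATEDIFF(DAY, '1900-01-01', D)
             - ROW_NUMBER() OVER (PARTITION BY City ORDER BY D)
FROM (VALUES (CAST('2020-01-01' AS DATE), 'Bangalore'),
             (CAST('2020-01-02' AS DATE), 'Bangalore'),
             (CAST('2020-01-03' AS DATE), 'Mumbai'),
             (CAST('2020-01-04' AS DATE), 'Mumbai'),
             (CAST('2020-01-05' AS DATE), 'Bangalore')) AS T (D, City)
ORDER BY D;
```

The first two Bangalore rows share one Grp value and the last Bangalore row gets a different one, so grouping by City and Grp yields two separate Bangalore ranges. Grp values can coincide across different cities, which is why City has to be part of the GROUP BY.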
    

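    【Comments】:

    • Thanks John! However, when you add employees, it produces multiple records for cities worked consecutively: CREATE TABLE #EMP (EMP VARCHAR(30), DATE_WORKED DATE, CITY VARCHAR(30)); INSERT INTO #EMP (EMP, DATE_WORKED, CITY) VALUES ('Pinal','2020-01-01','Bangalore'), ('Pinal','2020-01-02','Bangalore'), ('Pinal','2020-01-04','Uttar Pradesh'), ('Pinal','2020-01-06','Uttar Pradesh'), ('Pinal','2020-01-20','Mumbai'), ('Pinal','2020-01-22','Banga'), ('Pinal','2020-02-22','Banga'), ('Victor','2020-01-01','NYC'), ('Victor','2020-01-12','NYC'), ('Victor','2020-01-19','San Fran')
    • I'll take a look when I get home. An hour or so.
    • @DepthofField Corrected (I think). A partition was missing in the GRP calculation.
    • Hey John! Thanks again for taking another look! But now there's something else. If you add this value to the table, you'll see what I mean: ('Victor','2020-01-25','NYC')
    • @DepthofField Sometimes when working with small samples you can get false positives. I think I've got it this time :)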
    【Solution 2】:

    You can use window functions like this:

    with cte as
    (
        -- DISTINCT collapses two consecutive rows in the same city onto one DATE_FROM
        select distinct EMP,
               case when CITY = lag(CITY) over (partition by EMP order by DATE_WORKED)
                    then lag(DATE_WORKED) over (partition by EMP order by DATE_WORKED)
                    else DATE_WORKED
               end as DATE_FROM,
               CITY
        from #EMP a
    )
    select EMP, CITY as LOCATION, DATE_FROM,
           -- day before the same employee's next DATE_FROM; open-ended for the last row
           isnull(dateadd(d,-1,lead(DATE_FROM) over (partition by EMP order by DATE_FROM)), '2099-01-01') as DATE_TO
    from cte a
    

    Output:

    EMP LOCATION            DATE_FROM   DATE_TO
    Pinal   Bangalore       2020-01-01  2020-01-03
    Pinal   Uttar Pradesh   2020-01-04  2020-01-19
    Pinal   Mumbai          2020-01-20  2020-01-21
    Pinal   Bangalore       2020-01-22  2099-01-01
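
The query above relies on DISTINCT to merge rows, which only covers a run of two rows in the same city. A possible variation, sketched here under the assumption of SQL Server 2012 or later (LAG and LEAD are not available in SQL Server 2008, which the question is tagged with), keeps only the rows where the city changes and then looks ahead to the next such row. The CTE names marked and starts and the column PREV_CITY are illustrative and not from the original answer:

```sql
WITH marked AS
(
    -- Attach the previous city (per employee, in date order) to every row
    SELECT EMP, CITY, DATE_WORKED,
           LAG(CITY) OVER (PARTITION BY EMP ORDER BY DATE_WORKED) AS PREV_CITY
    FROM #EMP
),
starts AS
(
    -- Keep only the first row of each stay: no previous row, or a different city
    SELECT EMP, CITY, DATE_WORKED AS DATE_FROM
    FROM marked
    WHERE PREV_CITY IS NULL OR PREV_CITY <> CITY
)
SELECT EMP,
       DATE_FROM,
       -- A stay ends the day before the same employee's next stay begins
       ISNULL(DATEADD(DAY, -1, LEAD(DATE_FROM) OVER (PARTITION BY EMP ORDER BY DATE_FROM)),
              '2099-01-01') AS DATE_TO,
       CITY AS LOCATION
FROM starts
ORDER BY EMP, DATE_FROM;
```

Run against the #EMP sample from the question, this should produce the same four rows as the desired output. Because both window functions are partitioned by EMP, additional employees and longer runs in one city do not interfere with each other.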
    

    【Comments】:

      【Solution 3】:

      You can use a recursive CTE to group consecutive rows together, and then use the LEAD function to get the next date.

      Add a row_number to the data:

      select a.*,
             row_number() over (Partition by Emp Order by date_worked) as rownum
      INTO #Emp1
      From #Emp a
      

      Recursive CTE to assign groups:

      IF object_ID ('tempdb.dbo.#Temp1') is not null DROP TABLE #Temp1
      
      ;WITH CTE as
      (
      -- Anchor: each employee's first row starts group 1
      SELECT *,1 as grp 
      FROM #Emp1
      WHERE rownum=1 
      
      UNION ALL
      
      -- Step to the next row: keep the group while the city is unchanged, otherwise start a new one
      SELECT t.*,
              CASE WHEN c.City=t.City then c.grp 
              else c.grp+1 end as grp
      FROM #Emp1 t 
      INNER JOIN CTE c
                  ON t.rownum=c.rownum+1 and t.Emp=c.Emp
      )
      
      SELECT * 
      INTO #temp1 
      FROM CTE
      OPTION (maxrecursion 0);
      

      LEAD function to get the end date:

      SELECT emp,
             min(date_worked) as date_from,
             ISNULL(DATEADD(d,-1,lead(min(date_worked)) Over (Partition by emp Order by min(date_worked))),'01Jan2099') as date_to,
             City
      from #temp1
      Group by grp,emp,City
      

      Hope this gives you the desired result.
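
Once the date matrix exists, the lookup the question is ultimately after (where was an employee on a given date) becomes a simple range filter. A minimal sketch, assuming the result has been stored in a temp table; the table name #EMP_MATRIX and the variable @AsOf are hypothetical, and the rows are copied from the desired output in the question:

```sql
-- Hypothetical table holding the desired output from the question
CREATE TABLE #EMP_MATRIX
(
    EMP       VARCHAR(30) NOT NULL,
    DATE_FROM DATE        NOT NULL,
    DATE_TO   DATE        NOT NULL,
    LOCATION  VARCHAR(30) NOT NULL
);

INSERT INTO #EMP_MATRIX (EMP, DATE_FROM, DATE_TO, LOCATION)
VALUES
('Pinal','2020-01-01','2020-01-03','Bangalore'),
('Pinal','2020-01-04','2020-01-19','Uttar Pradesh'),
('Pinal','2020-01-20','2020-01-21','Mumbai'),
('Pinal','2020-01-22','2099-01-01','Bangalore');

-- Date to look up (illustrative value)
DECLARE @AsOf DATE = '2020-01-10';

-- The ranges do not overlap, so at most one row per employee matches
SELECT EMP, LOCATION
FROM #EMP_MATRIX
WHERE @AsOf BETWEEN DATE_FROM AND DATE_TO;
```

For 2020-01-10 this returns Pinal in Uttar Pradesh, since that date falls inside the 2020-01-04 to 2020-01-19 range.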

      【Comments】:
