【Question title】: Find gaps between date ranges for users in SQL Server 2008
【Posted】: 2019-09-21 01:19:38
【Question】:

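I need to know how much time (in days) users have logged, and the gaps between spans when nothing was logged. In this table I only store the id and its start and end dates (ID, INI, and FIN respectively). I've managed to detect gaps across three records by numbering each user's rows, then comparing the latest log with the one before it, and so on, based on conditions.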

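The problem is that I have people with n previous logs, and I can't write n left joins and n conditions. I'd like to make my current code more scalable, detecting these gaps more recursively, and more "understandable" to the user.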

CREATE TABLE [dbo].[baseRecurrentes](
    [ID] [nvarchar](8) NULL,
    [INI] [datetime] NULL,
    [FIN] [datetime] NULL
) ON [PRIMARY]
GO
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA1', CAST(0x0000A9C800000000 AS DateTime), CAST(0x0000A9E600000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA1', CAST(0x0000A9E700000000 AS DateTime), CAST(0x0000AA0200000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA1', CAST(0x0000AA0300000000 AS DateTime), CAST(0x0000AA2100000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA1', CAST(0x0000AA2200000000 AS DateTime), CAST(0x0000AA3F00000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA2', CAST(0x0000A9D600000000 AS DateTime), CAST(0x0000A9D900000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA2', CAST(0x0000A9EB00000000 AS DateTime), CAST(0x0000A9ED00000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA2', CAST(0x0000A9F000000000 AS DateTime), CAST(0x0000A9F100000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA3', CAST(0x0000AA1A00000000 AS DateTime), CAST(0x0000AA5A00000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA4', CAST(0x0000A9CA00000000 AS DateTime), CAST(0x0000A9CB00000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA5', CAST(0x0000A8DC00000000 AS DateTime), CAST(0x0000A8F100000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA5', CAST(0x0000A8F200000000 AS DateTime), CAST(0x0000A90F00000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA5', CAST(0x0000A91000000000 AS DateTime), CAST(0x0000A92E00000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA5', CAST(0x0000A92F00000000 AS DateTime), CAST(0x0000A94D00000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA5', CAST(0x0000A94E00000000 AS DateTime), CAST(0x0000A96B00000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA5', CAST(0x0000A96C00000000 AS DateTime), CAST(0x0000A98A00000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA5', CAST(0x0000A98B00000000 AS DateTime), CAST(0x0000A9A800000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA5', CAST(0x0000A9A900000000 AS DateTime), CAST(0x0000A9C700000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA5', CAST(0x0000A9C800000000 AS DateTime), CAST(0x0000A87900000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA5', CAST(0x0000A9E700000000 AS DateTime), CAST(0x0000AA0200000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA5', CAST(0x0000AA0300000000 AS DateTime), CAST(0x0000AA2100000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA5', CAST(0x0000AA2200000000 AS DateTime), CAST(0x0000AA3F00000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA5', CAST(0x0000AA4000000000 AS DateTime), CAST(0x0000AA5000000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA6', CAST(0x0000AA0900000000 AS DateTime), CAST(0x0000AA2900000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA7', CAST(0x0000A96C00000000 AS DateTime), CAST(0x0000A98A00000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA7', CAST(0x0000A98B00000000 AS DateTime), CAST(0x0000A9A800000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA7', CAST(0x0000A9A900000000 AS DateTime), CAST(0x0000A9C700000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA7', CAST(0x0000A85B00000000 AS DateTime), CAST(0x0000A87900000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA7', CAST(0x0000A9E700000000 AS DateTime), CAST(0x0000AA0200000000 AS DateTime))
INSERT [dbo].[baseRecurrentes] ([ID], [INI], [FIN]) VALUES (N'PERSONA7', CAST(0x0000AA0300000000 AS DateTime), CAST(0x0000AA2100000000 AS DateTime))

;WITH CTE AS (
    SELECT *, RN = ROW_NUMBER()OVER(PARTITION BY ID ORDER BY INI DESC)
    FROM BASERECURRENTES
), CTE2 AS (
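    -- One row per user, with that user's three most recent logs side by side
    SELECT DISTINCT T.ID,
    -- Most recent log (RN = 1)
    CAST(T2.INI AS DATETIME) AS INI1, CAST(T2.FIN AS DATETIME) AS FIN1, 
    -- Second most recent log (RN = 2)
    CAST(T3.INI AS DATETIME) AS INI2, CAST(T3.FIN AS DATETIME) AS FIN2, 
    -- Third most recent log (RN = 3)
    CAST(T4.INI AS DATETIME) AS INI3, CAST(T4.FIN AS DATETIME) AS FIN3 
    -- One LEFT JOIN per log: this is the part that does not scale to n logs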
    FROM CTE T
    LEFT JOIN CTE T2 ON T.ID = T2.ID AND T2.RN = 1 
    LEFT JOIN CTE T3 ON T.ID = T3.ID AND T3.RN = 2
    LEFT JOIN CTE T4 ON T.ID = T4.ID AND T4.RN = 3

), CTE3 AS (
    SELECT *, MSG = (CASE 
                                -- NO GAPS ACROSS THE 3 LATEST LOGS
                                WHEN (INI1 - 1 BETWEEN INI2 AND FIN2) AND (INI2 - 1 BETWEEN INI3 AND FIN3) THEN 'SEC2' 
                                -- NO GAP BETWEEN THE 2 LATEST LOGS
                                WHEN (INI1 - 1 BETWEEN INI2 AND FIN2) THEN 'SEC1' 
                                -- LATEST LOG DOES NOT FOLLOW THE PREVIOUS ONE (GAP, OR ONLY ONE LOG)
                                ELSE 'NO SEC'
                            END)

    FROM CTE2
)
SELECT * FROM CTE3
ORDER BY ID ASC

I expect to get a table showing the user ID, the "gap days" (the sum of the time not logged), and a message showing where the gap is.

ID       GD  MSG
-------------------
PERSONA2 5   GAP ON X-Y

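【Comments】:

  • So, what are the expected results for the data you have? This sounds like you're looking for a calendar or tally table.
  • @Larnu, something that shows the user that there is a gap in their records and, if possible, where the gap is. I have a calendar table, but pairing it with my current table hasn't made this any easier. I'm at a loss with this requirement.
  • @AbdónAraya . . . Nothing in your data model identifies a "user". Sample data and desired results in text-table format would help. Why are you representing the dates in hexadecimal format? That makes the question harder to understand.
  • @GordonLinoff ID corresponds to the user, and for some reason the MS script generator emitted the dates as hex.
  • Sounds like a classic "time series gaps" problem. See if this helps: tomaslind.net/2015/07/07/how-to-fill-in-gaps-in-time-series. Joe Celko's work on "time series" is also excellent for this kind of thing.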
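
On the hexadecimal dates raised in the comments: a minimal sketch, assuming the usual SQL Server `datetime` storage layout, in which the first four bytes are the number of days since 1900-01-01 and the last four are the time of day. The column aliases below are only illustrative.

```sql
-- Decode the first sample row for PERSONA1.
SELECT CAST(0x0000A9C800000000 AS datetime) AS INI,  -- 2019-01-01 00:00:00.000
       CAST(0x0000A9E600000000 AS datetime) AS FIN;  -- 2019-01-31 00:00:00.000

-- Same date built from the day count: 0xA9C8 = 43464 days after 1900-01-01.
SELECT DATEADD(DAY, 43464, CAST('19000101' AS datetime)) AS INI_From_Day_Count;
```

All sample rows have a zero time portion, so every value is a whole date at midnight.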

Tags: sql sql-server sql-server-2008 common-table-expression


【Solution 1】:

You can join the same CTE to itself using your ROW_NUMBER()s and look for gaps. Here's my suggestion:

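;WITH CTE AS (
    -- RN = 1 is each user's most recent log
    SELECT *, RN = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY INI DESC)
    FROM BASERECURRENTES
), CTE2 AS (
    -- ENDOFLAST = FIN of the log immediately before this one (NULL for the earliest log).
    -- ORDER BY b.RN makes TOP 1 pick the nearest earlier log deterministically.
    SELECT * 
         , (SELECT TOP 1 FIN FROM CTE b WHERE a.ID = b.ID AND b.RN > a.RN ORDER BY b.RN) ENDOFLAST
    FROM CTE a
), CTE3 AS (
    -- Negative when the previous log ended before this one started
    SELECT *
         , DATEDIFF(DAY, INI, ENDOFLAST) GAPDAYS
    FROM CTE2
)
SELECT *
FROM CTE3
WHERE GAPDAYS < -1 -- -1 means consecutive days, so anything lower is a gap
ORDER BY ID ASC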

And a link to the SQL Fiddle. I didn't fully understand the actual message, but I think the result set contains everything needed to build it.
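
One way the message might be built from that result set, as a sketch only. It assumes the asker's "GAP ON X-Y" means "last logged day to next logged day", and it takes GD as the positive form of GAPDAYS; both are assumptions, not something the question confirms. The self-join on RN + 1 replaces the TOP 1 subquery purely for brevity.

```sql
;WITH CTE AS (
    SELECT *, RN = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY INI DESC)
    FROM BASERECURRENTES
), CTE2 AS (
    -- Pair each log with the one immediately before it (next-higher RN).
    SELECT a.ID, a.INI, b.FIN AS ENDOFLAST
    FROM CTE a
    JOIN CTE b ON b.ID = a.ID AND b.RN = a.RN + 1
)
SELECT ID,
       GD  = DATEDIFF(DAY, ENDOFLAST, INI),
       -- Style 120 is yyyy-mm-dd hh:mi:ss; char(10) keeps only the date part.
       MSG = 'GAP ON ' + CONVERT(char(10), ENDOFLAST, 120)
           + ' - ' + CONVERT(char(10), INI, 120)
FROM CTE2
WHERE DATEDIFF(DAY, ENDOFLAST, INI) > 1   -- consecutive days are not a gap
ORDER BY ID, INI;
```

This returns one row per gap. Rows whose FIN is earlier than their INI (the sample data has one for PERSONA5) will show up as very large gaps on the following log, so they are worth checking separately.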

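【Comments】:

    【Solution 2】:

    I think this will give you the "recursive" result you're looking for, by comparing each of a user's spans with that user's previous span. This rolls all the results up per user; selecting from CTE2 instead returns the individual spans that have gaps or a reversed Ini/Fin.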

    ;WITH CTE AS (
        SELECT *,
            Person_RN = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY INI, FIN),
            Person_Count = COUNT(*) OVER (PARTITION BY ID)
        FROM BASERECURRENTES
    ), CTE2 AS (
        SELECT T.*,
            N_Days_Logged_On = 
            CASE 
                WHEN DATEDIFF(DAY, T.Ini, T.Fin) < 0 THEN NULL -- Reversed span
                ELSE DATEDIFF(DAY, T.Ini, T.Fin) 
            END,
            N_Days_Gap_From_Prev = 
            CASE
                WHEN T.Person_RN = 1 THEN NULL -- First span
                WHEN DATEADD(DAY, 1, tPrev.Fin) = t.Ini THEN 0 -- OP allowed a one-day gap
                ELSE DATEDIFF(DAY, tPrev.Fin, T.Ini) 
            END,
            tPrev.Ini AS Ini_Prev, tPrev.Fin AS Fin_Prev
        FROM CTE T
        LEFT JOIN CTE TPrev ON T.ID = TPrev.ID AND TPrev.Person_RN = (T.Person_RN - 1)
        -- LEFT JOIN CTE TNext ON T.ID = TNext.ID AND TNext.Person_RN = (T.Person_RN + 1) 
    ), CTE3 AS (
        SELECT ID,
            GD = SUM(N_Days_Gap_From_Prev),
            N_Days_Logged_On = SUM(N_Days_Logged_On),
            N_Logs = MAX(Person_Count),
            N_Spans_WO_Gaps = COUNT(CASE WHEN N_Days_Gap_From_Prev = 0 THEN ID ELSE NULL END),
            N_Spans_W_Gaps = COUNT(NULLIF(N_Days_Gap_From_Prev, 0)),
            -- Captures reversed/invalid(?) spans, plus incomplete spans (null Ini or Fin)
            N_Spans_Suspect = COUNT(CASE WHEN N_Days_Logged_On IS NULL THEN ID ELSE NULL END)
        FROM CTE2
        GROUP BY ID
    )
    SELECT *
    FROM CTE3
    ORDER BY ID
    

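    I made a couple of assumptions here. From your sample, it looks like the logon/logoff values are dates stored as datetimes, and the time portion is not a concern (e.g., a logoff at 00:00 is treated the same as one at 23:59). As you can see in CTE2, DATEDIFF results are not necessarily intuitive (e.g., the DATEDIFF from 2019-01-01 to 2019-01-31 is 30 days).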
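
The DATEDIFF behaviour mentioned above can be checked with a short query. DATEDIFF counts the day boundaries crossed between the two values rather than elapsed 24-hour periods; the column aliases are only illustrative.

```sql
SELECT
    -- 30 boundaries crossed between Jan 1 and Jan 31
    DATEDIFF(DAY, CAST('20190101' AS datetime), CAST('20190131' AS datetime)) AS Days_Between,
    -- 31 if both the start and end day should count as logged
    DATEDIFF(DAY, CAST('20190101' AS datetime), CAST('20190131' AS datetime)) + 1 AS Days_Inclusive,
    -- 1, even though only one minute has elapsed
    DATEDIFF(DAY, CAST('20190101 23:59' AS datetime), CAST('20190102 00:00' AS datetime)) AS Crosses_Midnight;
```

Whether to add 1 depends on whether the FIN day itself counts as logged, which the question does not state.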

    From the sample data, this returns:

    • PERSONA1: 4 logs, 0 gap days, 116 days logged on, 0 gaps
    • PERSONA2: 3 logs, 21 gap days, 6 days logged on, gaps on all spans.
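
If a single row per user in the asker's ID / GD / MSG shape is wanted, the gap locations could be concatenated with FOR XML PATH, which is available in SQL Server 2008. This is a sketch under assumptions: the GAPS CTE name and the message format are made up for illustration, and only gaps of more than one day are kept.

```sql
;WITH CTE AS (
    SELECT ID, INI, FIN,
           Person_RN = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY INI, FIN)
    FROM BASERECURRENTES
), GAPS AS (
    -- One row per gap: previous span's FIN and the current span's INI.
    SELECT T.ID, TPrev.FIN AS Fin_Prev, T.INI,
           Gap_Days = DATEDIFF(DAY, TPrev.FIN, T.INI)
    FROM CTE T
    JOIN CTE TPrev ON TPrev.ID = T.ID AND TPrev.Person_RN = T.Person_RN - 1
    WHERE DATEDIFF(DAY, TPrev.FIN, T.INI) > 1
)
SELECT G.ID,
       GD  = SUM(G.Gap_Days),
       -- STUFF removes the leading ', ' left by the concatenation.
       MSG = 'GAP ON ' + STUFF((
                 SELECT ', ' + CONVERT(char(10), G2.Fin_Prev, 120)
                      + '/' + CONVERT(char(10), G2.INI, 120)
                 FROM GAPS G2
                 WHERE G2.ID = G.ID
                 ORDER BY G2.INI
                 FOR XML PATH('')), 1, 2, '')
FROM GAPS G
GROUP BY G.ID
ORDER BY G.ID;
```

For PERSONA2 this gives GD = 21 (18 + 3), the same figure as above. Users without any gap, such as PERSONA1, do not appear in the output.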

    【Comments】:

    • It's almost exactly what I asked for, and with your code I managed to complete the requirement! Thanks!