【Question title】: Clever way to reorder records?
【Posted】: 2021-02-02 10:38:04
【Question】:

I have a table with the following data:

VehicleID       Time                    ContextID                   Value
--------------- ----------------------- --------------------------  -----
359586015047188 2021-02-01 07:27:14.777 SafeProtectDeviceConnected  False
359586015047188 2021-02-01 07:53:38.000 SafeProtectProtectionLevel  5
359586015047188 2021-02-01 07:53:47.777 SafeProtectDeviceConnected  True
359586015047188 2021-02-01 10:24:20.777 SafeProtectDeviceConnected  False
359586015047188 2021-02-01 10:26:46.000 SafeProtectProtectionLevel  5
359586015047188 2021-02-01 10:26:55.777 SafeProtectDeviceConnected  True
359586015047188 2021-02-01 10:43:53.777 SafeProtectDeviceConnected  False
359586015047188 2021-02-01 10:46:01.000 SafeProtectProtectionLevel  5
359586015047188 2021-02-01 10:46:09.777 SafeProtectDeviceConnected  True
359586015047188 2021-02-01 11:02:16.777 SafeProtectDeviceConnected  False
359586015047188 2021-02-01 14:39:41.777 SafeProtectProtectionLevel  5
359586015047188 2021-02-01 14:39:42.777 SafeProtectDeviceConnected  True
359586015047188 2021-02-01 14:55:48.777 SafeProtectDeviceConnected  False
359586015047188 2021-02-02 07:52:12.777 SafeProtectDeviceConnected  True
359586015047188 2021-02-02 07:52:12.777 SafeProtectProtectionLevel  5
359586015047188 2021-02-02 07:52:32.777 SafeProtectDeviceConnected  False
359586015047188 2021-02-02 07:53:57.000 SafeProtectProtectionLevel  5
359586015047188 2021-02-02 07:54:10.777 SafeProtectDeviceConnected  True

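As you can see, the rows are inserted in time order. Sometimes, however, the recorded time is not correct, and I need a query that fixes the ordering.

If the data is consistent, I will always have a row where ContextID is SafeProtectDeviceConnected and Value is True, followed by a row where ContextID is SafeProtectProtectionLevel, whatever the Value column contains.

I found that I can use the LAG analytic function to access the previous row's values, and that I can also put one or more CASE expressions in the ORDER BY clause.

So the correct result after applying a fixing query to the result set above would be: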
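
As a minimal sketch of the LAG idea (assuming the rows live in a table named dbo.ContextDetail, the name that appears in the commented-out fragment of the ORDER BY attempt further down; adjust it to the real schema), the previous row's ContextID and Value can be placed next to each row:

```sql
-- Sketch only: dbo.ContextDetail is an assumed table name.
SELECT t.VehicleID,
       t.[Time],
       t.ContextID,
       t.[Value],
       -- ContextID and Value of the previous row of the same vehicle, ordered by Time
       LAG(t.ContextID) OVER (PARTITION BY t.VehicleID ORDER BY t.[Time]) AS PrevContextID,
       LAG(t.[Value])   OVER (PARTITION BY t.VehicleID ORDER BY t.[Time]) AS PrevValue
FROM dbo.ContextDetail AS t
ORDER BY t.VehicleID, t.[Time];
```

A SafeProtectProtectionLevel row whose PrevContextID is SafeProtectDeviceConnected and whose PrevValue is 'False' would then be a candidate for a row that was timestamped ahead of its True row.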

VehicleID       Time                    ContextID                   Value
--------------- ----------------------- --------------------------  -----
359586015047188 2021-02-01 07:27:14.777 SafeProtectDeviceConnected  False
359586015047188 2021-02-01 07:53:47.777 SafeProtectDeviceConnected  True
359586015047188 2021-02-01 07:53:38.000 SafeProtectProtectionLevel  5
359586015047188 2021-02-01 10:24:20.777 SafeProtectDeviceConnected  False
359586015047188 2021-02-01 10:26:55.777 SafeProtectDeviceConnected  True
359586015047188 2021-02-01 10:26:46.000 SafeProtectProtectionLevel  5
359586015047188 2021-02-01 10:43:53.777 SafeProtectDeviceConnected  False
359586015047188 2021-02-01 10:46:09.777 SafeProtectDeviceConnected  True
359586015047188 2021-02-01 10:46:01.000 SafeProtectProtectionLevel  5
359586015047188 2021-02-01 11:02:16.777 SafeProtectDeviceConnected  False
359586015047188 2021-02-01 14:39:42.777 SafeProtectDeviceConnected  True
359586015047188 2021-02-01 14:39:41.777 SafeProtectProtectionLevel  5
359586015047188 2021-02-01 14:55:48.777 SafeProtectDeviceConnected  False
359586015047188 2021-02-02 07:52:12.777 SafeProtectDeviceConnected  True
359586015047188 2021-02-02 07:52:12.777 SafeProtectProtectionLevel  5
359586015047188 2021-02-02 07:52:32.777 SafeProtectDeviceConnected  False
359586015047188 2021-02-02 07:54:10.777 SafeProtectDeviceConnected  True
359586015047188 2021-02-02 07:53:57.000 SafeProtectProtectionLevel  5

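Basically, if we read the Value column of the reordered result set above from top to bottom, we should see a True value, followed by one or more rows with a numeric value, and then a False.

What I have tried so far is:

ORDER BY t.VehiculeID, /*dbo.ContextDetail.Time*/ CASE WHEN t.ContextID='SafeProtectDeviceConnected' AND t.Value='True' THEN 1 END, CASE WHEN t.ContextID='SafeProtectProtectionLevel' THEN 2 END, CASE WHEN t.ContextID='SafeProtectDeviceConnected' AND t.Value='False' THEN 3 END

But that (obviously) gives me all the False rows, then all the numeric rows, and then the remaining True rows.

Is this a gaps-and-islands problem?
What is the right way to solve it?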
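
The effect can be reproduced on three inline rows. The sketch below is self-contained (the Key1, Key2 and Key3 aliases are illustrative names, not part of the original query) and relies on SQL Server sorting NULL before any non-NULL value in an ascending ORDER BY:

```sql
-- Each CASE has no ELSE, so it yields NULL for every row it does not match.
SELECT v.ContextID,
       v.[Value],
       CASE WHEN v.ContextID = 'SafeProtectDeviceConnected' AND v.[Value] = 'True'  THEN 1 END AS Key1,
       CASE WHEN v.ContextID = 'SafeProtectProtectionLevel'                         THEN 2 END AS Key2,
       CASE WHEN v.ContextID = 'SafeProtectDeviceConnected' AND v.[Value] = 'False' THEN 3 END AS Key3
FROM (VALUES ('SafeProtectDeviceConnected', 'True'),
             ('SafeProtectProtectionLevel', '5'),
             ('SafeProtectDeviceConnected', 'False')) AS v(ContextID, [Value])
ORDER BY Key1, Key2, Key3;
```

Only the True row has a non-NULL Key1, so it sorts last. Of the two rows with a NULL Key1, the False row also has a NULL Key2 and therefore comes first, which gives the False, numeric, True order described above.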

【Comments】:

  • Try this: ORDER BY VehicleID, CAST(CONVERT(NVARCHAR(8), [Time], 112) AS BIGINT)*10000 + DATEPART(HOUR, [Time])* 100 + DATEPART(MINUTE , [Time]), CASE WHEN ContextID = 'SafeProtectDeviceConnected' AND [Value] = 'False' THEN 1 WHEN ContextID = 'SafeProtectDeviceConnected' AND [Value] = 'True' THEN 2 ELSE 3 END
  • @Tyron78 That seems to work, but I would like to understand why.
  • I have added my query as an answer with a short explanation. Please check it, and feel free to ask if you have further questions.

Tags: sql-server gaps-and-islands


【Solution 1】:

Try this:

ORDER BY VehicleID, CAST(CONVERT(NVARCHAR(8), [Time], 112) AS BIGINT)*10000 + DATEPART(HOUR, [Time])* 100 + DATEPART(MINUTE, [Time]), CASE WHEN ContextID = 'SafeProtectDeviceConnected' AND [Value] = 'False' THEN 1 WHEN ContextID = 'SafeProtectDeviceConnected' AND [Value] = 'True' THEN 2 ELSE 3 END

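The main differences from your approach are as follows.

First, I removed everything finer than the minute from the sort key, because you mentioned that the time may be wrong. As I noticed afterwards, though, the times seem to be fine but not unique, so ordering by the time itself should also do the trick.

Second, I created a single CASE expression that considers both the context and the value, instead of one CASE per condition: you want to order by the combination of the two columns, not by each column separately. So the combination Connected + False gets 1, Connected + True gets 2, and everything else gets 3. In your query you created three CASE expressions that each return a value or NULL (CASE ... END without an ELSE), so in the end you added three separate values to sort by.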
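
To make the first sort key concrete, here is a small self-contained sketch that evaluates the same expression on two inline timestamps taken from the sample data (MinuteKey is an illustrative alias, not part of the answer's query):

```sql
-- The key is yyyymmdd * 10000 + hour * 100 + minute, so seconds and milliseconds are dropped.
SELECT v.[Time],
       CAST(CONVERT(NVARCHAR(8), v.[Time], 112) AS BIGINT) * 10000
         + DATEPART(HOUR, v.[Time]) * 100
         + DATEPART(MINUTE, v.[Time]) AS MinuteKey
FROM (VALUES (CAST('2021-02-01T14:39:41.777' AS datetime)),
             (CAST('2021-02-01T14:39:42.777' AS datetime))) AS v([Time]);
-- Both rows yield MinuteKey = 202102011439.
```

Because both rows share the same key, the CASE expression decides their relative order. Note that this only applies to rows falling into the same calendar minute; rows on either side of a minute boundary (for example 07:53:57 and 07:54:10) get different keys and stay in timestamp order.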

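【Comments】:

    【Solution 2】:

    As long as the times of the rows that must be grouped together do not overlap with those of other groups, you can use the Time column to group the rows and a case expression to determine the ordering within a single group. The solution below computes these new columns in a common table expression so that selecting and sorting become easier.

    Sample data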

    create table log
    (
      VehicleID bigint,
      Time datetime,
      ContextID nvarchar(50),
      Value nvarchar(10)
    );
    
    insert into log (VehicleID, Time, ContextID, Value) values
    (359586015047188, '2021-02-01 07:27:14.777', 'SafeProtectDeviceConnected', 'False'),
    (359586015047188, '2021-02-01 07:53:38.000', 'SafeProtectProtectionLevel', '5'),
    (359586015047188, '2021-02-01 07:53:47.777', 'SafeProtectDeviceConnected', 'True'),
    (359586015047188, '2021-02-01 10:24:20.777', 'SafeProtectDeviceConnected', 'False'),
    (359586015047188, '2021-02-01 10:26:46.000', 'SafeProtectProtectionLevel', '5'),
    (359586015047188, '2021-02-01 10:26:55.777', 'SafeProtectDeviceConnected', 'True'),
    (359586015047188, '2021-02-01 10:43:53.777', 'SafeProtectDeviceConnected', 'False'),
    (359586015047188, '2021-02-01 10:46:01.000', 'SafeProtectProtectionLevel', '5'),
    (359586015047188, '2021-02-01 10:46:09.777', 'SafeProtectDeviceConnected', 'True'),
    (359586015047188, '2021-02-01 11:02:16.777', 'SafeProtectDeviceConnected', 'False'),
    (359586015047188, '2021-02-01 14:39:41.777', 'SafeProtectProtectionLevel', '5'),
    (359586015047188, '2021-02-01 14:39:42.777', 'SafeProtectDeviceConnected', 'True'),
    (359586015047188, '2021-02-01 14:55:48.777', 'SafeProtectDeviceConnected', 'False'),
    (359586015047188, '2021-02-02 07:52:12.777', 'SafeProtectDeviceConnected', 'True'),
    (359586015047188, '2021-02-02 07:52:12.777', 'SafeProtectProtectionLevel', '5'),
    (359586015047188, '2021-02-02 07:52:32.777', 'SafeProtectDeviceConnected', 'False'),
    (359586015047188, '2021-02-02 07:53:57.000', 'SafeProtectProtectionLevel', '5'),
    (359586015047188, '2021-02-02 07:54:10.777', 'SafeProtectDeviceConnected', 'True');
    

    Solution

    Note: the case expression could be shortened a little. It is written out in full here for clarity.

    with cte as
    (
      select l.VehicleID,
             l.Time,
             l.ContextID,
             l.Value,
             (row_number() over(order by l.Time)-1)/3 as GroupNum,
             case
               when l.ContextID = 'SafeProtectDeviceConnected' and l.Value = 'False' then 1
               when l.ContextID = 'SafeProtectDeviceConnected' and l.Value = 'True'  then 2
               when l.ContextID = 'SafeProtectProtectionLevel'                       then 3
             end as GroupSort
      from log l
    )
    select cte.VehicleID,
           cte.Time,
           cte.ContextID,
           cte.Value
    from cte
    order by cte.GroupNum,
             cte.GroupSort;
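
    The GroupNum expression relies on integer division. As a small self-contained sketch (the inline values and the RowNum and WithoutMinus aliases are illustrative only), the arithmetic can be checked on six rows:

    ```sql
    -- ROW_NUMBER() returns bigint, so dividing by 3 is integer division.
    SELECT v.n,
           ROW_NUMBER() OVER (ORDER BY v.n)           AS RowNum,       -- 1, 2, 3, 4, 5, 6
           (ROW_NUMBER() OVER (ORDER BY v.n) - 1) / 3 AS GroupNum,     -- 0, 0, 0, 1, 1, 1
           ROW_NUMBER() OVER (ORDER BY v.n) / 3       AS WithoutMinus  -- 0, 0, 1, 1, 1, 2
    FROM (VALUES (10), (20), (30), (40), (50), (60)) AS v(n)
    ORDER BY v.n;
    ```

    The divisor 3 is the assumed number of rows per group, so this grouping only holds while every group has exactly three rows.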
    

    Result

    VehicleID        Time                     ContextID                   Value
    ---------------  -----------------------  --------------------------  -----
    359586015047188  2021-02-01 07:27:14.777  SafeProtectDeviceConnected  False
    359586015047188  2021-02-01 07:53:47.777  SafeProtectDeviceConnected  True
    359586015047188  2021-02-01 07:53:38.000  SafeProtectProtectionLevel  5
    359586015047188  2021-02-01 10:24:20.777  SafeProtectDeviceConnected  False
    359586015047188  2021-02-01 10:26:55.777  SafeProtectDeviceConnected  True
    359586015047188  2021-02-01 10:26:46.000  SafeProtectProtectionLevel  5
    359586015047188  2021-02-01 10:43:53.777  SafeProtectDeviceConnected  False
    359586015047188  2021-02-01 10:46:09.777  SafeProtectDeviceConnected  True
    359586015047188  2021-02-01 10:46:01.000  SafeProtectProtectionLevel  5
    359586015047188  2021-02-01 11:02:16.777  SafeProtectDeviceConnected  False
    359586015047188  2021-02-01 14:39:42.777  SafeProtectDeviceConnected  True
    359586015047188  2021-02-01 14:39:41.777  SafeProtectProtectionLevel  5
    359586015047188  2021-02-01 14:55:48.777  SafeProtectDeviceConnected  False
    359586015047188  2021-02-02 07:52:12.777  SafeProtectDeviceConnected  True
    359586015047188  2021-02-02 07:52:12.777  SafeProtectProtectionLevel  5
    359586015047188  2021-02-02 07:52:32.777  SafeProtectDeviceConnected  False
    359586015047188  2021-02-02 07:54:10.777  SafeProtectDeviceConnected  True
    359586015047188  2021-02-02 07:53:57.000  SafeProtectProtectionLevel  5
    

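    Fiddle to see it in action.

    【Comments】:

    • Could you please clarify the choice of the -1 and 3 values used in the GroupNum calculation?
    • The row_number() function produces 1, 2, 3, 4, 5, 6, ... Subtracting one makes it start at zero: 0, 1, 2, 3, 4, 5, ... Integer division by three (the number of rows in a group) turns that into 0, 0, 0, 1, 1, 1, ... Without the initial subtraction there would be only two zeros and the grouping would be wrong.
    • Thanks, I have to confirm that a group in my real data only ever has 3 rows.
    • I found that the number of my "SafeProtectProtectionLevel" rows may be unknown, so @Tyron78's answer fits my case best.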
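
    For the case raised in the last comment, where the number of SafeProtectProtectionLevel rows per group is not fixed, one possible variation (not taken from either answer) is to start a new group at every SafeProtectDeviceConnected / False row by keeping a running count of those rows. This is only a sketch under two assumptions: every group begins with such a False row, and the Time of the False rows can be trusted. It reuses the log sample table from Solution 2.

    ```sql
    with cte as
    (
      select l.VehicleID,
             l.Time,
             l.ContextID,
             l.Value,
             -- running count of "Connected = False" rows: each one opens a new group
             sum(case
                   when l.ContextID = 'SafeProtectDeviceConnected' and l.Value = 'False' then 1
                   else 0
                 end) over(partition by l.VehicleID
                           order by l.Time
                           rows unbounded preceding) as GroupNum,
             -- same in-group ordering as in Solution 2
             case
               when l.ContextID = 'SafeProtectDeviceConnected' and l.Value = 'False' then 1
               when l.ContextID = 'SafeProtectDeviceConnected' and l.Value = 'True'  then 2
               when l.ContextID = 'SafeProtectProtectionLevel'                       then 3
             end as GroupSort
      from log l
    )
    select cte.VehicleID,
           cte.Time,
           cte.ContextID,
           cte.Value
    from cte
    order by cte.VehicleID,
             cte.GroupNum,
             cte.GroupSort,
             cte.Time;  -- keeps several protection-level rows of one group in time order
    ```

    On the sample data this produces the same ordering as the expected result in the question. If a False row shares its exact Time with another row, the running count for that other row is not deterministic, so such ties would need an extra tiebreaker.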