【Posted】: 2018-08-13 17:32:18
【Question】:
Given a set of rows with a field that is sometimes null and sometimes not:
SELECT
Date, TheThing
FROM MyData
ORDER BY Date
Date                    TheThing
----------------------- --------
2016-03-09 08:17:29.867 a
2016-03-09 08:18:33.327 a
2016-03-09 14:32:01.240 NULL
2016-10-21 19:53:49.983 NULL
2016-11-12 03:25:21.753 b
2016-11-24 07:43:24.483 NULL
2016-11-28 16:06:23.090 b
2016-11-28 16:09:07.200 c
2016-12-10 11:21:55.807 c
I want a ranking column that counts only the non-null values:
Date                    TheThing DesiredTotal
----------------------- -------- ------------
2016-03-09 08:17:29.867 a        1
2016-03-09 08:18:33.327 a        2
2016-03-09 14:32:01.240 NULL     2 <--- notice it's still 2 (good)
2016-10-21 19:53:49.983 NULL     2 <--- notice it's still 2 (good)
2016-11-12 03:25:21.753 b        3
2016-11-24 07:43:24.483 NULL     3 <--- notice it's still 3 (good)
2016-11-28 16:06:23.090 b        4
2016-11-28 16:09:07.200 c        5
2016-12-10 11:21:55.807 c        6
I tried the obvious:
SELECT
Date, TheThing,
RANK() OVER(ORDER BY Date) AS Total
FROM MyData
ORDER BY Date
But RANK() counts the nulls:
Date                    TheThing Total
----------------------- -------- -----
2016-03-09 08:17:29.867 a        1
2016-03-09 08:18:33.327 a        2
2016-03-09 14:32:01.240 NULL     3 <--- notice it is 3 (bad)
2016-10-21 19:53:49.983 NULL     4 <--- notice it is 4 (bad)
2016-11-12 03:25:21.753 b        5 <--- and all the rest are wrong (bad)
2016-11-24 07:43:24.483 NULL     6
2016-11-28 16:06:23.090 b        7
2016-11-28 16:09:07.200 c        8
2016-12-10 11:21:55.807 c        9
How can I tell RANK() (or DENSE_RANK()) not to count nulls?
Have you tried using PARTITION?
Why yes! It's much worse:
SELECT
Date, TheThing,
RANK() OVER(PARTITION BY(CASE WHEN TheThing IS NOT NULL THEN 1 ELSE 0 END) ORDER BY Date) AS Total
FROM MyData
ORDER BY Date
But RANK() still counts the nulls:
Date                    TheThing Total
----------------------- -------- -----
2016-03-09 08:17:29.867 a        1
2016-03-09 08:18:33.327 a        2
2016-03-09 14:32:01.240 NULL     1 <--- reset to 1?
2016-10-21 19:53:49.983 NULL     2 <--- why go up?
2016-11-12 03:25:21.753 b        3
2016-11-24 07:43:24.483 NULL     3 <--- didn't reset?
2016-11-28 16:06:23.090 b        4
2016-11-28 16:09:07.200 c        5
2016-12-10 11:21:55.807 c        6
At this point I'm just typing things at random, flailing wildly:
SELECT
Date, TheThing,
RANK() OVER(PARTITION BY(CASE WHEN TheThing IS NOT NULL THEN 1 ELSE NULL END) ORDER BY Date) AS Total
FROM MyData
ORDER BY Date
SELECT
Date, TheThing,
DENSE_RANK() OVER(PARTITION BY(CASE WHEN TheThing IS NOT NULL THEN 1 ELSE NULL END) ORDER BY Date) AS Total
FROM MyData
ORDER BY Date
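Edit: With all the answers, it took me several iterations to find all the edge cases I didn't want. In the end, what I conceptually wanted was OVER() applied to a count. I had no idea OVER worked with anything other than RANK (and DENSE_RANK).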
http://sqlfiddle.com/#!18/c6d87/1
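A minimal sketch of the "OVER() applied to a count" idea from the edit above, assuming SQL Server 2012 or later. The temp table #MyData is a stand-in for MyData and holds the sample rows from the question. It relies on COUNT(expression) skipping NULLs, so a windowed COUNT(TheThing) acts as a running tally of non-null rows.

```sql
-- Stand-in for MyData, loaded with the sample rows from the question.
CREATE TABLE #MyData (Date datetime NOT NULL, TheThing varchar(10) NULL);

INSERT INTO #MyData (Date, TheThing) VALUES
    ('2016-03-09T08:17:29.867', 'a'),
    ('2016-03-09T08:18:33.327', 'a'),
    ('2016-03-09T14:32:01.240', NULL),
    ('2016-10-21T19:53:49.983', NULL),
    ('2016-11-12T03:25:21.753', 'b'),
    ('2016-11-24T07:43:24.483', NULL),
    ('2016-11-28T16:06:23.090', 'b'),
    ('2016-11-28T16:09:07.200', 'c'),
    ('2016-12-10T11:21:55.807', 'c');

-- COUNT(TheThing) ignores NULLs, so each row gets the number of
-- non-null TheThing values seen up to and including that row.
SELECT
    Date, TheThing,
    COUNT(TheThing) OVER (
        ORDER BY Date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS Total
FROM #MyData
ORDER BY Date;

DROP TABLE #MyData;
```

On this sample data the Total column should match DesiredTotal. The explicit ROWS frame only matters if two rows share the same Date; with the default RANGE frame, tied rows would all receive the same count.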
Bonus Reading
【Comments】:
-
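I don't think you can do this in a single query. First you need to filter for the records that are not NULL, then select all records using the rank you created earlier.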
-
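Is TheThing ever anything other than Frob or null? Would that affect the running total? Please show all the edge cases.
-
Would Rank() combined with case when TheThing is NULL then Lag ... else TheThing end, using the previous row's value, get you anywhere? If so, it would probably need a fudge factor to handle an initial null value.
-
@HABO Or two NULLs in a row, because what LAG() would need to know is not a constant but how many rows back the last non-NULL was.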
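The first comment's suggestion (rank only the non-NULL rows, then attach that rank to every row) could be sketched roughly as follows. This is one possible reading of the comment, not the commenter's own code. The MyData CTE is a stand-in built from the question's sample rows, and Ranked and NonNullRank are hypothetical names introduced for illustration. Using CTEs lets both steps live in one statement.

```sql
WITH MyData (Date, TheThing) AS (
    -- Stand-in for the real table: the sample rows from the question.
    SELECT CAST(v.Date AS datetime), v.TheThing
    FROM (VALUES
        ('2016-03-09T08:17:29.867', 'a'),
        ('2016-03-09T08:18:33.327', 'a'),
        ('2016-03-09T14:32:01.240', NULL),
        ('2016-10-21T19:53:49.983', NULL),
        ('2016-11-12T03:25:21.753', 'b'),
        ('2016-11-24T07:43:24.483', NULL),
        ('2016-11-28T16:06:23.090', 'b'),
        ('2016-11-28T16:09:07.200', 'c'),
        ('2016-12-10T11:21:55.807', 'c')
    ) AS v (Date, TheThing)
),
Ranked AS (
    -- Step 1: rank only the rows where TheThing is not NULL.
    SELECT Date, RANK() OVER (ORDER BY Date) AS NonNullRank
    FROM MyData
    WHERE TheThing IS NOT NULL
)
-- Step 2: every row takes the highest rank found at or before its Date.
-- COALESCE covers leading NULL rows, which have no earlier ranked row.
SELECT
    d.Date, d.TheThing,
    COALESCE(MAX(r.NonNullRank), 0) AS Total
FROM MyData AS d
LEFT JOIN Ranked AS r
    ON r.Date <= d.Date
GROUP BY d.Date, d.TheThing
ORDER BY d.Date;
```

The GROUP BY assumes that Date values are unique, as they are in the sample data. The self-join compares every row against all earlier ranked rows, so on a large table a windowed COUNT is likely the cheaper route.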
Tags: sql sql-server tsql