In SQL Server 2005+:
SELECT oo.*
FROM (
    SELECT DISTINCT ProductID
    FROM Orders
) od
CROSS APPLY (
    SELECT TOP 1 ProductID, Date, CustomerID
    FROM Orders oi
    WHERE oi.ProductID = od.ProductID
    ORDER BY Date DESC
) oo
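Nominally, the plan for this query contains Nested Loops.
However, the outer input uses an Index Scan with a Stream Aggregate, and the inner input is an Index Seek on ProductID with a Top.
In practice the second operation is almost free: the index pages touched by the inner seek will most likely already be in cache, because the outer scan has just read them.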
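The plan described above depends on an index that leads on ProductID. The answer does not show its definition, so the following is only a sketch of one index that would allow both the scan and the per-product seek; the index name, the key order and the INCLUDE column are assumptions:

```sql
-- Hypothetical index, not taken from the original answer.
-- Keying on (ProductID, Date) lets the DISTINCT be answered by an ordered
-- index scan, and lets TOP 1 ... ORDER BY Date DESC be answered by a single
-- seek to the end of each ProductID range.
-- INCLUDE (CustomerID) is meant to make the index covering for this query.
CREATE INDEX IX_Orders_ProductID_Date
    ON Orders (ProductID, Date)
    INCLUDE (CustomerID);
```

Without a suitable index the inner side of the Nested Loops would likely have to scan or sort for every product, and the "almost free" argument would no longer hold.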
Here are the test results on 1,000,000 rows (with 100 distinct ProductID values):
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
(100 row(s) affected)
Table 'Orders'. Scan count 103, logical reads 6020, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 234 ms, elapsed time = 125 ms.
For comparison, this is the bare SELECT DISTINCT query on its own:
SELECT od.*
FROM (
    SELECT DISTINCT ProductID
    FROM Orders
) od
And its statistics:
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
(100 row(s) affected)
Table 'Orders'. Scan count 3, logical reads 5648, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 250 ms, elapsed time = 125 ms.
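As we can see, the performance is the same: CROSS APPLY costs only 372 extra logical reads (6020 vs. 5648), which will most likely never become physical reads.
I don't see how this query could be improved any further.
Another benefit of this query is that it parallelizes well. Notice that CPU time is about twice the elapsed time: that is due to parallelization on my old Core Duo.
A 4-core CPU should complete this query in about half that time.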
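One way to check how much of the elapsed time comes from parallelism is to run the same query with a serial plan and compare the timings. This is only a sketch: the MAXDOP query hint is standard T-SQL, but the answer reports no numbers for a serial run, so no result is claimed here.

```sql
-- Same CROSS APPLY query, restricted to a single scheduler.
-- With MAXDOP 1 the CPU time and elapsed time should end up close to each
-- other, which makes the gain from the parallel plan visible by comparison.
SELECT oo.*
FROM (
    SELECT DISTINCT ProductID
    FROM Orders
) od
CROSS APPLY (
    SELECT TOP 1 ProductID, Date, CustomerID
    FROM Orders oi
    WHERE oi.ProductID = od.ProductID
    ORDER BY Date DESC
) oo
OPTION (MAXDOP 1);  -- force a serial plan for this statement only
```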
A solution using a window function does not parallelize here:
SELECT od.*
FROM (
    SELECT ProductID, Date, CustomerID,
           ROW_NUMBER() OVER (PARTITION BY ProductID ORDER BY Date DESC) AS rn
    FROM Orders
) od
WHERE rn = 1
Here are its statistics:
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
(100 row(s) affected)
Table 'Orders'. Scan count 1, logical reads 5123, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 406 ms, elapsed time = 415 ms.
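The figures quoted in this answer have the shape of SET STATISTICS IO and SET STATISTICS TIME output. A rough way to build a comparable test is sketched below. The table definition, column types, data distribution and the OrderID column are all assumptions, since the answer does not show how its test table was built, so the exact read counts and timings will differ.

```sql
-- Hypothetical test table; column types are assumed, not from the answer.
CREATE TABLE Orders (
    OrderID    INT IDENTITY(1, 1) PRIMARY KEY,
    ProductID  INT      NOT NULL,
    [Date]     DATETIME NOT NULL,
    CustomerID INT      NOT NULL
);

-- Fill it with 1,000,000 rows spread evenly over 100 ProductID values.
-- The cross join of sys.all_columns is used only as a row generator.
INSERT INTO Orders (ProductID, [Date], CustomerID)
SELECT n % 100 + 1,                      -- 100 distinct products
       DATEADD(minute, n, '20000101'),   -- one order per minute
       n % 5000 + 1                      -- arbitrary customer spread
FROM (
    SELECT TOP 1000000
           CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS INT) AS n
    FROM sys.all_columns a
    CROSS JOIN sys.all_columns b
) nums;

-- Report logical/physical reads and CPU/elapsed time for each statement
-- run afterwards in this session.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
```

After this, running each of the queries above in the same session prints statistics in the format shown in this answer. An index leading on ProductID still has to be created separately before the plans described here can be expected.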