This article covers an interesting relationship between ROW_NUMBER() and DENSE_RANK() (the RANK() function is not treated specifically). When you need a ROW_NUMBER() in a SELECT DISTINCT statement, ROW_NUMBER() is computed before DISTINCT is applied, so every row gets a unique number and DISTINCT has no duplicates left to remove. For example, this query:
SELECT DISTINCT
  v,
  ROW_NUMBER() OVER (ORDER BY v) row_number
FROM t
ORDER BY v, row_number
...might produce this result (DISTINCT has no effect):
+---+------------+
| V | ROW_NUMBER |
+---+------------+
| a |          1 |
| a |          2 |
| a |          3 |
| b |          4 |
| c |          5 |
| c |          6 |
| d |          7 |
| e |          8 |
+---+------------+
Whereas this query:
SELECT DISTINCT
  v,
  DENSE_RANK() OVER (ORDER BY v) row_number
FROM t
ORDER BY v, row_number
...produces what you probably want in this case:
+---+------------+
| V | ROW_NUMBER |
+---+------------+
| a |          1 |
| b |          2 |
| c |          3 |
| d |          4 |
| e |          5 |
+---+------------+
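If you would rather keep ROW_NUMBER(), one possible alternative is to remove the duplicates first in a derived table and number the remaining rows afterwards. The following is a minimal sketch in PostgreSQL-style syntax. The inline data in the WITH clause is an assumption that mirrors the sample values shown above, and the names rn and d are illustrative only.

```sql
-- Assumed sample data: a stand-in for table t with a single column v
WITH t (v) AS (
  VALUES ('a'), ('a'), ('a'), ('b'), ('c'), ('c'), ('d'), ('e')
)
SELECT
  v,
  ROW_NUMBER() OVER (ORDER BY v) AS rn  -- numbers the already distinct rows
FROM (
  SELECT DISTINCT v FROM t              -- duplicates are removed here first
) AS d
ORDER BY v;

-- Result:
-- a | 1
-- b | 2
-- c | 3
-- d | 4
-- e | 5
```

Because the window function now runs on the output of the derived table, it only ever sees one row per distinct value.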
Note that for this to work properly, the ORDER BY clause of the DENSE_RANK() function must contain all the other columns of the SELECT DISTINCT clause.
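To see why the ORDER BY clause needs every DISTINCT column, consider a sketch with a second column. The column w, the inline data, and the aliases rank_by_v and rank_by_v_w are hypothetical and not part of the original example. The syntax is PostgreSQL-style.

```sql
-- Hypothetical two-column data, for illustration only
WITH t (v, w) AS (
  VALUES ('a', 1), ('a', 1), ('a', 2), ('b', 1), ('b', 1)
)
SELECT DISTINCT
  v,
  w,
  -- Orders by v only: (a, 1) and (a, 2) tie and share the same number
  DENSE_RANK() OVER (ORDER BY v)    AS rank_by_v,
  -- Orders by all DISTINCT columns: each distinct (v, w) pair gets its own number
  DENSE_RANK() OVER (ORDER BY v, w) AS rank_by_v_w
FROM t
ORDER BY v, w;

-- Result:
-- a | 1 | 1 | 1
-- a | 2 | 1 | 2
-- b | 1 | 2 | 3
```

Only rank_by_v_w behaves like a row number over the distinct rows. rank_by_v repeats the value 1 for two different output rows.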
Comparing all three functions
Using PostgreSQL / Sybase / SQL standard syntax (WINDOW clause):
SELECT
  v,
  ROW_NUMBER() OVER (w) row_number,
  RANK()       OVER (w) rank,
  DENSE_RANK() OVER (w) dense_rank
FROM t
WINDOW w AS (ORDER BY v)
ORDER BY v
...you will get:
+---+------------+------+------------+
| V | ROW_NUMBER | RANK | DENSE_RANK |
+---+------------+------+------------+
| a |          1 |    1 |          1 |
| a |          2 |    1 |          1 |
| a |          3 |    1 |          1 |
| b |          4 |    4 |          2 |
| c |          5 |    5 |          3 |
| c |          6 |    5 |          3 |
| d |          7 |    7 |          4 |
| e |          8 |    8 |          5 |
+---+------------+------+------------+
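One way to read the RANK and DENSE_RANK columns is as counts: RANK() is one plus the number of rows that sort strictly before the current row, while DENSE_RANK() is one plus the number of distinct values that sort strictly before it. The sketch below checks this with correlated subqueries, in PostgreSQL-style syntax. The inline data is an assumption mirroring the sample values above, and all aliases are illustrative.

```sql
-- Assumed sample data: a stand-in for table t with a single column v
WITH t (v) AS (
  VALUES ('a'), ('a'), ('a'), ('b'), ('c'), ('c'), ('d'), ('e')
)
SELECT DISTINCT
  v,
  RANK() OVER (ORDER BY v)                                     AS rnk,
  -- 1 + number of rows strictly before the current value
  1 + (SELECT COUNT(*) FROM t AS x WHERE x.v < t.v)            AS rank_by_count,
  DENSE_RANK() OVER (ORDER BY v)                               AS dense_rnk,
  -- 1 + number of distinct values strictly before the current value
  1 + (SELECT COUNT(DISTINCT x.v) FROM t AS x WHERE x.v < t.v) AS dense_rank_by_count
FROM t
ORDER BY v;

-- Result:
-- a | 1 | 1 | 1 | 1
-- b | 4 | 4 | 2 | 2
-- c | 5 | 5 | 3 | 3
-- d | 7 | 7 | 4 | 4
-- e | 8 | 8 | 5 | 5
```

ROW_NUMBER() has no such counting equivalent on this data, because rows with equal v are numbered in an arbitrary order among themselves.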