【Question title】: SQL query: DISTINCT with ROW_NUMBER
【Posted】: 2013-08-09 21:02:51
【Question】:

I am struggling with the DISTINCT keyword in SQL. I just want to display row numbers for all the unique (distinct) values in a column, so I tried:

SELECT DISTINCT id, ROW_NUMBER() OVER (ORDER BY id) AS RowNum
FROM table
WHERE fid = 64

However, the following code does give me the distinct values:

SELECT distinct id FROM table WHERE fid = 64

But when I try it together with ROW_NUMBER, it does not work.

【Comments】:

    Tags: sql distinct


    【Solution 1】:

    Try this:

    SELECT distinct id
    FROM  (SELECT id, ROW_NUMBER() OVER (ORDER BY  id) AS RowNum
          FROM table
          WHERE fid = 64) t
    

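    Alternatively, use RANK() instead of ROW_NUMBER() and select the DISTINCT combinations of id and rank. Another option is to number the rows within each id and keep only the first row of each partition: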

    SELECT id
    FROM  (SELECT id, ROW_NUMBER() OVER (PARTITION BY  id ORDER BY  id) AS RowNum
          FROM table
          WHERE fid = 64) t
    WHERE t.RowNum=1
    

    This also returns the distinct ids.
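
    The "first row per partition" idea can be tried out with a minimal sketch like the one below. It assumes Python's built-in sqlite3 module linked against SQLite 3.25 or later (window function support); the table name items and the sample rows are hypothetical stand-ins, since table is a reserved word and the question shows no data.

```python
import sqlite3

# Hypothetical sample data; "items" stands in for the question's placeholder table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, fid INTEGER)")
conn.executemany(
    "INSERT INTO items (id, fid) VALUES (?, ?)",
    [(1, 64), (1, 64), (2, 64), (3, 64), (3, 64), (4, 10)],
)

# Number the rows inside each id partition, then keep only row 1 of each partition.
first_per_id = conn.execute(
    """
    SELECT id
    FROM (SELECT id,
                 ROW_NUMBER() OVER (PARTITION BY id ORDER BY id) AS RowNum
          FROM items
          WHERE fid = 64) t
    WHERE t.RowNum = 1
    ORDER BY id
    """
).fetchall()

print(first_per_id)  # [(1,), (2,), (3,)]
```

    Note that this returns each id once but does not by itself produce a running number across the distinct ids; for that, the numbering has to be applied after the duplicates are gone.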

    【Comments】:

      【Solution 2】:

      Use this:

      SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS RowNum FROM
          (SELECT DISTINCT id FROM table WHERE fid = 64) Base
      

      That is, feed the "output" of one query in as the "input" of another.

      Using a CTE:

      ; WITH Base AS (
          SELECT DISTINCT id FROM table WHERE fid = 64
      )
      
      SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS RowNum FROM Base
      

      The two queries should be equivalent.
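
      As a quick check of that equivalence, here is a small sketch that runs both forms against the same data. It assumes Python's sqlite3 module with SQLite 3.25 or later; the table name items and its rows are made up for illustration, and an explicit ORDER BY is added so the two result lists can be compared directly.

```python
import sqlite3

# Hypothetical sample data; "items" stands in for the question's placeholder table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, fid INTEGER)")
conn.executemany(
    "INSERT INTO items (id, fid) VALUES (?, ?)",
    [(1, 64), (1, 64), (2, 64), (3, 64), (3, 64), (4, 10)],
)

# Form 1: deduplicate in a derived table, then number the remaining rows.
derived_table_sql = """
SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS RowNum
FROM (SELECT DISTINCT id FROM items WHERE fid = 64) Base
ORDER BY id
"""

# Form 2: the same thing written with a common table expression.
cte_sql = """
WITH Base AS (
    SELECT DISTINCT id FROM items WHERE fid = 64
)
SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS RowNum
FROM Base
ORDER BY id
"""

derived_rows = conn.execute(derived_table_sql).fetchall()
cte_rows = conn.execute(cte_sql).fetchall()

print(derived_rows)              # [(1, 1), (2, 2), (3, 3)]
print(derived_rows == cte_rows)  # True
```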

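      Technically you could also write the following (note that ROW_NUMBER() restarts at 1 for every id here, so duplicate ids are numbered 1, 2, 3, ... and DISTINCT leaves them in place):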

      SELECT DISTINCT id, ROW_NUMBER() OVER (PARTITION BY id ORDER BY id) AS RowNum 
          FROM table
          WHERE fid = 64
      

      But if you increase the number of DISTINCT fields, you have to put all of those fields in the PARTITION BY, for example:

      SELECT DISTINCT id, description,
          ROW_NUMBER() OVER (PARTITION BY id, description ORDER BY id) AS RowNum 
          FROM table
          WHERE fid = 64
      

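      I do hope you realize that you are going against standard naming conventions here: id should probably be a primary key, and therefore unique by definition, so a DISTINCT on it would be useless unless you combine the query with some JOINs/UNION ALL...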

      【Comments】:

        【Solution 3】:

        Try this:

        ;WITH CTE AS (
                       SELECT DISTINCT id FROM table WHERE fid = 64
                     )
        -- The CTE exposes only id and is already filtered on fid, so no outer WHERE is needed.
        SELECT id, ROW_NUMBER() OVER (ORDER BY  id) AS RowNum
          FROM CTE
        

        【Comments】:

          【Solution 4】:

          How about something like this:

          ;WITH DistinctVals AS (
                  SELECT  distinct id 
                  FROM    table 
                  where   fid = 64
              )
          SELECT  id,
                  ROW_NUMBER() OVER (ORDER BY  id) AS RowNum
          FROM    DistinctVals
          

          SQL Fiddle DEMO

          You could also try:

          SELECT distinct id, DENSE_RANK() OVER (ORDER BY  id) AS RowNum
          FROM @mytable
          where fid = 64
          

          SQL Fiddle DEMO

          【Comments】:

            【Solution 5】:

            This can be done very simply; you were already pretty close:

            SELECT distinct id, DENSE_RANK() OVER (ORDER BY  id) AS RowNum
            FROM table
            WHERE fid = 64
            

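            【Comments】:

            • This is much better than the selected answer.
            【Solution 6】:

            This article covers an interesting relationship between ROW_NUMBER() and DENSE_RANK() (the RANK() function is not treated specifically). When you need a generated ROW_NUMBER() on a SELECT DISTINCT statement, ROW_NUMBER() produces a distinct value for every row before the DISTINCT keyword gets a chance to remove duplicates, so nothing is removed. For example, this query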

            SELECT DISTINCT
              v, 
              ROW_NUMBER() OVER (ORDER BY v) row_number
            FROM t
            ORDER BY v, row_number
            

            ... might produce this result (DISTINCT has no effect):

            +---+------------+
            | V | ROW_NUMBER |
            +---+------------+
            | a |          1 |
            | a |          2 |
            | a |          3 |
            | b |          4 |
            | c |          5 |
            | c |          6 |
            | d |          7 |
            | e |          8 |
            +---+------------+
            

            Whereas this query:

            SELECT DISTINCT
              v, 
              DENSE_RANK() OVER (ORDER BY v) row_number
            FROM t
            ORDER BY v, row_number
            

            ... produces what you probably want in this case:

            +---+------------+
            | V | ROW_NUMBER |
            +---+------------+
            | a |          1 |
            | b |          2 |
            | c |          3 |
            | d |          4 |
            | e |          5 |
            +---+------------+
            

            Note that the ORDER BY clause of the DENSE_RANK() function needs to list all the other columns from the SELECT DISTINCT clause for this to work properly.
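
            The sketch below illustrates why the ORDER BY has to cover every DISTINCT column. It assumes Python's sqlite3 module with SQLite 3.25 or later; the second column w and the sample rows are hypothetical additions to the table t used above.

```python
import sqlite3

# Hypothetical two-column version of table t; column w is not in the original example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (v TEXT, w TEXT)")
conn.executemany(
    "INSERT INTO t (v, w) VALUES (?, ?)",
    [("a", "x"), ("a", "x"), ("a", "y"), ("b", "x")],
)

# Ranking by v alone: the distinct rows (a, x) and (a, y) share rank 1.
rank_by_v = conn.execute(
    """
    SELECT DISTINCT v, w, DENSE_RANK() OVER (ORDER BY v) AS rn
    FROM t
    ORDER BY v, w
    """
).fetchall()

# Ranking by every DISTINCT column: each distinct row gets its own number.
rank_by_v_w = conn.execute(
    """
    SELECT DISTINCT v, w, DENSE_RANK() OVER (ORDER BY v, w) AS rn
    FROM t
    ORDER BY v, w
    """
).fetchall()

print(rank_by_v)    # [('a', 'x', 1), ('a', 'y', 1), ('b', 'x', 2)]
print(rank_by_v_w)  # [('a', 'x', 1), ('a', 'y', 2), ('b', 'x', 3)]
```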

            All three functions in comparison

            Using PostgreSQL / Sybase / SQL standard syntax (WINDOW clause):

            SELECT
              v,
              ROW_NUMBER() OVER (window) row_number,
              RANK()       OVER (window) rank,
              DENSE_RANK() OVER (window) dense_rank
            FROM t
            WINDOW window AS (ORDER BY v)
            ORDER BY v
            

            ... you'll get:

            +---+------------+------+------------+
            | V | ROW_NUMBER | RANK | DENSE_RANK |
            +---+------------+------+------------+
            | a |          1 |    1 |          1 |
            | a |          2 |    1 |          1 |
            | a |          3 |    1 |          1 |
            | b |          4 |    4 |          2 |
            | c |          5 |    5 |          3 |
            | c |          6 |    5 |          3 |
            | d |          7 |    7 |          4 |
            | e |          8 |    8 |          5 |
            +---+------------+------+------------+
            

            【Comments】:

              【Solution 7】:

              Using DISTINCT causes problems as you add fields, and it can also mask problems in your select. Use GROUP BY as an alternative, like this:

              SELECT id
                    ,ROW_NUMBER() OVER (ORDER BY  id) AS RowNum
                FROM table
               where fid = 64
               group by id
              

              You can then add other interesting information to your select list, like this:

              ,count(*) as thecount
              

              ,max(description) as description
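
              Put together, the GROUP BY query with both extra columns might look like the sketch below. It assumes Python's sqlite3 module with SQLite 3.25 or later; the table name items, the description values, and the row set are invented for illustration.

```python
import sqlite3

# Hypothetical sample data with a description column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, fid INTEGER, description TEXT)")
conn.executemany(
    "INSERT INTO items (id, fid, description) VALUES (?, ?, ?)",
    [
        (1, 64, "apple"),
        (1, 64, "avocado"),
        (2, 64, "banana"),
        (3, 64, "cherry"),
        (3, 64, "cherry"),
        (4, 10, "date"),
    ],
)

# GROUP BY collapses the duplicate ids first; ROW_NUMBER() then numbers the groups,
# and the aggregates summarize the rows that were folded into each group.
grouped = conn.execute(
    """
    SELECT id
          ,ROW_NUMBER() OVER (ORDER BY id) AS RowNum
          ,count(*) AS thecount
          ,max(description) AS description
      FROM items
     WHERE fid = 64
     GROUP BY id
     ORDER BY id
    """
).fetchall()

print(grouped)  # [(1, 1, 2, 'avocado'), (2, 2, 1, 'banana'), (3, 3, 2, 'cherry')]
```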
              

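              【Comments】:

              • Upvoted for using group by. But I don't think partition by is needed here.
              • @P5Coder, you are of course right. I've fixed it. I don't know what I was thinking when I put it in there.
              【Solution 8】:

              The question is quite old and my answer may not add much, but here are my two cents on making the query a little more useful: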

              ;WITH DistinctRecords AS (
                  SELECT  DISTINCT [col1,col2,col3,..] 
                  FROM    tableName 
                  where   [my condition]
              ), 
              serialize AS (
                 SELECT
                  ROW_NUMBER() OVER (PARTITION BY [colNameAsNeeded] ORDER BY  [colNameNeeded]) AS Sr,*
                  FROM    DistinctRecords 
              )
              SELECT * FROM serialize 
              

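              The benefit of using two CTEs is that you can now use the serialized records much more easily in the rest of your query, and do count(*) and similar operations very easily.

              DistinctRecords selects all the distinct records, and serialize applies serial numbers to those distinct records. After that you can use the final serialized result for your purposes without clutter.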

              PARTITION BY is probably not needed in most cases.
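
              To make the template concrete, here is a sketch of the two-CTE pattern with the placeholders filled in. It assumes Python's sqlite3 module with SQLite 3.25 or later; the orders table, its customer and product columns, and the SrPerCustomer alias are hypothetical. It shows the numbering both without and with PARTITION BY so the difference is visible.

```python
import sqlite3

# Hypothetical table and rows standing in for tableName and [col1,col2,...].
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, product TEXT)")
conn.executemany(
    "INSERT INTO orders (customer, product) VALUES (?, ?)",
    [("ann", "pen"), ("ann", "pen"), ("ann", "ink"), ("bob", "pen"), ("bob", "pen")],
)

serialized = conn.execute(
    """
    WITH DistinctRecords AS (
        SELECT DISTINCT customer, product
        FROM   orders
    ),
    serialize AS (
        SELECT
            -- One running number over all distinct records.
            ROW_NUMBER() OVER (ORDER BY customer, product) AS Sr,
            -- With PARTITION BY, the numbering restarts for each customer.
            ROW_NUMBER() OVER (PARTITION BY customer ORDER BY product) AS SrPerCustomer,
            *
        FROM DistinctRecords
    )
    SELECT * FROM serialize ORDER BY Sr
    """
).fetchall()

print(serialized)
# [(1, 1, 'ann', 'ink'), (2, 2, 'ann', 'pen'), (3, 1, 'bob', 'pen')]
```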

              【Comments】:
