SQL Server: How do I return all columns where one is not repeated?

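【Question title】: SQL Server: How do I return all columns where one is not repeated?
【Posted】: 2012-05-13 12:34:27
【Question】:

I have a table with duplicate IDs, which I will fix later. Basically, I want to return all rows where the ID is distinct, but I want the entire row. Something like: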

select * from table group by ID

select * from table where (ID is not repeated)

In this case the duplicate rows are identical, so I don't care whether I get the First or Last, Min or Max.

Note that I don't want to do this:

select MIN(col1), MIN(col2), ... from table group by ID

I want a way to get this result without enumerating every column.

Edit: I am using SQL Server 2008 R2.

【Comments】:

Tags: sql sql-server tsql distinct group-concat

【Solution 1】:

If you are using MySQL, do this:

select 
    *
from tbl
group by ID

MySQL live test: http://www.sqlfiddle.com/#!2/8c7fd/2

If you are using PostgreSQL, do this:

select distinct on(id)
    *
from tbl
order by id

If you want PostgreSQL's DISTINCT ON to be at least as predictable as the CTE windowing approach, sort on another column as well:

select distinct on(id)
    *
from tbl
order by id
   , someColumnHere -- Choose ASC for first row, DESC for last row

PostgreSQL live test: http://www.sqlfiddle.com/#!1/8c7fd/1

If you are using a database that supports CTEs and window functions (e.g. PostgreSQL, Oracle, SQL Server), use this:

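with ranked as
(
  select 
      -- row_number() rather than rank(): rank() gives tied rows the same
      -- number, so identical duplicates would all come back with rn = 1
      row_number() over(partition by id order by someColumnHere) rn,
      tbl.*
  from tbl
)
select * from ranked where rn = 1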

Live tests on databases that support CTEs and window functions:

PostgreSQL: http://www.sqlfiddle.com/#!1/8c7fd/2

Oracle: http://www.sqlfiddle.com/#!4/b5cf9/1

SQL Server: http://www.sqlfiddle.com/#!3/8c7fd/3
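
One caveat when choosing the ranking function: if the rows within an id tie on the ORDER BY column, which is exactly the case for fully identical duplicates, RANK() gives every tied row the same rank, while ROW_NUMBER() still numbers them 1, 2, 3. The sketch below illustrates the difference in T-SQL; the table variable @t and its data are hypothetical and not part of the original answer.

```sql
-- Hypothetical sample data: id 'A' appears twice with identical values.
DECLARE @t TABLE (id varchar(5), data int);
INSERT INTO @t VALUES ('A', 1), ('A', 1), ('B', 2);

WITH numbered AS
(
    SELECT
        id,
        data,
        ROW_NUMBER() OVER (PARTITION BY id ORDER BY data) AS rn,  -- 1, 2 for 'A'
        RANK()       OVER (PARTITION BY id ORDER BY data) AS rk   -- 1, 1 for 'A'
    FROM @t
)
SELECT id, data, rn, rk
FROM numbered
ORDER BY id, rn;
-- Filtering on rn = 1 keeps one row per id (2 rows here);
-- filtering on rk = 1 keeps both tied 'A' rows (3 rows here).
```

So for the "one row per ID, any row will do" requirement in the question, filtering on a ROW_NUMBER() column is the safer choice.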

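【Comments】:

Unless Id is the only column, this won't work without supplying an aggregate function for every column.
Who says aggregate functions are required in MySQL if you want to do a GROUP BY? See the results here: sqlfiddle.com/#!2/8c7fd/1
The OP didn't specify MySQL, and in other databases they are required. I appreciate the lesson, but I was only trying to help.
No worries, mate ;-) Indeed, MySQL violates many database fundamentals, which is why you thought my first query was incorrect. And yes, it is incorrect on almost every database platform; it is, however, correct in MySQL. See an example of what a good GROUP BY should look like and strive for that: ienablemuch.com/2010/08/postgresql-recognizing-functional.html
I'm using SQL Server, so that's not an option. Good to know I can do this in MySQL, thanks.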

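【Solution 2】:

Since you didn't state in your question which database you are using, I suggest a query that works on all database platforms. However, this query requires you to create a new column with an auto_number, identity, serial, or similar property.

This would be the query: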

select * from tbl 
where (id,auto_number_here) in
   (select id, min(auto_number_here) 
    from tbl 
    group by id)

This works on many platforms, but not on SQL Server, which does not support tuples. There you have to do this instead:

select * from tbl x
where 
   -- leave this comment, so it mimics the tuple capability
   -- (id,auto_number_here) in
   EXISTS
   (select
       -- Replace this:  
       -- id, min(auto_number_here) 

       -- With whatever floats your boat, 
       -- you can use 1, null(the value generated by Entity Framework's EXIST clause), 
       -- even 1/0 is ok :-) (this will not result to divide-by-zero error)

       -- But I prefer retaining the columns, so it mimics the tuple-capable database:
       id, min(auto_number_here) 

    from tbl 
    where id = x.id 
    group by id
    having x.auto_number_here = min(auto_number_here))
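
The remark in the comments above, that the select list inside EXISTS does not matter, can be checked with a tiny sketch. It assumes SQL Server's behaviour of not evaluating the select list of an EXISTS subquery; other databases may behave differently.

```sql
-- On its own, SELECT 1/0 raises a divide-by-zero error.
-- Inside EXISTS only the existence of a row matters, so on SQL Server
-- this is expected to return one row containing 'ok' instead of failing.
SELECT 'ok' AS result
WHERE EXISTS (SELECT 1/0);
```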

Related question on tuples: using tuples in sql in clause

Since some databases do not support tuples, you can simulate them with a join instead:

select z.* from tbl z
join (select id, min(auto_number_here) as first_row from tbl group by id) as x
on z.id = x.id and z.auto_number_here = x.first_row

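This is a bit better than the EXISTS approach. But if your database supports tuples, use them instead; try to use JOIN only to reflect table relationships, and use the WHERE clause for filtering.

UPDATE

Perhaps a concrete example explains it more clearly. Suppose we have an existing table on which we forgot to put a primary key: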

create table tbl(
  id varchar(5), -- supposedly primary key 
  data int,
  message varchar(100) 
);


insert into tbl values
('A',1,'the'),
('A',1,'quick'),
('A',4,'brown'),
('B',2, 'fox'),
('B',5, 'jumps'),
('B',5, 'over'),
('C',6, 'the'),
('C',7, 'lazy');

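To extract just one row from each group of duplicates, we need to add another column to the existing data.

It will help us pick one and only one row from the duplicates: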

alter table tbl add auto_number_here int identity(1,1) not null;

Now this should work:

select z.* from tbl z
join (select id, min(auto_number_here) as first_row from tbl group by id) as x
on z.id = x.id and z.auto_number_here = x.first_row

Live test: http://www.sqlfiddle.com/#!6/19b55/3

As does this:

select * from tbl x
where 
   -- leave this comment, so it mimics the tuple capability
   -- (id,auto_number_here) in
   EXISTS
   (
     select
       -- Replace this:  
       -- id, min(auto_number_here) 

       -- With whatever floats your boat, 
       -- you can use 1, null(the value generated by Entity Framework's EXIST clause), 
       -- even 1/0 is ok :-) (this will not result to divide-by-zero error)

       -- But I prefer retaining the columns, so it mimics the tuple-capable database:
       id, min(auto_number_here) 

    from tbl 
    where id = x.id 
    group by id
    having x.auto_number_here = min(auto_number_here)

   )

Live test: http://www.sqlfiddle.com/#!6/19b55/4
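
On SQL Server 2008 R2 specifically, the same one-row-per-id result can also be sketched without altering the table, by numbering the rows on the fly. This is only an illustration under assumptions: it reuses the tbl example above, and the choice of `data, message` as the tie-breaking order is arbitrary (any columns that make the order deterministic would do).

```sql
-- Assumes the tbl example above (id, data, message). Adding
-- auto_number_here beforehand is not required for this variant.
WITH numbered AS
(
    SELECT
        tbl.*,
        -- Number the rows inside each id; lowest data, then message, comes first.
        ROW_NUMBER() OVER (PARTITION BY id ORDER BY data, message) AS rn
    FROM tbl
)
SELECT *
FROM numbered
WHERE rn = 1;   -- exactly one row per id
```

With the sample data above this keeps ('A', 1, 'quick'), ('B', 2, 'fox') and ('C', 6, 'the'), each with an extra rn column equal to 1.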

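【Comments】:

What is auto_number_here? I'm not sure I understand that part.
An auto-numbering column, just like SQL Server's identity.
942 rows, the same as when I use select *. There are only 880 unique IDs.
Did you create a unique number for every row? Each row should have its own unique number. You can use identity(1,1) for that column; then the query above will work. In the MIN(auto_number_here) expression, use a column without duplicate values for auto_number_here (ideally one generated by identity (SQL Server), auto_increment (MySQL), or serial (PostgreSQL)).
Don't use a column with duplicate values in the MIN aggregate.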

【Solution 3】:

Try this. It may work, since all the columns in the duplicate rows are identical.

select distinct *
from table

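【Comments】:

This sort of worked. There are 880 results in the table and 942 in the view. I only want the 880 (the view does some joins that add unwanted rows, and I don't have time to rewrite them right now). This command returned 899, adding 19 results, so apparently they are not completely identical, even though the ID column is.
In that case you will have to choose between the two rows by writing out Min(Col1), ... After all, SQL has no way of knowing which of two slightly different rows to return.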

【Solution 4】:

Use a subquery to get your unique IDs, then use it to filter the results:

SELECT *
FROM YourTable t
INNER JOIN (
  SELECT Id, COUNT(*) 'count'
  FROM YourTable
  GROUP BY Id
) sq ON sq.Id = t.Id
WHERE sq.count = 1

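【Comments】:

The second subquery is a bit overkill. You could just do SELECT sqq.Id, COUNT(1) 'count' FROM YourTable sqq GROUP BY sqq.Id
@diaho Good point, not sure what I was thinking. Thanks!
Both of these return 831 rows, where there are 880 unique rows. Essentially, I do want the ones with count > 1 as well, but only the first of each.
Suggesting COUNT(1) over COUNT(*) is Cargo Cult Programming. See this: ienablemuch.com/2010/04/… and this: ienablemuch.com/2010/05/why-is-exists-select-1-cargo-cult.html
This is a bit overkill; he could move the count filter from the WHERE clause into the subquery and put it in a HAVING clause: INNER JOIN (SELECT id FROM YourTable GROUP BY Id HAVING COUNT(*) = 1) sq ON sq.Id = t.Id. But again, this is not what the OP asked for; he doesn't want only the rows that have no duplicates.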
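
Written out in full, the HAVING variant suggested in the last comment might look like the sketch below. YourTable and Id are the placeholder names already used in this answer; the query is an illustration rather than a tested solution.

```sql
-- Keep only rows whose Id occurs exactly once in the table.
SELECT t.*
FROM YourTable t
INNER JOIN (
    SELECT Id
    FROM YourTable
    GROUP BY Id
    HAVING COUNT(*) = 1      -- filter the groups before joining
) sq ON sq.Id = t.Id;
```

As the comments point out, this drops every duplicated Id entirely (831 of the 880 IDs in the asker's case), so it does not return one row for each duplicated Id.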