SQL to remove duplicate records without a distinct

【Question title】: SQL to remove duplicate records without a distinct
【Posted】: 2017-05-22 17:09:20
【Question】:

This is for Oracle DB SQL (PL/SQL).

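I have a table with three columns (for the sake of argument). I need to remove the returned rows where columnA and columnB match another record in the table and columnC equals 'James'. But if columnC equals 'James' and columnA, columnB do not match any other row in the result set, the row should be kept.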

ColumnA ColumnB ColumnC
_______________________
45      blue    John   <-Keep
45      blue    James  <-Remove
32      Red     John   <-Keep
32      Red     James  <-Remove
12      Yellow  James  <-Keep

The result set would be:

 ColumnA ColumnB ColumnC
 _______________________
 45      blue    John
 32      Red     John
 12      Yellow  James

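Obviously the real data is more complex and has more columns. My background is on the C# side of the equation, not the Oracle DB side. I have tried a few temp tables, but I couldn't get anything close to working, because I need something that says "more than one row came back, and one of them is a James record." Thanks for your help.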

【Comments】:

What if there are multiple rows with 45, blue and some other names?
@GurwinderSingh For this example, I only want to remove the James record.

Tags: sql oracle

【Solution 1】:

Here is one way to do it, using a window function to get the number of matching records:

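SELECT
    columnA,
    columnB,
    columnC
FROM
    (
        SELECT
            columnA,
            columnB,
            columnC,
            -- number of rows sharing this (columnA, columnB) pair
            COUNT(*) OVER (PARTITION BY columnA, columnB) AS rcount
        FROM your_table  -- placeholder name; TABLE itself is a reserved word in Oracle
    ) sub
WHERE
    (sub.rcount = 2 AND columnC = 'John')  -- duplicated pair: keep the John row
    OR sub.rcount = 1;                     -- unique pair: keep the row as it is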

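【Discussion】:

This answer is too specific. It won't work with even slightly more data.
I was able to implement this for a demo, but other problems with doing it at this layer (DB vs. view) pushed me toward a different solution (i.e., this is just a small part of a large program that we don't want to retest). Still, it added another tool to my toolbox. Thanks.
@GurwinderSingh This answer is specific to the question. The biggest takeaway for the OP is that taking COUNT(*) over a partition lets them spot the duplicate records within that partition. What they filter on after finding them is up to them, for example WHERE sub.rcount > 1 AND columnC <> 'James'.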
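The idea in the last comment can be sketched as a filter that does not depend on the name 'John' or on a count of exactly 2. This is only an illustration under assumptions: `your_table` is a placeholder name, the inline rows are copied from the question, and the `rcount = 1` branch is added here so that the lone Yellow/James row survives (the filter quoted in the comment would drop it).

```sql
-- Sample rows from the question; your_table is a placeholder name.
WITH your_table (columnA, columnB, columnC) AS (
    SELECT 45, 'blue',   'John'  FROM dual UNION ALL
    SELECT 45, 'blue',   'James' FROM dual UNION ALL
    SELECT 32, 'Red',    'John'  FROM dual UNION ALL
    SELECT 32, 'Red',    'James' FROM dual UNION ALL
    SELECT 12, 'Yellow', 'James' FROM dual
)
SELECT columnA, columnB, columnC
FROM (
    SELECT t.*,
           -- how many rows share this (columnA, columnB) pair
           COUNT(*) OVER (PARTITION BY columnA, columnB) AS rcount
    FROM your_table t
) sub
WHERE sub.rcount = 1             -- pair occurs once: keep the row, even if it is James
   OR sub.columnC <> 'James';    -- pair is duplicated: keep only the non-James rows
```

On the sample rows this should return the John rows for 45/blue and 32/Red plus the James row for 12/Yellow, in no guaranteed order. One caveat of this variant: if a pair is shared only by several James rows, all of them are filtered out, which may or may not be what is wanted.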

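【Solution 2】:

You can use the analytic count function with a case expression to check whether a non-'James' record exists for a given combination of columnA and columnB: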

with your_table (ColumnA ,ColumnB ,ColumnC) as (
    select 45, 'blue'    ,'ABC'  from dual union all
    select 45, 'blue'    ,'Jimmy'  from dual union all
    select 45, 'blue'    ,'John'  from dual union all
    select 45, 'blue'    ,'James' from dual union all 
    select 32, 'Red'     ,'John'  from dual union all 
    select 32, 'Red'     ,'James' from dual union all 
    select 12, 'Yellow'  ,'James' from dual
    )
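-- Sample data ends. Solution starts below.
select ColumnA, ColumnB, ColumnC
from (
    select t.*,
        -- flag = 1 when the (columnA, columnB) group has at least one non-James row
        case when count(case when columnC <> 'James' then 1 end) over (
                    partition by columnA,
                    columnB
                    ) > 0 then 1 else 0 end as flag
    from your_table t
    )
where flag = 0               -- group contains only James rows: keep them
    or columnC <> 'James';   -- otherwise keep just the non-James rows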

Output:

COLUMNA COLUMNB COLUMNC
12      Yellow  James
32      Red     John
45      blue    John
45      blue    Jimmy
45      blue    ABC
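For comparison, the same rule (keep a James row only when no non-James row shares its columnA/columnB pair) can also be written with a correlated NOT EXISTS instead of an analytic count. This is a different technique from the one in the answer, shown only as a sketch; `your_table` and its rows are the same sample data as above, and the aliases `t` and `o` are arbitrary.

```sql
-- Same sample rows as in the answer; your_table is a placeholder name.
WITH your_table (columnA, columnB, columnC) AS (
    SELECT 45, 'blue',   'ABC'   FROM dual UNION ALL
    SELECT 45, 'blue',   'Jimmy' FROM dual UNION ALL
    SELECT 45, 'blue',   'John'  FROM dual UNION ALL
    SELECT 45, 'blue',   'James' FROM dual UNION ALL
    SELECT 32, 'Red',    'John'  FROM dual UNION ALL
    SELECT 32, 'Red',    'James' FROM dual UNION ALL
    SELECT 12, 'Yellow', 'James' FROM dual
)
SELECT t.columnA, t.columnB, t.columnC
FROM your_table t
WHERE t.columnC <> 'James'           -- non-James rows are always kept
   OR NOT EXISTS (                   -- a James row is kept only if it has no non-James sibling
        SELECT 1
        FROM your_table o
        WHERE o.columnA = t.columnA
          AND o.columnB = t.columnB
          AND o.columnC <> 'James'
      );
```

On this sample data it should return the same five rows as the output above. The two forms can differ when columnA or columnB is NULL: PARTITION BY puts NULLs in the same group, while the `=` comparisons in the subquery never match a NULL.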

【Discussion】: