Delete duplicate IP address entries in a MySQL database: answers

【Question title】: Delete duplicate ip address entry in mysql database
【Posted】: 2012-10-17 11:37:37
【Question description】:

I have duplicate IP address records in my database, like this:

id | ipaddress
1    192.168.xxx.xxx
2    192.168.xxx.xxx
3    111.118.xxx.xxx
4    111.118.xxx.xxx

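I want the IP addresses in my column to be unique. How should I delete all the duplicate entries?

Thanks

【Question discussion】:

What do you want to happen to the ids?
Is this a one-time fix or an ongoing thing?
After deleting the duplicates, put a unique constraint on that column.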
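
A minimal sketch of that last suggestion, run against an in-memory SQLite database rather than MySQL so it can be tried without a server. The table name `mytable` (borrowed from the answers in this thread), the index name `uq_mytable_ipaddress`, and both helper functions are assumptions made for illustration. `CREATE UNIQUE INDEX` is also accepted by MySQL, but this harness only exercises SQLite.

```python
import sqlite3


def make_deduplicated_db():
    """Build a sample table that has already had its duplicates removed."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, ipaddress TEXT)")
    conn.executemany(
        "INSERT INTO mytable (id, ipaddress) VALUES (?, ?)",
        [(1, "192.168.xxx.xxx"), (3, "111.118.xxx.xxx")],
    )
    return conn


def add_unique_constraint(conn):
    """Reject future duplicates at the database level (index name is hypothetical)."""
    conn.execute("CREATE UNIQUE INDEX uq_mytable_ipaddress ON mytable (ipaddress)")


def try_insert(conn, ip):
    """Return True if the row was inserted, False if the unique index refused it."""
    try:
        conn.execute("INSERT INTO mytable (ipaddress) VALUES (?)", (ip,))
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    conn = make_deduplicated_db()
    add_unique_constraint(conn)
    print(try_insert(conn, "192.168.xxx.xxx"))  # address already present
    print(try_insert(conn, "10.0.xxx.xxx"))     # new address
```

The index can only be created once the duplicates are gone, which is why the cleanup has to come first.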

Tags: mysql sql ip-address

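【Solution 1】:

Deleting duplicates is a bit tricky in MySQL because of the silly restriction that the table being deleted from cannot be referenced in a sub-select. The sub-select therefore has to be rewritten as a join: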

DELETE d
FROM mytable d
LEFT JOIN (
   SELECT min(id) as min_id
   FROM   mytable
   GROUP BY trim(ipaddress)
) tokeep ON tokeep.min_id = d.id
WHERE tokeep.min_id IS NULL;

SQLFiddle demo: http://sqlfiddle.com/#!2/9cfb9c/1
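
The rule behind that statement (keep the row with the smallest id for each trimmed address, delete the rest) can be checked locally with a small sketch. It uses an in-memory SQLite database as a stand-in, which is an assumption: SQLite has no MySQL-style `DELETE d FROM ... JOIN`, so the same anti-join is written here as `NOT IN`, and the function names are made up for illustration.

```python
import sqlite3


def make_sample_db():
    """Sample rows from this thread; row 2 carries a trailing space."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, ipaddress TEXT)")
    conn.executemany(
        "INSERT INTO mytable (id, ipaddress) VALUES (?, ?)",
        [
            (1, "192.168.xxx.xxx"),
            (2, "192.168.xxx.xxx "),
            (3, "111.118.xxx.xxx"),
            (4, "111.118.xxx.xxx"),
        ],
    )
    return conn


def dedupe_keep_lowest_id(conn):
    """Delete every row whose id is not the smallest id for its trimmed address."""
    conn.execute(
        """
        DELETE FROM mytable
        WHERE id NOT IN (
            SELECT MIN(id) FROM mytable GROUP BY TRIM(ipaddress)
        )
        """
    )
    return conn.execute("SELECT id, ipaddress FROM mytable ORDER BY id").fetchall()


if __name__ == "__main__":
    print(dedupe_keep_lowest_id(make_sample_db()))
```

On the sample data only ids 1 and 3 remain. Note that the `NOT IN` form shown here is the kind of statement MySQL rejects, which is exactly why the answer uses a join instead.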

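Edit

There is actually a way around the silly sub-select restriction. If the table is wrapped in a derived table inside the sub-select, the MySQL parser does not notice and happily runs the delete with the sub-select: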

delete mt 
from mytable mt
where exists (
    select * 
    from (
      select id, ipaddress
      from mytable
    ) ex
    where TRIM(ex.ipaddress) = TRIM(mt.ipaddress)
   and ex.id < mt.id
)
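
Before running a delete like this, it can help to list the rows its condition would match. The sketch below turns the same `EXISTS` test into a plain `SELECT` and runs it on an in-memory SQLite database; using SQLite instead of MySQL, and the names `PREVIEW_SQL` and `ids_that_would_be_deleted`, are assumptions for illustration.

```python
import sqlite3

# Same condition as the delete: a row is a duplicate if another row
# with the same trimmed address and a smaller id exists.
PREVIEW_SQL = """
SELECT mt.id
FROM mytable AS mt
WHERE EXISTS (
    SELECT 1
    FROM mytable AS ex
    WHERE TRIM(ex.ipaddress) = TRIM(mt.ipaddress)
      AND ex.id < mt.id
)
ORDER BY mt.id
"""


def make_sample_db():
    """Sample rows from this thread; row 2 carries a trailing space."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, ipaddress TEXT)")
    conn.executemany(
        "INSERT INTO mytable (id, ipaddress) VALUES (?, ?)",
        [
            (1, "192.168.xxx.xxx"),
            (2, "192.168.xxx.xxx "),
            (3, "111.118.xxx.xxx"),
            (4, "111.118.xxx.xxx"),
        ],
    )
    return conn


def ids_that_would_be_deleted(conn):
    """Return the ids the delete would remove, without changing the table."""
    return [row[0] for row in conn.execute(PREVIEW_SQL)]


if __name__ == "__main__":
    print(ids_that_would_be_deleted(make_sample_db()))
```

On the sample data this lists ids 2 and 4 and leaves all four rows in place.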

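【Discussion】:

Now that hidden self-join is an ugly syntactic construct.
@wildplasser: the self-join is not "hidden". It is the sub-select that has to be worked around. I agree: it is ugly (and a really stupid restriction). See my edit for a really ugly workaround.
The wrapped version is a neat way of playing hide-and-seek with the plan generator. Does it force the sub-select's result table to be "materialized"? Note: a grouped sub-select seems to work too.
@wildplasser: I don't know. And I'm not actually sure this is "supported". Maybe it does something completely different behind the scenes without telling you. It is MySQL, after all.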

【Solution 2】:

CREATE TABLE mytable
        (id SERIAL NOT NULL PRIMARY KEY
        , ipaddress varchar
        );
INSERT INTO mytable(id, ipaddress) VALUES
 (1, '192.168.xxx.xxx')
,(2, '192.168.xxx.xxx ')        --<< note trailing whitespace
,(3, '111.118.xxx.xxx')
,(4, '111.118.xxx.xxx')
        ;
SELECT * FROM mytable;

DELETE FROM mytable mt
WHERE EXISTS (
  SELECT * FROM mytable ex
  WHERE ex.ipaddress = mt.ipaddress
  AND ex.id < mt.id
  )
  ;
SELECT * FROM mytable;

DELETE FROM mytable mt
WHERE EXISTS (
  SELECT * FROM mytable ex
  WHERE TRIM(ex.ipaddress) = TRIM(mt.ipaddress)
  AND ex.id < mt.id
  )
  ;
SELECT * FROM mytable;

Output:

CREATE TABLE
INSERT 0 4
 id |    ipaddress     
----+------------------
  1 | 192.168.xxx.xxx
  2 | 192.168.xxx.xxx 
  3 | 111.118.xxx.xxx
  4 | 111.118.xxx.xxx
(4 rows)

DELETE 1
 id |    ipaddress     
----+------------------
  1 | 192.168.xxx.xxx
  2 | 192.168.xxx.xxx 
  3 | 111.118.xxx.xxx
(3 rows)

DELETE 1
 id |    ipaddress    
----+-----------------
  1 | 192.168.xxx.xxx
  3 | 111.118.xxx.xxx
(2 rows)

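Update: added test data and changed one record to have trailing whitespace.

Note: the names of string functions may differ between DBMS implementations. The TRIM() function works in postgres; maybe mysql has another name for the same thing.

UPDATE2: since mysql does not seem to allow self-joins in delete statements, one workaround is to use an auxiliary table holding the ids of the records you (don't) want to keep.

(@ahose_with_no_name's solution is shorter, but this one tries to stay close to plain SQL):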

CREATE table without_dups(id INTEGER NOT NULL);
INSERT INTO without_dups(id)
SELECT id
FROM mytable mt
WHERE NOT EXISTS (
  SELECT * FROM mytable ex
  WHERE ex.ipaddress = mt.ipaddress
  AND ex.id < mt.id
  )
  ;

DELETE FROM mytable mt
WHERE NOT EXISTS (
  SELECT * FROM without_dups nx
  WHERE nx.id = mt.id
  )
  ;

DROP TABLE without_dups;

SELECT * FROM mytable;
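
The auxiliary-table steps above can be sketched end to end as follows. The sketch runs on an in-memory SQLite database rather than MySQL, and it makes a few assumptions that are not in the answer: the helper is created as a `TEMP` table, the comparison uses `TRIM()` so the row with the trailing space counts as a duplicate, the final delete uses `NOT IN` instead of `NOT EXISTS`, and the function names are hypothetical.

```python
import sqlite3


def make_sample_db():
    """Sample rows from this thread; row 2 carries a trailing space."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, ipaddress TEXT)")
    conn.executemany(
        "INSERT INTO mytable (id, ipaddress) VALUES (?, ?)",
        [
            (1, "192.168.xxx.xxx"),
            (2, "192.168.xxx.xxx "),
            (3, "111.118.xxx.xxx"),
            (4, "111.118.xxx.xxx"),
        ],
    )
    return conn


def dedupe_with_helper_table(conn):
    """Collect the ids to keep in a helper table, then delete everything else."""
    conn.execute("CREATE TEMP TABLE without_dups (id INTEGER NOT NULL)")
    # Keep a row only if no row with the same trimmed address has a smaller id.
    conn.execute(
        """
        INSERT INTO without_dups (id)
        SELECT mt.id
        FROM mytable AS mt
        WHERE NOT EXISTS (
            SELECT 1
            FROM mytable AS ex
            WHERE TRIM(ex.ipaddress) = TRIM(mt.ipaddress)
              AND ex.id < mt.id
        )
        """
    )
    # The delete reads only the helper table, never mytable itself.
    cursor = conn.execute(
        "DELETE FROM mytable WHERE id NOT IN (SELECT id FROM without_dups)"
    )
    deleted = cursor.rowcount
    conn.execute("DROP TABLE without_dups")
    return deleted


if __name__ == "__main__":
    conn = make_sample_db()
    print(dedupe_with_helper_table(conn))
    print(conn.execute("SELECT id, ipaddress FROM mytable ORDER BY id").fetchall())
```

Because the delete's sub-select touches only `without_dups`, it avoids the pattern MySQL objects to, at the cost of an extra table and two extra statements.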

【Discussion】:

Maybe ipaddress is a varchar(xxx) field and some records have trailing spaces?
It shows the following error: MySQL said: #1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'mt WHERE EXISTS (SELECT * FROM mytable ex WHERE ex.ipaddress = mt.ipaddre' at line 1
MySQL does not allow the table you are deleting from to be used in a sub-select.
Sorry. In that case I'm afraid I can't help the OP. OTOH: maybe stuff it into a view? Or a CTE? ;-)
MySQL has no CTEs, let alone writable ones.

【Solution 3】:

Try this

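-- MySQL multi-table DELETE names the alias to delete from (DELETE * is not valid).
DELETE aa FROM MyTable AS aa INNER JOIN (
  -- Group by ipaddress only; grouping by id as well can never find duplicates.
  SELECT MIN(id) AS MID, ipaddress FROM MyTable
  GROUP BY ipaddress HAVING COUNT(*) > 1
) AS bb ON bb.ipaddress = aa.ipaddress
  AND bb.MID <> aa.id;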


【Discussion】: