Finding and Deleting Duplicate Records

2010-08-11 15:47
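During database maintenance you sometimes need to find duplicate records, and sometimes need to delete the surplus copies. The methods below cover the common cases; choose among them according to the situation.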

First, define a sample table: Table1 = ID + Column1 + Column2, where ID may be the primary key.
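
For concreteness, the sample table could be created roughly as follows. This is only a sketch: the data types and the sample rows are assumptions for illustration, since the original names only the columns. ID is deliberately left without a PRIMARY KEY constraint so that duplicate IDs are possible.

```sql
-- Hypothetical definition of the sample table; the data types are assumed.
CREATE TABLE dbo.Table1 (
    ID      INT         NOT NULL,  -- no PRIMARY KEY, so duplicate IDs can occur
    Column1 VARCHAR(50) NULL,
    Column2 VARCHAR(50) NULL
);

-- Illustrative rows: ID 1 is duplicated exactly; IDs 2 and 3 share column values.
INSERT INTO dbo.Table1 (ID, Column1, Column2) VALUES (1, 'a', 'x');
INSERT INTO dbo.Table1 (ID, Column1, Column2) VALUES (1, 'a', 'x');
INSERT INTO dbo.Table1 (ID, Column1, Column2) VALUES (2, 'b', 'y');
INSERT INTO dbo.Table1 (ID, Column1, Column2) VALUES (3, 'b', 'y');
```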

Different kinds of duplication call for different ways of finding and deleting records:

1. Identical ID

-- Find records with duplicate IDs
SELECT * FROM dbo.Table1 WHERE ID IN (SELECT ID FROM dbo.Table1 GROUP BY ID HAVING COUNT(*) > 1);
-- Delete surplus records with duplicate IDs; change the ORDER BY inside ROW_NUMBER() to control which record is kept
SELECT ROW_NUMBER() OVER(PARTITION BY ID ORDER BY ID) AS RowNumber, * INTO #TempTable
FROM dbo.Table1 WHERE ID IN (SELECT ID FROM dbo.Table1 GROUP BY ID HAVING COUNT(*) > 1);
DELETE FROM dbo.Table1 WHERE ID IN (SELECT ID FROM dbo.Table1 GROUP BY ID HAVING COUNT(*) > 1);
INSERT INTO dbo.Table1 (ID, Column1, Column2) SELECT ID, Column1, Column2 FROM #TempTable WHERE RowNumber = 1;
DROP TABLE #TempTable;
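
As an alternative to the temporary-table steps above, the same result can probably be reached in one statement by deleting through a common table expression. This is a sketch rather than the original author's method; it assumes SQL Server 2005 or later, and the CTE name Numbered is made up for illustration.

```sql
-- Number the rows within each ID, then delete every row after the first.
-- The statement before WITH must be terminated by a semicolon.
-- Change the ORDER BY to control which row per ID survives.
WITH Numbered AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) AS RowNumber
    FROM dbo.Table1
)
DELETE FROM Numbered WHERE RowNumber > 1;
```

Because the delete runs as a single statement, the rows to keep never leave the table, so there is no window in which they exist only in a temporary table.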

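2. Identical values in several columns

-- Find duplicates, returning only the duplicated values and how often each occurs
SELECT Column1, Column2, COUNT(*) AS DuplicateCount FROM dbo.Table1
GROUP BY Column1, Column2 HAVING COUNT(*) > 1;
-- Find duplicates, returning the full records; joining on both columns avoids the false
-- matches that comparing the concatenated string Column1 + Column2 can produce
-- (for example, 'ab' + 'c' equals 'a' + 'bc')
SELECT t.* FROM dbo.Table1 t
INNER JOIN (SELECT Column1, Column2 FROM dbo.Table1
GROUP BY Column1, Column2 HAVING COUNT(*) > 1) d
ON t.Column1 = d.Column1 AND t.Column2 = d.Column2;
-- Delete surplus duplicates, keeping only the record with the largest ID (assumes ID is unique)
DELETE t FROM dbo.Table1 t WHERE ID < (SELECT MAX(ID) FROM dbo.Table1
WHERE Column1 = t.Column1 AND Column2 = t.Column2);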
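
One limitation of comparing columns with = is that rows whose Column1 or Column2 is NULL never match each other, so such duplicates are left in place. A possible variant, sketched here under the assumption of SQL Server 2005 or later (the CTE name Ranked is hypothetical), uses PARTITION BY, which places NULL values in the same group.

```sql
-- Rank rows within each (Column1, Column2) group, largest ID first,
-- then delete everything except the top-ranked row of each group.
-- The statement before WITH must be terminated by a semicolon.
WITH Ranked AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY Column1, Column2 ORDER BY ID DESC) AS RowNumber
    FROM dbo.Table1
)
DELETE FROM Ranked WHERE RowNumber > 1;
```

Using ORDER BY ID ASC instead would keep the record with the smallest ID in each group.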

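3. Completely identical records

-- Finding the duplicates works the same way as above
-- Delete surplus duplicates by rebuilding the table; this changes the table structure,
-- because SELECT ... INTO does not carry over indexes, constraints, or triggers
SELECT DISTINCT * INTO #TempTable FROM dbo.Table1;
DROP TABLE dbo.Table1;
SELECT * INTO dbo.Table1 FROM #TempTable;
DROP TABLE #TempTable;
-- Delete surplus duplicates without changing the table structure
SELECT DISTINCT * INTO #TempTable FROM dbo.Table1
WHERE ID IN (SELECT ID FROM dbo.Table1 GROUP BY ID HAVING COUNT(*) > 1);
DELETE FROM dbo.Table1
WHERE ID IN (SELECT ID FROM dbo.Table1 GROUP BY ID HAVING COUNT(*) > 1);
INSERT INTO dbo.Table1 (ID, Column1, Column2) SELECT ID, Column1, Column2 FROM #TempTable;
DROP TABLE #TempTable;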
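
When the records are identical in every column, no column can decide which copy to keep, and it does not matter. A minimal sketch that avoids both the temporary table and the table rebuild, assuming SQL Server 2005 or later and the three-column sample table (the CTE name Copies is made up for illustration):

```sql
-- Partition by every column so that only fully identical rows share a group.
-- ORDER BY (SELECT NULL) is used because the copies are interchangeable.
-- The statement before WITH must be terminated by a semicolon.
WITH Copies AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY ID, Column1, Column2
                              ORDER BY (SELECT NULL)) AS RowNumber
    FROM dbo.Table1
)
DELETE FROM Copies WHERE RowNumber > 1;
```

For a table with more columns, each of them would need to be listed in the PARTITION BY clause.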