Removing duplicate rows in SQL Server?

350300 pts.
Tags:
SQL Server
SQL tables
I’m working with a large table (450,000+ rows). Many of the rows contain duplicate data and I’d like to be able to pull the dupes out. What is the best way to do this knowing that the rows are not fully duplicates due to the RowID identity field.

MyTable
-----------
RowID int not null identity(1,1) primary key,
Col1 varchar(20) not null,
Col2 varchar(2048) not null,
Col3 tinyint not null

Answer Wiki

Thanks. We'll let you know when a new response is added.

On version 2005 and above, something like this should do the job:

WITH cte
     AS (SELECT ROW_NUMBER() OVER (PARTITION BY Col1, Col2, Col3 
                                       ORDER BY RowID) rownum
         FROM   YourTable)
DELETE FROM cte
WHERE  rownum > 1;

On previous versions you could probably do it using OUTER JOINs, NOT INs or EXISTSs.

Discuss This Question:  

 
There was an error processing your information. Please try again later.
Thanks. We'll let you know when a new response is added.
Send me notifications when members answer or reply to this question.

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy

Forgot Password

No problem! Submit your e-mail address below. We'll send you an e-mail containing your password.

Your password has been sent to:

To follow this tag...

There was an error processing your information. Please try again later.

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy

Thanks! We'll email you when relevant content is added and updated.

Following