Search for duplicates in a collumn and delete the row on which it occurs

20 pts.
Microsoft Excel
Microsoft Excel formulas
microsoft excel functions
Microsoft Excel macros
Hi, I have a spreadsheet with list of identifiers in collumn 1, this is auto generated and due to the source info some of them duplicated, what I want to do is find the duplicates and remove the entire row they appear on. I know this is possible using clipboard and filters but I would rather avoid user error and use a macro. Any help will be appreciated Thanks

Answer Wiki

Thanks. We'll let you know when a new response is added.

Hi Harleq,

What I would probably do is use VBA to sort the rows, then start at cell (1,1). For each cell, look to see if the next row has the same value. If it does, delete the row (the current row or the next one, depending on how you want to do it). If it doesn’t, increment to the next row. (Don’t increment if you deleted a row; the deletion acts as an increment anyway).

If you don’t want the information re-sorted, I think your best bet is to increment over the rows, storing each row’s identifier cell’s value in an array. For each row, if the ID is already in that array, delete the row; if not, add the ID to the array and go to the next row. If you have a large data set you’re working on, you may want to keep the array sorted or hashed to make all those lookups quicker.

HTH, let me know if you need more specific help.

OR, Assume A = ID column Then -(1) Insert another column (B)
(2) Sort on column (A)
(3) In column row 2 – enter equation = A2-A1 and copy down
Any row with a zero in Column B is a duplicate.
Then, if a massive spreadsheet (1) Save as a .CSV file (2) Load the .CSV file (3) Sort on Column B and all the bad rows will sort to the top

Discuss This Question: 1  Reply

There was an error processing your information. Please try again later.
Thanks. We'll let you know when a new response is added.
Send me notifications when members answer or reply to this question.
  • YuvalShavit
    That's a great idea, Nozlop. If the ID column is not purely numeric (ie, if an ID could be something like "A81BL") then you can replace the =A2-A1 with something like =IF(A2=A1,1,0). That will enter a 1 if the ID column is equal to the one above it (ie, there's a duplicate), and 0 if not. Also, instead of saving as a separate file and sorting, you can select the whole of column B (where the new equation is) and copy. Then paste special and have it paste values (instead of "all"). Column B will now have the values it had before, but without the associated equations behind them. Now you can sort on column B and delete all the duplicates.
    905 pointsBadges:

Forgot Password

No problem! Submit your e-mail address below. We'll send you an e-mail containing your password.

Your password has been sent to:

To follow this tag...

There was an error processing your information. Please try again later.

Thanks! We'll email you when relevant content is added and updated.


Share this item with your network: