A question I am frequently asked is how to quickly identify errors or duplicates in Microsoft Excel data.
I will show you how this can be done very quickly and easily.
First, I assume that you don’t want to simply remove all duplicated data – if you do, you can use the “Remove Duplicates” tool in the “Data Tools” section of the “Data” tab along the Ribbon.
If you want to keep your data, and simply identify where the duplicates are, here are a couple of tricks:
Assume you have the following data:
- Order your data by the column that you wish to identify the duplicates in. (e.g. in the example above, sort by column A “Letter”
- After you have ordered your data, create a third column with a formula that checks if the cell one line above matches the cell one line below. There are many ways to do this, for example:
-
=A1=A2
– This will return TRUE if the data in cells A1 & A2 match, or it will return FALSE if they don’t=IF(A1=A2,"Duplicate","Unique")
(Same check as above but gives a user friendly message)- Alternatively if duplicated data is an error you could write:
=IF(A1=A2,"Error","OK")
=IF(A1=A2,1,0)
– This is my preferred method because you can select the entire column and see if the sum of the column is greater than 0. If it is, than you know you have duplicates/errors which you then can quickly sort for and find.
Questions? Other tricks/tips? Not what you were looking for? Let me know below in the comments and I’ll get back to you as soon as I can.
Thanks for reading!