Monday, March 10, 2014

Menu Monday - Remove Duplicates

Removing duplicates from a large list of data can be a daunting task, but now with the 'Remove Duplicates' function, it can be breeze.

To use the function, follow the steps below:

  • Highlight the cells where you want the duplicates removed
  • Click on the Data tab on the Ribbon (Menu)
  • Click on 'Remove Duplicates' in the Data Tools section
  • A Remove Duplicates window will pop-up
  • Make any adjustments as needed (see examples below)
  • Click OK
  • A window will pop-up to tell you how many duplicate values were removed and how many unique values remain
  • Click OK
  • Now you will see your list with only unique values
 
Example 1 - Single List

  • In this scenario, I have a single list of data containing first names.  I want only the unique values and want to remove any duplicates.


  • First, highlight cells A1:A10 (I could also just highlight all of column A by clicking on A if this is the only data in this column)


  • Click on the Data tab on the Ribbon (Menu)
  • Click on 'Remove Duplicates in the Data Tools section
  • A Remove Duplicates window will pop-up

  • In this case I only have 1 column of data, so I will leave the "First Name" column checked
  • I also had included the data header in the cells that I highlighted, so I will leave the "My data has headers" box checked
    • Note: If you do not have a header or did not include the header in your data, you will want to uncheck this box
  • Click OK
  • A window pops-up say that 3 duplicate values were found and removed and 6 unique values remain


  • Click OK
  • Now the list only contains 6 names instead of 9

  • Below is a visual of the data an what was removed
 

Example 2 - Multiple Columns of Data, Compare all columns
  • In this scenario, I have multiple columns of data containing names and addresses.  I want to remove the duplicates where the values in every column are duplicated.
 
  • First, highlight cells A1:E10 (I could also just highlight all of columns A-E if this is the only data in this column)
    • Note: Highlight all of the adjacent data so that it removes the entire row, even if you are not checking each column for duplicates

  • Click on the Data tab on the Ribbon (Menu)
  • Click on 'Remove Duplicates' in the Data Tools section
  • A Remove Duplicates window will pop-up
  • In this case I only want to remove the row if the data in each column is a duplicate, therefore I want to check all the boxes for each column (i.e. There are 3 Susie Smiths, but I only want to delete the data if the First Name, Last Name, Address, City and State are exactly the same.  So, in this scenario, only 1 of the Susie Smiths will be deleted as there are 2 with the same address and 1 with a different address.)
    • Note: See Example 3 if you don't want to compare all columns
  • I also had included the data header in the cells that I highlighted, so I will leave the "My data has headers" box checked
    • Note: If you do not have a header or did not include the header in your data, you will want to uncheck this box
  • Click OK
  • A window pops-up say that 2 duplicate values were found and removed and 7 unique values remain
 
  • Click OK
  • Now the list only contains 7 names instead of 9
    • Below is a visual of the data an what was removed


Example 3 - Multiple Columns of Data, Compare only First Name and Last Name

  • In this scenario, I have multiple columns of data containing names and addresses.  I want to remove the duplicates where the values in the First Name and Last Name columns are duplicated.
  • First, highlight cells A1:E10 (I could also just highlight all of columns A-E if this is the only data in this column)
    • Note: Highlight all of the adjacent data so that it removes the entire row, even if you are not checking each column for duplicates
  • Click on the Data tab on the Ribbon (Menu)
  • Click on 'Remove Duplicates' in the Data Tools section
 
  • A Remove Duplicates window will pop-up
  • In this scenario, I only want the boxes checked for First Name and Last Name as I am only comparing to see if there are duplicates in these columns (i.e. There are 3 Susie Smiths, so I want Excel to remove 2 of them even though 1 of the addresses is different)
  • I also had included the data header in the cells that I highlighted, so I will leave the "My data has headers" box checked
    • Note: If you do not have a header or did not include the header in your data, you will want to uncheck this box
  • Click OK
  • A window pops-up say that 3 duplicate values were found and removed and 6 unique values remain
  • Click OK
  • Now the list only contains 6 names instead of 9
 
  • Below is a visual of the data an what was removed
 
Excel ya later!

No comments:

Post a Comment