Introduction
We’ve all been in the situation where we’re looking at a Microsoft Excel document with thousands of rows that’s loaded with duplicate records we have to somehow remove. Thankfully, Excel makes it pretty easy to find and remove duplicate rows without having to do it line by line.
Some of the images associated with the steps are included inline below. All images associated with these steps can be seen in the embedded YouTube video. Also, please note that you’re not able to leave a comment directly on this article. If you have a question or feedback, please leave it on the YouTube video.
Steps to Remove Duplicate Rows in Excel
- Open the sheet in Excel you want to remove the duplicate rows from. In this example I’ve put together an inventory of my old video games in an attempt to catalog my collection. But we can see that I messed up and there are some duplicates I’ve highlighted for reference in this data set. And just so we have a point to validate against at the end, a quick count shows I’ve got 8 duplicate records.
- Highlight the table you want to remove duplicate rows from in Excel.
- Click “Data” in the menu along the top of the screen, and then click “Remove Duplicates” in the Data Tools section. The Remove Duplicates dialog box is displayed.
- If the selected data table you want to remove duplicates from has column headers, make sure to click to place a check mark next to “My Data Has Headers.”
- Next, click to place a check mark next to the columns you want to check for duplicates and then click “OK.”
When you do this, Excel will remove duplicates that match on just those columns, so you have to be careful. Just to show it can be dangerous to choose the wrong column headers, if I choose to remove duplicates based on the Years Released and Console columns, a lot of rows with unique game titles would be removed because Excel only kept a single record associated with each year and console pair.
So I’ll undo the removal of those rows to get my data back, and will navigate to the Remove Duplicates window. And this time I’ll choose all of the column headers and then click “OK” again.
And you can see this gives me the result I would expect, with eight duplicate records found and 17 left. And you can see in looking through the table that there are no duplicate records remaining. As a result, if you’re looking to remove duplicate rows that match on all data points, make sure you choose all of the column headers in the Remove Duplicates dialog box.