How to Find and Highlight Duplicates in Excel: A Step-by-Step Guide

Key takeaways:

  • To highlight duplicates in Excel, use Home > Conditional Formatting > Highlight Cells Rules > Duplicate Values for a fast one-range check.
  • Use COUNTIF when you need to highlight only the second and later duplicate values, or use custom rules for one or two columns.
  • Use Power Query when the duplicate check is large, repeatable, or needs cleanup before review.
  • Use RowSpeak when duplicates span multiple files, mixed .xlsx and .csv exports, messy values, or business rules that are hard to express in one Excel formula.

To highlight duplicates in Excel, select your range, go to Home > Conditional Formatting > Highlight Cells Rules > Duplicate Values, choose a format, and click OK.

That is the fastest answer when you need to highlight duplicate values in one column or one selected range. It is also non-destructive: Excel colors the duplicate cells so you can review them before deciding whether to keep, merge, or remove duplicates in Excel.

The harder cases start when your file is not clean. You may need to highlight duplicate rows, compare two columns, ignore case, trim spaces, compare an Excel workbook with a CSV export, or find the same customer across several files. In those cases, RowSpeak can help you upload the files, describe the duplicate rule in plain English, review the highlighted results, and export a new Excel file.

Quick Decision Tree

Your duplicate problem Best starting point
One column, exact same visible value Conditional Formatting
Keep the first record unmarked and flag later repeats COUNTIF
Duplicate means two or more columns match COUNTIFS
50,000+ rows, recurring imports, or heavy cleanup Power Query, RowSpeak
Excel says nothing is duplicated, but the values look the same RowSpeak
Multiple files, mixed .xlsx and .csv, fuzzy rules, or source-file tracking RowSpeak

Want to test the edge cases instead of using a clean toy sheet? Download the messy duplicate sample TSV. Excel can open it directly, and it includes trailing spaces, casing differences, phone-number punctuation, mixed date formats, leading zeros, and vendor suffix variations.

Have more than one file or a messy duplicate rule? Highlight duplicates with RowSpeak.

Quick Answer: How to Highlight Duplicates in Excel

If you only need the standard Excel workflow, do this:

  1. Select the cells you want to check.
  2. Go to Home > Conditional Formatting > Highlight Cells Rules > Duplicate Values.
  3. In the dialog box, choose Duplicate.
  4. Pick a fill or text color.
  5. Click OK.

If you prefer to see the basic menu path first, this short YouTube walkthrough shows the same Conditional Formatting workflow in Excel:

Excel will highlight duplicate cells in the selected range. This works well for simple checks such as duplicate emails, order IDs, invoice numbers, product names, or customer IDs in one sheet.

Method 1: Highlight Duplicate Values with Conditional Formatting

Conditional Formatting is the best starting point for most Excel users because it is built in, quick, and easy to reverse.

Select the range of cells you want to check.

Selecting a range before highlighting duplicates in Excel

Go to Home > Conditional Formatting > Highlight Cells Rules > Duplicate Values.

Opening the Duplicate Values rule in Excel Conditional Formatting

Choose how Excel should format the duplicate values. For example, you can use Green Fill with Dark Green Text.

Choosing a format for duplicate values in Excel

Click OK. Excel highlights the duplicate values in the range.

Duplicate values highlighted with Excel Conditional Formatting

Use this method when your rule is simple: the exact same value appears more than once in the selected range.

Use something else when your definition of duplicate is more specific. Conditional Formatting will not automatically understand that ACME Inc, Acme, Inc., and ACME Incorporated may be the same account. It also does not solve cross-file checks by itself.

Method 2: Highlight Only the Second and Later Duplicates with COUNTIF

Excel's built-in duplicate rule highlights all repeated values, including the first occurrence. If you want to keep the first record unmarked and highlight only the second and later occurrences, use a formula rule.

Select your range, then go to Home > Conditional Formatting > New Rule.

Creating a new conditional formatting rule in Excel

Choose Use a formula to determine which cells to format. If your data starts in A2, use:

=COUNTIF($A$2:$A2,$A2)>1

Then choose your format and click OK.

Using COUNTIF to highlight later duplicate occurrences in Excel

Excel now highlights only the repeated occurrences after the first value.

Duplicate values highlighted except the first occurrence in Excel

This is useful for lists where the first entry should remain the primary record, such as the first customer registration, first invoice number, or first lead record.

If your data is in a table or a different starting row, adjust the formula so the first reference locks the start of the range and the second reference expands as Excel evaluates each row.

Performance tip: avoid full-column formula rules on very large sheets unless you have to. On 50,000+ rows, expanding COUNTIF rules can make recalculation noticeably slower; on 100,000+ rows with multiple rules, Power Query or a review workflow in RowSpeak is usually easier to maintain. If you stay in Excel formulas, use a bounded range or an Excel Table instead of $A:$A.

Method 3: Highlight Duplicate Rows or Two-Column Duplicates

Many business duplicate checks are not about one cell. You may need to find rows where two or more fields match, such as:

  • First Name + Last Name
  • Customer ID + Order Date
  • Vendor Name + Invoice Number
  • Email + Phone
  • SKU + Warehouse

For two columns, create a formula-based conditional formatting rule. Suppose your data starts on row 2 and you want to highlight duplicate combinations of columns A and B. Select the rows or columns you want to format, then use:

=COUNTIFS($A:$A,$A2,$B:$B,$B2)>1

This tells Excel to highlight a row when the same pair of values appears more than once.

To highlight only the second and later duplicate row combinations, use an expanding range:

=COUNTIFS($A$2:$A2,$A2,$B$2:$B2,$B2)>1

This method is flexible, but the formula must match your real rule. If a duplicate customer can be matched by email first, then phone if email is missing, or by a normalized company name, the formula becomes harder to maintain. That is where a plain-language workflow can be faster.

Performance tip: COUNTIFS across full columns is convenient, but it is expensive when the file has many rows and several duplicate rules. For large order exports, AP ledgers, or SKU masters, limit the formula to the actual data range or move the repeatable check into Power Query.

Method 4: Use Power Query for Large or Repeatable Files

Power Query is useful when the duplicate check is part of a recurring data-cleaning workflow. It can import data, transform columns, keep duplicate rows, and load the result into a new sheet.

Select a cell or range, then go to Data > Get & Transform Data > From Table/Range.

Opening a table in Power Query from Excel

Confirm the table range and whether your data has headers.

Confirming the Excel table range before Power Query

In Power Query Editor, select the column or columns you want to check.

Selecting columns in Power Query Editor for duplicate checks

Then go to Home > Keep Rows > Keep Duplicates.

Keeping duplicate rows in Power Query

Click Close & Load to return the results to Excel.

Loading Power Query duplicate results back to Excel

Power Query is a strong option for large files and repeatable cleanup, but it usually creates a duplicate-only table instead of simply highlighting the original records. It also expects the user to understand the query steps. If the business rule changes every week, the setup can become slower than the review itself.

If you do not want to maintain formulas or query steps, you can use RowSpeak to describe the rule, ask for a review sheet, and export the result as an Excel workbook.

Method 5: Use RowSpeak AI for Real-World Duplicate Checks

RowSpeak workspace for spreadsheet duplicate review

RowSpeak is useful when the duplicate task is bigger than one Excel menu click. Instead of building a chain of formulas, imports, helper columns, and manual checks, you can describe the rule you want and ask for a reviewable output.

A typical RowSpeak workflow looks like this:

  1. Upload the spreadsheet, CSV export, or set of files you want to check.
  2. Tell RowSpeak which fields define a duplicate.
  3. Ask it to keep the original data unchanged and create a duplicate review sheet.
  4. Review the highlighted records, match rule, and recommended action.
  5. Export the result as an .xlsx file.

Example command:

Using olist_customers_dataset.csv, highlight customer_unique_id values that appear under more than one customer_id. Keep the first occurrence unmarked, highlight later occurrences, and create a Duplicate Review sheet with customer_unique_id, customer_id, customer_city, customer_state, and duplicate_group_id.

This is especially helpful when you already know the business rule but do not want to translate it into nested formulas. For more examples, see the RowSpeak command writing guide.

RowSpeak Scenarios: Multi-File, Mixed Format, Complex Rules

For the examples below, you can use the public Olist e-commerce dataset hosted on Hugging Face. Download these three files before testing:

Test file Why it is useful Download
olist_customers_dataset.csv Customer identity fields such as customer_id, customer_unique_id, city, and state Download customers CSV
olist_orders_dataset.csv Order-level records with order_id, customer_id, status, and timestamps Download orders CSV
olist_order_items_dataset.csv Item-level rows with order_id, product_id, seller_id, price, and freight Download order items CSV

You can also open the Olist dataset file list if you want to inspect the source page before downloading.

1. Find duplicates across multiple Excel files

If the same customer or order appears across several exports, highlighting one sheet is not enough. You first need to combine the files, align the columns, preserve source context, and then run the duplicate rule. For a real test, use the public Olist e-commerce files: olist_customers_dataset.csv, olist_orders_dataset.csv, and olist_order_items_dataset.csv.

Use RowSpeak when you need a review workbook that shows where each duplicate came from:

Compare olist_customers_dataset.csv and olist_orders_dataset.csv. Find customer_unique_id values that map to multiple customer_id values and show which orders belong to each duplicate customer group. Create a Duplicate Review sheet with customer_unique_id, customer_id, order_id, order_status, customer_city, customer_state, source_file, and recommended_action.
Compare olist_orders_dataset.csv and olist_order_items_dataset.csv. Highlight order_id values that appear in both files and flag order_id + product_id + seller_id combinations that appear more than once in order_items. Create a review sheet that separates normal multi-item orders from possible duplicate item rows.

The goal is a non-destructive review: original records preserved, duplicate groups labeled, and source files visible before anyone deletes or merges data.

2. Compare .xlsx and .csv files for duplicates

Duplicate checks often cross file formats. One team sends an Excel workbook, another system exports a CSV, and the column names do not match perfectly. RowSpeak supports common spreadsheet formats including .xlsx, .xls, and .csv, so it can fit workflows where you would otherwise import, reformat, and combine files manually. See the supported file formats reference for the full list.

For your own test, save olist_orders_dataset.csv as olist_orders_dataset.xlsx, then upload it with olist_customers_dataset.csv and olist_order_items_dataset.csv.

Use a command that maps the fields explicitly:

Compare olist_orders_dataset.xlsx with olist_order_items_dataset.csv. Use order_id as the join key. Highlight order_id values with multiple item rows, and create a summary showing item_count, seller_count, and product_count for each order.
Compare olist_customers_dataset.csv with olist_orders_dataset.xlsx. Use customer_id as the join key, then group by customer_unique_id to show customers with more than one order or more than one customer_id. Keep source_file in the output.

This helps with CRM exports, ecommerce orders, finance files, ad platform exports, and other workflows where the duplicate risk lives between systems.

3. Use complex duplicate rules

A duplicate is not always an exact match. In real files, the same person or household may appear under more than one customer record, and the same order may have several item rows that are legitimate, not duplicates. With the Olist files, customer_unique_id is useful for finding repeated customer identities, while order_id + product_id + seller_id is better for detecting suspicious repeated item rows.

Instead of forcing all of that into one formula, state the rule:

In olist_customers_dataset.csv, find customer_unique_id values that appear more than once. Treat those as repeat customer identities, not rows to delete automatically. Highlight the later customer_id records and create a review sheet with city, state, and linked order count.
In olist_order_items_dataset.csv, flag possible duplicate item rows only when order_id, product_id, seller_id, price, and freight_value all match. Do not flag an order_id as a duplicate only because the order has multiple products.
Join olist_orders_dataset.csv and olist_order_items_dataset.csv by order_id. For each order, count item rows and distinct products. Highlight orders where the same product_id appears more than once with the same seller_id and price.

For sensitive finance, HR, or customer data, use anonymized samples for testing and follow your organization's data-handling rules. If you need controlled deployment boundaries, review RowSpeak's private deployment options.

4. Clean the data before checking duplicates

Many duplicate misses happen because values only look the same. Excel may not treat them as equal if there are hidden characters, inconsistent case, dates stored as text, or phone numbers written in different formats.

RowSpeak can fit a combined data cleaning and duplicate review workflow:

Clean the Olist customer and order files before checking duplicates. Trim text fields, standardize city and state casing, confirm order_id and customer_id are treated as text, then find customer_unique_id values linked to multiple customer_id records.
Before checking duplicate item rows, clean olist_order_items_dataset.csv by treating order_item_id as a number and order_id, product_id, and seller_id as text. Then highlight repeated order_id + product_id + seller_id + price combinations and create a summary of how many possible duplicate item rows were found.

Ask for a separate review sheet so you can inspect the result before changing your working file.

Prompt Examples You Can Copy

Use these prompts as starting points and replace the column names with your real headers.

Using olist_customers_dataset.csv, highlight customer_unique_id values that appear under more than one customer_id. Keep the first customer_id unmarked and highlight the later customer_id records for review.
Compare olist_customers_dataset.csv and olist_orders_dataset.csv. Join on customer_id, group by customer_unique_id, and create a review sheet showing repeat customers, order_count, order_status values, customer_city, and customer_state.
Compare olist_orders_dataset.csv and olist_order_items_dataset.csv. Highlight order_id values with multiple item rows, but separate normal multi-product orders from possible duplicate rows where order_id + product_id + seller_id + price are identical.
Use olist_order_items_dataset.csv to find possible duplicate order item records. Treat rows as possible duplicates only when order_id, product_id, seller_id, price, and freight_value all match. Highlight exact duplicate item rows in red and normal multi-item orders in blue.
Create a Duplicate Review sheet with duplicate_group_id, matched_fields, match_rule, source_file, confidence, and recommended_action. Keep the original sheet unchanged.

Comparing the Methods

Method Best for Strength Limitation
Conditional Formatting Simple duplicate cells in one range Fastest built-in answer Limited for multi-column, messy, or cross-file rules
COUNTIF / COUNTIFS Custom formulas and second-occurrence rules Good control inside Excel Fragile when rules get complex
Power Query Large or repeatable cleanup workflows Strong import and transformation tools More setup, steeper learning curve
RowSpeak Multi-file, mixed-format, messy, or rule-based duplicate checks Plain-language rules and reviewable outputs You still need to review results before deleting or merging records

Excel Conditional Formatting is enough for simple one-column checks. Use RowSpeak when the work becomes a file-based review problem rather than a single formatting rule.

Troubleshooting: Why Excel Is Not Highlighting Duplicates

The values have extra spaces

"Acme" and "Acme " look similar, but they are not the same value. Use TRIM() in Excel or ask RowSpeak to trim spaces before checking duplicates.

The values contain hidden characters

Copied data from PDFs, websites, CRM exports, and accounting systems can contain non-printing characters. In Excel, try CLEAN() and TRIM(). In RowSpeak, ask for a cleaning step before duplicate detection.

Dates or numbers are stored as text

Excel may treat 00123, 123, and a text-formatted 123 differently depending on the column format and rule. Standardize the field before checking duplicates.

The duplicate rule needs more than one column

If the duplicate definition is "same email and same order date," the basic Duplicate Values rule is too broad. Use COUNTIFS, Power Query, or a RowSpeak prompt that names the fields.

You need to find duplicates before removing them

Highlight first, review second, delete or merge last. Removing duplicates can permanently delete rows from the active dataset, so keep a backup or create a review sheet first. If you are ready for the deletion workflow, read the guide to remove duplicates with AI.

FAQ

How do I highlight duplicate values in Excel?

Select the range, then go to Home > Conditional Formatting > Highlight Cells Rules > Duplicate Values. Choose a format and click OK.

How do I highlight duplicate cells in one column in Excel?

Select the column cells you want to check, then apply the Duplicate Values conditional formatting rule. If you do not want to include the header, select only the data cells.

How do I highlight duplicate rows in Excel?

Use a formula-based conditional formatting rule with COUNTIFS. For example, if columns A and B define a duplicate row, use =COUNTIFS($A:$A,$A2,$B:$B,$B2)>1.

How do I highlight duplicates in two columns in Excel?

If you want duplicate combinations, use COUNTIFS across both columns. If you want to find values that appear in either column, use a separate formula that checks each value against the combined range.

How do I highlight duplicates except the first occurrence?

Use an expanding COUNTIF formula such as =COUNTIF($A$2:$A2,$A2)>1. This keeps the first value unmarked and highlights later repeats.

Can I compare a CSV and Excel file for duplicates?

Yes. You can import the CSV into Excel and build a formula or Power Query workflow. If the files have different column names or need cleanup first, upload them to RowSpeak and describe how the fields should map.

Can RowSpeak remove duplicates instead of only highlighting them?

Yes, you can ask RowSpeak to create a cleaned output, but the safer workflow is to highlight duplicates and create a review sheet before deleting anything. Use wording such as: "Do not delete rows until I review the duplicate groups."

Final Thoughts

If you only need to highlight duplicate cells in one Excel range, Conditional Formatting is the right tool. It is fast, free, and already inside Excel.

If your duplicate check involves multiple columns, multiple files, mixed .xlsx and .csv exports, hidden spaces, inconsistent casing, or business rules that need review, use RowSpeak as the next step. Upload the file, state the rule, ask for a non-destructive review sheet, and export a workbook your team can inspect.

Have more than one file or a messy duplicate rule? Highlight duplicates with RowSpeak and create a reviewable Excel output before you delete anything.

Ditch Complex Formulas – Get Insights Instantly

No VBA or function memorization needed. Tell RowSpeak what you need in plain English, and let AI handle data processing, analysis, and chart creation

Try RowSpeak Free Now

Recommended Posts

6 Smart Ways to Find and Remove Duplicates in Excel
Data Cleaning

6 Smart Ways to Find and Remove Duplicates in Excel

Duplicate entries can ruin your data analysis. This comprehensive guide walks you through everything from Excel's built-in 'Remove Duplicates' feature and Power Query to a game-changing AI method that cleans your data with a simple command. Find the perfect technique for your needs.

Ruby
Stop Wrestling with Excel's SEARCH Function: How to Find and Extract Text with AI
Data Cleaning

Stop Wrestling with Excel's SEARCH Function: How to Find and Extract Text with AI

Struggling with messy text data in Excel? Tired of writing complex nested formulas with SEARCH, MID, and LEFT just to extract a username or a middle name? Discover how an Excel AI agent like RowSpeak can do it all with simple language commands, saving you hours of frustration.

Ruby
From Messy Export to Insightful Report: How Excel AI Beats Power Query
Data Cleaning

From Messy Export to Insightful Report: How Excel AI Beats Power Query

Manually cleaning messy sales data in Excel is a nightmare. This guide shows you how to ditch the tedious Power Query steps and use RowSpeak to automatically clean your data and build complex pivot tables with simple chat commands.

Ruby
Tired of Messy Data? Clean and Transform Your Excel Files with AI Instead of Power Query
Data Cleaning

Tired of Messy Data? Clean and Transform Your Excel Files with AI Instead of Power Query

Tired of spending hours cleaning messy Excel files? From splitting text to unpivoting tables, manual data prep is a drag. Discover how an Excel AI agent like RowSpeak can replace complex Power Query steps with simple language commands, saving you time and eliminating errors.

Ruby
A Pro's Guide to Fixing Spacing Issues in Excel - TRIM vs. AI
Data Cleaning

A Pro's Guide to Fixing Spacing Issues in Excel - TRIM vs. AI

Extra spaces in your Excel data can cause major headaches, from failed lookups to broken calculations. Learn the traditional way to fix them with the TRIM() function and its advanced combinations. Then, discover a smarter, faster AI-powered method to clean your data with simple English commands.

Ruby
Tired of Complex FIND & LEFT Formulas? Extract Text in Excel with Simple Language
Data Cleaning

Tired of Complex FIND & LEFT Formulas? Extract Text in Excel with Simple Language

Tired of wrestling with complex nested formulas like FIND, LEFT, and MID just to clean up text in Excel? Discover how to stop wasting hours and start using simple language to extract usernames, split names, or replace text in seconds with an Excel AI agent.

Ruby
Stop Fighting Excel Number Formats: How AI Can Fix Your Data in Seconds
Data Cleaning

Stop Fighting Excel Number Formats: How AI Can Fix Your Data in Seconds

Your sales report has mixed-up dates and currencies, and it's a mess. Instead of clicking through endless 'Format Cells' menus, what if you could just tell Excel what you want? Discover how RowSpeak's AI turns tedious formatting tasks into a simple conversation.

Ruby
Stop Manually Creating Excel Drop-Down Lists: Let AI Do It For You
Data Cleaning

Stop Manually Creating Excel Drop-Down Lists: Let AI Do It For You

Manually creating Excel drop-down lists is tedious and error-prone, especially for dynamic or dependent lists. Discover how an Excel AI like RowSpeak can automate the entire process with simple natural language, saving you hours and ensuring data integrity.

Ruby