The Problem
Duplicate names are easy to create and annoying to clean. A list can look normal at first, then you notice the same person was entered twice or three times in different places.
Deleting rows immediately can be risky because the reviewer loses the trail of what was removed. A safer first pass is to mark the first appearance and repeated rows, then use the summary to understand how much cleanup is needed before anything gets deleted.
The flow below shows that audit-first approach. The list keeps every row, while the status column explains which names are first appearances and which rows are repeats.
The Problem: Duplicate Cleanup Should Stay Auditable Marking rows first keeps the original list visible while showing what needs review.
In this workbook, the raw name list stays in column A and the status belongs beside each row. The challenge is to mark first appearances as keep, mark later repeats as duplicate, and complete the three summary checks at the bottom.
- The first time a name appears should stay as the kept row.
- Later appearances of the same name should be marked as duplicates.
- The summary should show total names, unique names, and duplicate rows.
That makes the cleanup easier to explain. The sheet can show the original list, the row-level decision, and the totals without hiding the repeated records.
How We Solve It
The easiest way to do this is to count how many times the current name has appeared from the first row down to the current row. If the count is 1, it is the first appearance. If the count is higher, it is a repeat.
Method 1: Progressive COUNTIF
Method 1: Count each name as the list grows to catch repeats as soon as they appear.
This is the method the validator expects. The range starts at $A$2 and expands as the formula is copied down. That means the first time a name appears, the count is 1 and the row is marked Keep. Later appearances become Duplicate.
This solves the challenge directly because the worksheet needs to decide whether each row is the first appearance of a name or a repeated one. The expanding range checks the history up to the current row, which makes the keep-versus-duplicate label possible.
=IF(COUNTIF($A$2:A2,A2)=1,"Keep","Duplicate")
Method 2: MATCH with ROW
Method 2: Compare each row to the first time that name appears in the list.
This version compares the current row number to the position of the first match. It is another solid way to tell whether the current row is the first occurrence or a later duplicate.
This solves the same duplicate problem with a position check instead of a running count. If the current row lines up with the first match for that name, the entry stays; if not, the row must be a repeat.
=IF(ROW(A2)=MATCH(A2,$A$2:$A$9,0)+ROW($A$2)-1,"Keep","Duplicate")
Method 3: Excel's Remove Duplicates tool
Method 3: Use Excel's built-in tool when you just need a one-time cleanup.
The built-in Remove Duplicates tool is fast when you want a final cleaned list right away. The tradeoff is that it removes rows rather than showing the logic in the worksheet, so it is less useful when you want an auditable process.
This solves the one-time cleanup problem, but not the audit problem. It is useful when the goal is simply to remove repeated names, though it does not leave behind a visible rule or row-by-row status like this challenge does.
Function Explanation
1. COUNTIF
COUNTIF counts how many cells match a condition. In this worksheet it tells us whether the current name is showing up for the first time or again.
The expanding range is the key detail. It only looks from the first data row down to the current row, so the first appearance gets counted as one and later repeats get higher counts.
2. IF
IF turns the count into a readable action label. That keeps the results simple for anyone reviewing the list.
Those labels are easier to filter and count than raw occurrence numbers, especially when the list is being reviewed by someone who did not build the formula.
Learn more this functionIF
3. COUNTA
COUNTA counts non-empty cells and gives the total starting size of the list. That helps the summary confirm how many names were reviewed in the first place.
It is the baseline for the cleanup summary: total rows first, then unique rows and duplicate rows from the status labels.
Learn more this functionCOUNTA
If imported names contain extra spaces, duplicate checks can miss matches that look identical to the eye. In real data, trimming the names first is often a good cleanup step.
Mark each row as keep or duplicate, then complete the summary so the worksheet shows the total number of names, the number of unique names, and the number of duplicate rows.