Trying to understand Dedupe logic

Link_Ervin · March 20, 2020, 11:38am

Using Parabola’s description literally: Remove duplicate rows based on a key column

The base functionality of this step is easy to grasp - remove duplicates. To do that, you need to specify the column that should be used to determine if any rows are duplicates of each other.

I am trying to remove duplicate phone numbers from records. To simplify it to the most basic level, I’ve split the phones column into rows, so each row has the name, address, and one phone number as column values.
As an example:
John Doe 123 Main St 704-555-1212
John Doe 123 Main St 704-555-1234
John Doe 123 Main St 704-555-1212

Then I’ve asked for a dedupe on the phone column, which should eliminate any row with a duplicate phone value, but it’s not doing that. From my understanding, the example above should leave 2 rows (the second 555-1212 row should be removed). Is that right?
I would appreciate any clarification on this logic. Thanks
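
For illustration, here is a minimal Python sketch of the behavior expected from a dedupe on a key column. It assumes the step keeps the first row for each distinct key value and compares values by exact match. The function name `dedupe_by_key` is hypothetical and is not part of Parabola.

```python
def dedupe_by_key(rows, key):
    """Keep the first row for each distinct value of `key`; drop later repeats."""
    seen = set()
    kept = []
    for row in rows:
        value = row[key]  # exact match: any extra character makes a new value
        if value not in seen:
            seen.add(value)
            kept.append(row)
    return kept


# The three example rows from the post, one phone per row.
rows = [
    {"name": "John Doe", "address": "123 Main St", "phone": "704-555-1212"},
    {"name": "John Doe", "address": "123 Main St", "phone": "704-555-1234"},
    {"name": "John Doe", "address": "123 Main St", "phone": "704-555-1212"},
]

result = dedupe_by_key(rows, "phone")
for row in result:
    print(row["phone"])
```

Under those assumptions, only the first 704-555-1212 row and the 704-555-1234 row remain.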

brian · March 20, 2020, 5:51pm

Hey Link! I would also assume that you should get 2 rows from that example. Perhaps there is a space in one of the rows that is not in the other? Sometimes leading spaces are tricky to see.

Try using a Find Replace step before the dedupe to remove spaces, dashes, and any other formatting that may exist. Then use the dedupe step and see if that helps!
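
As a rough sketch of why hidden formatting matters, the Python below compares an exact-match dedupe with one that strips formatting first. The leading space in the third value is an assumed example of the kind of difference suggested above, and `normalize_phone` is a hypothetical helper, not a Parabola step.

```python
import re


def normalize_phone(phone):
    """Remove every non-digit character (spaces, dashes, parentheses)."""
    return re.sub(r"\D", "", phone)


# The third value has a leading space, which is hard to spot by eye.
phones = ["704-555-1212", "704-555-1234", " 704-555-1212"]

# dict.fromkeys keeps the first occurrence of each value, in order.
exact_unique = list(dict.fromkeys(phones))
normalized_unique = list(dict.fromkeys(normalize_phone(p) for p in phones))

print(len(exact_unique))       # exact match treats the padded value as distinct
print(len(normalized_unique))  # after stripping formatting, the repeat collapses
```

With exact matching all three values count as different, while the normalized values reduce to two.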
