Dedupe + merge without duplicating values in cells

Hello,

I am at the last step of my flow where I merge rows with the same ID.
The problem is for lines like this:

The phone number is the same, the e-mail is the same, the type of partner is different.
Thus, I would like one row with the e-mail, the phone, and “typeA,typeB”.

But I get duplicated values in every cell when all I want is for him to merge cells which have different values. The flow does not detect whether the values he is merging are the same. Also, as in most building blocks, I feel like a “ignore blank cells” would be a great addition.


I have zero idea on working my way around this …

Solved it.

There must be simpler ways to do it but I basically did the following:

  • Unpivot the table to create rows which are ID | TYPEOFDATA | VALUE.
  • Merge the three columns to create unique values in a NEWCOLUMN
  • Dedupe NEWCOLUMN, no merging
  • Filter rows with blank values in VALUE
  • Pivot
  • Merge rows according to ID
  • remove duplicate “,” (EDIT: actually that was a bad idea, I sometimes have “valueA, ,valueB”)

Only problem now is that I am not good at regex and I struggle to remove leading, trailing and duplicate commas without removing commas which actually separate values. :sweat_smile:

Capture

Tipically, because dedupe cannot ignore blank cells, this is what I need to clean to only keep one comma separating values: ,,,,emballage@auto-johndoe.com,,,,,firminy.comptabilite@janedoe.com,

EDIT: I managed to solve part of the issue by replacing duplicated commas with single commas, now I only have leading and trailing commas !
EDIT2: I managed to solve the leading comma thanks to another post by using "/^,"
EDIT3: End of the story, I removed the trailing comma with “[,]$” thanks to lots of testing on regexpal

If you guys have a regex idea … thank you !

SOLUTION (might be some simpler ones):

  1. Unpivot the table to create rows which are ID | TYPEOFDATA | VALUE.
  2. Merge the three columns to create unique values in a NEWCOLUMN
  3. Dedupe NEWCOLUMN, no merging
  4. Filter rows with blank values in VALUE
  5. Pivot
  6. Merge rows according to ID
  7. Find double commas and replace with single commas
  8. Regex to replace /^,* and [*,]$ with nothing.

I wouldn’t have had to do this if there was an “ignore blank” and “don’t merge identical values” in the dedupe step. Might be a good feature, but I already asked for a new feature yesterday. :innocent:
[/quote]

Hi @Joachim_Brindeau! Glad that you were able to find a solution, and really appreciate that you documented everything here as well. This all should be immensely helpful for someone who has a similar issue down the line.

As for the Remove duplicate rows step features, those are great ideas! If you haven’t already, please submit them to #feature-requests for others in the community to vote on and our team to review. Thanks!

2 Likes

I just added the feature request here.
Please vote ! :smile:

2 Likes

My workaround brings my max rows from 25k to 250k because of the pivot step… I use way more credits !
I hope the dedupe step gets fixed soon…