Create new column removing duplicate text

Hey Parabola experts,

I hope someone can help me. I have a table with two columns that you will notice duplicate text between both columns, as you can see in the screenshot below and highlight in red:

Based on this scenario, I want to create a new column removing the duplicate text like the screenshot below

Screenshot 2021-08-25 at 09.08.52

Does anyone have any ideas on how to achieve this?

Hi Mateus,

Good question! The flow pictured below uses some of your example data to remove duplicate text between two columns:

Here’s a brief overview on how to set this up:

  1. Use an Insert row numbers step after pulling in your data. This will help combine your newly formatted data with your original dataset for reference.
  2. Use a Split columns step to create a new row for each word in a cell. Split the words using a space and dash delimiter.
  3. Use a Find overlap step to remove words in your accounts title (2) column that also exist in your accounts title column.
  4. Use a Merge duplicate rows step to merge the values in your newly de-duplicated column using a dash delimiter
  5. Use a Combine tables step to join your data back together.

Copy and paste the snippet below to duplicate the steps in this flow:
parabola:cb:992b9b4d1b374bde9ab4b5bd8318be2a

Let me know if that helps!

Thanks, Daniel!

It makes a lot of sense your flow, however, it didn’t work 100%

I copy the flow created but I don’t know why at the end it doesn’t show the information for all rows.

Do you have any idea why it’s not working properly?

Below you will find a link for the data I’m using and a short video showing the steps used.

Data link: Test 76 - Google Drive

video: Vidyard Recording?

Thanks,
Bruno

Hi @Mateus_Coelho,

Can you try removing the Find Overlap in that snippet and replace it with a new one? Then, plug in the top Split columns step first and the bottom Split columns step back into the Find overlap step.

Additionally, it looks like some of your column values have triple dashes --- and special characters with accents. Use a Find and replace step to replace all triple dashes --- with a single dash - and all special characters with a standard vowel.

Let me now if that helps!

Thanks @daniel!

I had to do a massive data clean-up but the workflow worked in the end.

1 Like