Merge Columns and remove duplicates

mack · April 4, 2024, 3:15pm

I have a collection of image URLs obtained through web scraping. Occasionally, the scraping tool duplicates certain image URLs. After organizing these URLs into columns labeled Image 1, Image 2, and Image 3, I created a new column named “Images,” where I consolidated them using a delimiter. However, I am having difficulty eliminating the duplicate entries within each row. I know it’s probably something simple, but I’ve tried all the remove-duplicates options, even AI, and can’t seem to remove the duplicate image URLs.

Emory_Stainbrook · April 4, 2024, 10:00pm

Hi @mack - Happy to help! Are the duplicate URLs across the different columns and rows before they’re combined?

mack · April 5, 2024, 1:25pm

Yes, they were in individual columns but in the same row. I combined them into one single column using a delimiter.

daniel · April 5, 2024, 6:40pm

Hi @mack,

If I understand correctly, duplicate image URLs appear in your final Images column. To remove duplicates, I recommend using the following steps after your “Edit columns” step:

Remove duplicates (based on the Page_URL)
Unpivot columns (Page_URL is the unique identifier; unpivot the individual Image_URL columns)
Merge duplicate (merge Value based on unique Page_URL)

Note: Make sure these values are checked to guarantee duplicate Image URLs are removed:

Feel free to copy/paste the snippet below to duplicate the step sequence and configuration listed above:
parabola:cb:19429321-5b70-4ff9-bc3f-2b4cbf25f89d
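
For anyone who wants to check the logic outside Parabola, the step sequence above (remove duplicate rows, unpivot, merge unique values) can be sketched roughly in plain Python. This is only an illustration, not what the Parabola steps run internally: the function name `merge_unique_images`, the `", "` delimiter, and the example.com URLs are assumptions, while the `Page_URL`, `Image 1` to `Image 3`, and `Images` column names come from this thread.

```python
def merge_unique_images(rows, image_columns=("Image 1", "Image 2", "Image 3"),
                        key="Page_URL", delimiter=", "):
    """Hypothetical helper: one output row per page, with unique image URLs merged."""
    merged = {}
    for row in rows:
        page = row[key]
        # "Remove duplicates": keep only the first row seen for each Page_URL.
        if page in merged:
            continue
        # "Unpivot" + "merge": walk the image columns in order, skipping
        # blanks and any URL already collected for this page.
        unique = []
        for col in image_columns:
            url = (row.get(col) or "").strip()
            if url and url not in unique:
                unique.append(url)
        merged[page] = {key: page, "Images": delimiter.join(unique)}
    return list(merged.values())


# Made-up sample data: the scraper repeated a.jpg and c.jpg within a row,
# and the first page was scraped twice.
rows = [
    {"Page_URL": "https://example.com/p1",
     "Image 1": "https://example.com/a.jpg",
     "Image 2": "https://example.com/a.jpg",
     "Image 3": "https://example.com/b.jpg"},
    {"Page_URL": "https://example.com/p1",
     "Image 1": "https://example.com/a.jpg",
     "Image 2": "https://example.com/a.jpg",
     "Image 3": "https://example.com/b.jpg"},
    {"Page_URL": "https://example.com/p2",
     "Image 1": "https://example.com/c.jpg",
     "Image 2": "",
     "Image 3": "https://example.com/c.jpg"},
]

result = merge_unique_images(rows)
for row in result:
    print(row["Page_URL"], "->", row["Images"])
```

The sketch keeps the first occurrence of each URL so the original Image 1, Image 2, Image 3 order is preserved in the merged column.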

Let me know if this helps!

mack · April 8, 2024, 1:41pm

Thanks so much for helping me out!
