How to filter list based on "contains" with list of values?

I have a huge list and I need to exclude records that contain any of 24 values. Is there a faster way to do this than adding “Filter” step in the flow and saying “Keep” based on a filter of exclusion?

Hey @OV :wave:

Good question! The easiest way to do this in Parabola is to use a “Filter rows” step.

Another option is to import a separate list of exclusionary values and cross-reference them with your original data set. Here’s how you might set that up:

  1. Import your original dataset and a list of your exclusionary values.
  2. Use two “Insert Text Column” steps to create a column called Join and fill it with the value of 1.
  3. You can reorder your columns using the “Reorder columns” step. This is optional.
  4. Merge your datasets using the “Combine tables” step. Merge on the shared Join column.

At this point, you’ll likely have a bunch of duplicate rows. This is where we can use “Insert if/else column” step to cross-reference our exclusionary values with our records.

Create a new column called Match if the Records column contains the values from the {Values} column. If there is a match, set the column value to Yes.

  1. Use a “Filter rows” step to keep rows in the Match column that equal Yes.

The resulting data should show a list of records that you want to remove since they contain an exclusionary value.

  1. The Find overlap step will keep all Records from your original data source that do not exist in your Records column after filtering for exclusionary rows.

Still, filtering rows is likely your best method. However, if that list of exclusionary values grows, this will certainly handle that without the need to continually update that step!

1 Like