I have a huge list and I need to exclude records that contain any of 24 values. Is there a faster way to do this than adding “Filter” step in the flow and saying “Keep” based on a filter of exclusion?
Hey @OV
Good question! The easiest way to do this in Parabola is to use a “Filter rows” step.
Another option is to import a separate list of exclusionary values and cross-reference them with your original data set. Here’s how you might set that up:
- Import your original dataset and a list of your exclusionary values.
- Use two “Insert Text Column” steps to create a column called
Join
and fill it with the value of1
. - You can reorder your columns using the “Reorder columns” step. This is optional.
- Merge your datasets using the “Combine tables” step. Merge on the shared
Join
column.
At this point, you’ll likely have a bunch of duplicate rows. This is where we can use “Insert if/else column” step to cross-reference our exclusionary values with our records.
Create a new column called Match
if the Records
column contains the values from the {Values}
column. If there is a match, set the column value to Yes
.
- Use a “Filter rows” step to keep rows in the Match column that equal
Yes
.
The resulting data should show a list of records that you want to remove since they contain an exclusionary value.
- The Find overlap step will keep all
Records
from your original data source that do not exist in yourRecords
column after filtering for exclusionary rows.
Still, filtering rows is likely your best method. However, if that list of exclusionary values grows, this will certainly handle that without the need to continually update that step!
Can you elaborate on the settings in Step 4 above?
I’m getting this message
Issue ID: 07870462-8197-4675-a3ef-27cfb7a949bb