How do I split my data into unique and duplicate records in Sparkflows?

In Sparkflows, the Find Duplicate node performs exactly the operation you’re looking for. Connect the node to your input dataset and specify the column(s) that should determine uniqueness. The node produces two output dataframes: the upper edge carries the unique records, while the lower edge carries the duplicate records that were found.
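Conceptually, the node's split can be sketched in plain Python. This is an illustrative sketch only, not Sparkflows' actual implementation; it assumes the common semantics where the first occurrence of each key goes to the unique output and later repeats go to the duplicate output (the node's exact tie-breaking may differ):

```python
def split_unique_duplicates(records, key_cols):
    """Split records into (unique, duplicates) by the given key columns.

    First occurrence of a key -> unique list; subsequent repeats -> duplicates.
    Hypothetical helper for illustration, not a Sparkflows API.
    """
    seen = set()
    unique, duplicates = [], []
    for rec in records:
        key = tuple(rec[c] for c in key_cols)
        if key in seen:
            duplicates.append(rec)   # key already seen: route to lower edge
        else:
            seen.add(key)
            unique.append(rec)       # first time seen: route to upper edge
    return unique, duplicates


rows = [
    {"id": 1, "email": "a@x.com"},
    {"id": 2, "email": "b@x.com"},
    {"id": 3, "email": "a@x.com"},  # duplicate email
]
uniq, dups = split_unique_duplicates(rows, ["email"])
# uniq holds ids 1 and 2; dups holds id 3
```

In the Sparkflows UI you only pick the key columns; the two resulting dataframes then flow out of the node's two edges into whatever downstream nodes you attach.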