In Sparkflows, the Multi Windows Ranking processor computes a rank within each partition of a dataset. In this example, it first partitions the data by department and location, then ranks the rows within each partition by salary.
To use the ‘Multi Windows Ranking’ Processor, do the following:
- Select Rank in ‘Windows Function’. The processor then outputs a rank value for each row.
- Enter the columns used for partitioning the dataset in ‘PartitionBy’. In this case, these are ‘department’ and ‘location’.
- Enter the columns used for sorting the dataset in ‘Order By’. In this case, this is ‘Salary’.
- Enter the name of the output column to add to the outgoing DataFrame. It will contain each row's rank within its partition.
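
To make the expected output concrete, the steps above can be sketched in plain Python. This is only an illustration of rank-within-partition semantics, not the processor's actual implementation: the function name `rank_within_partitions`, the sample rows, and the ascending sort order are all assumptions, and ties are handled as in SQL `RANK()` (tied rows share a rank, and the next rank skips ahead).

```python
from itertools import groupby
from operator import itemgetter


def rank_within_partitions(rows, partition_by, order_by, output_col="rank"):
    """Hypothetical helper: add a SQL-style RANK() value to each row."""
    key = itemgetter(*partition_by)
    ranked = []
    # Sort by the partition key so groupby sees each partition as one run.
    for _, group in groupby(sorted(rows, key=key), key=key):
        # Ascending order is an assumption; the processor's default may differ.
        ordered = sorted(group, key=itemgetter(order_by))
        rank, previous = 0, object()
        for position, row in enumerate(ordered, start=1):
            # A new value takes its position as rank; ties reuse the last rank.
            if row[order_by] != previous:
                rank, previous = position, row[order_by]
            ranked.append({**row, output_col: rank})
    return ranked


# Made-up sample data for illustration only.
employees = [
    {"name": "A", "department": "Sales", "location": "NY", "salary": 50000},
    {"name": "B", "department": "Sales", "location": "NY", "salary": 60000},
    {"name": "C", "department": "Sales", "location": "NY", "salary": 50000},
    {"name": "D", "department": "Sales", "location": "SF", "salary": 70000},
    {"name": "E", "department": "HR", "location": "NY", "salary": 40000},
]

result = rank_within_partitions(employees, ["department", "location"], "salary")
for row in result:
    print(row["department"], row["location"], row["name"], row["salary"], row["rank"])
```

In the Sales/NY partition, A and C tie at the lowest salary and both receive rank 1, so B receives rank 3. Each of the other two partitions holds a single row, which receives rank 1.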
For more information, read the Sparkflows Documentation here: