It actually does — but not the way you might expect.
Spark DataFrames have no guaranteed row order.
To make row selection deterministic, the node assigns an internal row number using monotonically_increasing_id() followed by row_number().
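This creates a stable processing order, not a semantic one: the numbers reflect how Spark happened to partition and read the data, not any meaning in the data itself.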
Key insight:
Row position here means execution order, not “Excel row number”.
Practical tip:
If row order is business-critical, sort explicitly on a column that defines that order before Select Records.