Why does Sparkflows add fileName and sheetName at the very end of the Read Excel Advanced output?

What users observe

When the metadata columns are enabled in Read Excel Advanced, fileName and sheetName always appear after all data columns, never among them.


Why Sparkflows does this

Sparkflows treats fileName and sheetName as metadata: they describe where a row came from, not what the row contains.

By design:

  • Data columns are inferred or enforced first

  • Metadata columns are appended last

  • Core schema remains stable and predictable

This guarantees:

  • Column order consistency

  • No accidental interference with joins, ML models, or aggregations

  • Deterministic schemas across runs
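
The ordering rule above can be illustrated with a small sketch in plain Python. This is not Sparkflows code: the function name build_schema and the sample column names are assumptions made only to show the idea that data columns come first and metadata columns are appended last.

```python
# Illustrative model of the ordering rule, not Sparkflows internals.
METADATA_COLUMNS = ["fileName", "sheetName"]


def build_schema(data_columns, include_metadata):
    """Return the output column order: data columns first, metadata last."""
    schema = list(data_columns)
    if include_metadata:
        schema.extend(METADATA_COLUMNS)
    return schema


# Hypothetical columns read from a sheet.
data_columns = ["order_id", "amount", "region"]

without_meta = build_schema(data_columns, include_metadata=False)
with_meta = build_schema(data_columns, include_metadata=True)

# The data columns keep the same positions whether or not metadata is enabled.
assert with_meta[: len(data_columns)] == without_meta
print(with_meta)
```

Because the metadata names are only ever appended, the data portion of the schema is identical in both cases.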


Why this matters

If metadata columns were instead inserted among the data columns:

  • Column positions would shift whenever the option is toggled

  • Downstream nodes that depend on those positions would break

  • Pipelines would become fragile
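
To make the risk concrete, here is a minimal sketch in plain Python of a downstream step that reads a value by position. The row layouts, the AMOUNT_INDEX constant, and the read_amount helper are hypothetical. The "metadata first" layout is the alternative this article argues against, not something Sparkflows produces.

```python
# Hypothetical downstream step that reads the "amount" value by position.
AMOUNT_INDEX = 1


def read_amount(row):
    """Return the value at the position where 'amount' is expected."""
    return row[AMOUNT_INDEX]


row_no_metadata = ["A-100", 25.0]
row_metadata_last = ["A-100", 25.0, "report.xlsx", "Sheet1"]   # appended at the end
row_metadata_first = ["report.xlsx", "Sheet1", "A-100", 25.0]  # hypothetical alternative

# Appending metadata leaves the data positions untouched.
assert read_amount(row_no_metadata) == read_amount(row_metadata_last) == 25.0

# Inserting metadata ahead of the data silently returns the wrong field.
assert read_amount(row_metadata_first) == "Sheet1"
```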


Recommendation

Treat fileName and sheetName as traceability fields, not business columns.
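
One way to follow this recommendation is to separate the traceability fields from the business columns before a row reaches a model or an aggregation. The sketch below uses plain Python dictionaries. The split_row helper and the sample values are assumptions for illustration, not part of Sparkflows.

```python
TRACEABILITY_FIELDS = {"fileName", "sheetName"}


def split_row(row):
    """Split a row into (business columns, traceability fields)."""
    business = {k: v for k, v in row.items() if k not in TRACEABILITY_FIELDS}
    lineage = {k: v for k, v in row.items() if k in TRACEABILITY_FIELDS}
    return business, lineage


# Hypothetical row as it might look with the metadata columns enabled.
row = {
    "order_id": "A-100",
    "amount": 25.0,
    "fileName": "report.xlsx",
    "sheetName": "Sheet1",
}

business, lineage = split_row(row)

# Only business columns are passed on to modeling or aggregation logic;
# the lineage values stay available for auditing and debugging.
assert business == {"order_id": "A-100", "amount": 25.0}
assert lineage == {"fileName": "report.xlsx", "sheetName": "Sheet1"}
```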