How can I reliably trace bad values back to the exact Excel cell?

The problem

When rows from many files and sheets are combined into one dataset, a bad value that carries no record of its origin is very hard to trace back to its source.


Sparkflows’ approach

Configure these options when reading Excel:

  • Enable Output File Name as Field

  • Enable Output Sheet Name as Field

  • Set CellRange when applicable

Together, these allow you to:

  • Identify the exact Excel file

  • Identify the exact sheet

  • Narrow the search to the cell range that was read
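
Once the file, sheet, and range are known, a row's position inside the range can be turned into a cell address. The sketch below is illustrative only and is not part of Sparkflows: it assumes CellRange is written in A1 notation (for example "B3:F200"), that offsets are zero-based from the top-left cell of the range, and that row order was preserved when the data was read. The helper names are hypothetical.

```python
import re

def column_letters_to_index(letters):
    """Convert Excel column letters to a 1-based index (A -> 1, AA -> 27)."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index

def index_to_column_letters(index):
    """Convert a 1-based column index back to Excel letters (27 -> AA)."""
    letters = ""
    while index > 0:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters

def cell_address(cell_range, row_offset, col_offset):
    """Return the A1 address of a cell, given zero-based offsets
    from the top-left corner of an A1-style range (assumed format)."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+):([A-Za-z]+)(\d+)", cell_range)
    if match is None:
        raise ValueError(f"not an A1-style range: {cell_range!r}")
    start_col = column_letters_to_index(match.group(1))
    start_row = int(match.group(2))
    return f"{index_to_column_letters(start_col + col_offset)}{start_row + row_offset}"

# Fifth row and third column of a range that starts at B3.
print(cell_address("B3:F200", 4, 2))  # D7
```

If the first row of the range is a header, the first data row sits at row offset 1 rather than 0.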


Practical debugging flow

  1. Filter rows with unexpected nulls

  2. Inspect the fileName and sheetName fields on those rows

  3. Open the source Excel file

  4. Navigate to that sheet and to the specific range
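
The first two steps can be sketched in plain Python. This is a minimal illustration rather than Sparkflows code: it assumes the rows are available as dictionaries that carry the fileName and sheetName fields, and the function name, sample file names, and the "amount" column are hypothetical.

```python
from collections import defaultdict

def locate_null_sources(rows, required_fields):
    """Group rows with missing required values by (fileName, sheetName).

    Returns a dict mapping each (fileName, sheetName) pair to a list with
    one entry per offending row; each entry names the missing fields.
    """
    suspects = defaultdict(list)
    for row in rows:
        missing = [f for f in required_fields if row.get(f) in (None, "")]
        if missing:
            suspects[(row["fileName"], row["sheetName"])].append(missing)
    return dict(suspects)

# Hypothetical sample data standing in for the output of an Excel read.
rows = [
    {"fileName": "sales_q1.xlsx", "sheetName": "Jan", "amount": 120.0},
    {"fileName": "sales_q1.xlsx", "sheetName": "Feb", "amount": None},
    {"fileName": "sales_q2.xlsx", "sheetName": "Apr", "amount": None},
]

for (file_name, sheet_name), missing in locate_null_sources(rows, ["amount"]).items():
    print(f"{file_name} / {sheet_name}: {len(missing)} row(s) with nulls")
```

Grouping by file and sheet first shows whether the nulls come from a single workbook or are spread across many, which tells you which file to open first.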


Recommendation

Always enable the file name and sheet name fields when:

  • Reading directories

  • Processing user-generated Excel files

  • Building pipelines for regulated environments