Why is schema inference done only on the first file in preview mode in Read Excel Advanced?

What users observe

In interactive preview, Sparkflows:

  • Reads only the first Excel file

  • Infers schema from only that file


Why this happens

Preview mode is designed for fast feedback, not for a complete and fully accurate view of the data.

To avoid:

  • Long waits

  • Excessive filesystem scans

  • Heavy Excel parsing

To do this, Sparkflows:

  • Limits inference to the first file

  • Limits inference to the first sheet

  • Limits rows using sampling rules
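
The three limits above can be sketched in plain Python. This is an illustration only, not Sparkflows code: the in-memory `workbooks` structure, the names `infer_type` and `infer_preview_schema`, the `max_rows` cap, and the choice of "first file" as the first name in sorted order are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical in-memory stand-in for a folder of Excel files:
# file name -> sheet name -> list of rows (each row is a dict).
Workbooks = Dict[str, Dict[str, List[dict]]]


def infer_type(values: list) -> str:
    """Pick a simple type name for a column from its sampled values."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return "string"
    if all(isinstance(v, bool) for v in non_null):
        return "boolean"
    if all(isinstance(v, int) and not isinstance(v, bool) for v in non_null):
        return "integer"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in non_null):
        return "double"
    return "string"


def infer_preview_schema(workbooks: Workbooks, max_rows: int = 100) -> Dict[str, str]:
    """Infer a schema from the first file, first sheet, and first max_rows rows only."""
    first_file = sorted(workbooks)[0]        # assumption: "first" means first by name
    sheets = workbooks[first_file]
    first_sheet = next(iter(sheets))         # first sheet in workbook order
    sample = sheets[first_sheet][:max_rows]  # row cap standing in for sampling rules
    columns: List[str] = []
    for row in sample:
        for name in row:
            if name not in columns:
                columns.append(name)
    return {name: infer_type([row.get(name) for row in sample]) for name in columns}


workbooks = {
    "a.xlsx": {"Sheet1": [{"id": 1, "amount": 10.5}], "Sheet2": [{"note": "x"}]},
    "b.xlsx": {"Sheet1": [{"id": 2, "amount": 3.0, "region": "EU"}]},
}
print(infer_preview_schema(workbooks))  # {'id': 'integer', 'amount': 'double'}
```

Note that neither the `note` column in the second sheet nor the `region` column in the second file appears in the result, which is the effect users see in preview.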


Batch mode behavior

In batch execution:

  • All files are read

  • All sheets are processed

  • Full union logic is applied
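
For contrast, a minimal sketch of what a full union over all files and sheets could look like. The exact union rules Sparkflows applies are not described here, so this example assumes a union by column name in which missing values are filled with `None`; the `workbooks` structure and the names `union_columns` and `read_all` are hypothetical.

```python
from typing import Dict, List

# Hypothetical in-memory stand-in for a folder of Excel files:
# file name -> sheet name -> list of rows (each row is a dict).
Workbooks = Dict[str, Dict[str, List[dict]]]


def union_columns(workbooks: Workbooks) -> List[str]:
    """Collect every column name across all files and sheets, in first-seen order."""
    columns: List[str] = []
    for file_name in sorted(workbooks):
        for rows in workbooks[file_name].values():
            for row in rows:
                for name in row:
                    if name not in columns:
                        columns.append(name)
    return columns


def read_all(workbooks: Workbooks) -> List[dict]:
    """Read every row from every sheet, filling columns a row lacks with None."""
    columns = union_columns(workbooks)
    combined = []
    for file_name in sorted(workbooks):
        for rows in workbooks[file_name].values():
            for row in rows:
                combined.append({name: row.get(name) for name in columns})
    return combined


workbooks = {
    "a.xlsx": {"Sheet1": [{"id": 1, "amount": 10.5}]},
    "b.xlsx": {"Sheet1": [{"id": 2, "region": "EU"}]},
}
print(union_columns(workbooks))  # ['id', 'amount', 'region']
print(read_all(workbooks)[1])    # {'id': 2, 'amount': None, 'region': 'EU'}
```

Here the `region` column only exists in the second file, so it shows up in the batch result even though a first-file preview would never see it.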


Key principle

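Preview validates structure.
Batch validates truth.

A schema that looks right in preview can still differ from the batch result if later files or sheets contain different columns.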