What users observe
In interactive preview, Sparkflows:
- Reads only the first Excel file
- Infers schema from only that file
Why this happens
Preview mode is designed for fast feedback, not full correctness.
To avoid:
- Long waits
- Excessive filesystem scans
- Heavy Excel parsing
Sparkflows:
- Limits inference to the first file
- Limits inference to the first sheet
- Limits rows using sampling rules
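The three preview limits can be illustrated with a small model. This is a minimal sketch in plain Python, not Sparkflows code: the nested dictionary standing in for a folder of workbooks, the function name `preview_schema`, the `sample_rows` parameter, and the choice of "first file" as the first name in sorted order are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical stand-in for a folder of Excel workbooks:
# file name -> sheet name -> list of rows (each row is a dict).
Workbook = Dict[str, List[dict]]


def preview_schema(files: Dict[str, Workbook], sample_rows: int = 100) -> Dict[str, str]:
    """Infer column types from the first file, first sheet, first N rows only."""
    if not files:
        return {}
    # Assumption: "first file" means first by sorted file name.
    first_file = files[sorted(files)[0]]
    if not first_file:
        return {}
    # Only the first sheet of that file is inspected.
    first_sheet = next(iter(first_file.values()))
    schema: Dict[str, str] = {}
    # Only a bounded sample of rows contributes to the schema.
    for row in first_sheet[:sample_rows]:
        for column, value in row.items():
            schema.setdefault(column, type(value).__name__)
    return schema


files = {
    "sales_01.xlsx": {"Sheet1": [{"id": 1, "amount": 10.5}]},
    "sales_02.xlsx": {"Sheet1": [{"id": 2, "amount": 7.0, "region": "EU"}]},
}

preview = preview_schema(files)
print(preview)  # {'id': 'int', 'amount': 'float'}
```

In this model the `region` column never appears in the preview schema, because it exists only in the second file.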
Batch mode behavior
In batch execution:
- All files are read
- All sheets are processed
- Full union logic is applied
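The batch behavior can be modeled the same way. This is a hedged sketch in plain Python, not the Sparkflows implementation: the nested dictionary standing in for workbooks and the function name `batch_schema` are hypothetical, and the rule that widens conflicting column types to `str` is an assumption, since the actual union logic is not described here.

```python
from typing import Dict, List

# Hypothetical stand-in for a folder of Excel workbooks:
# file name -> sheet name -> list of rows (each row is a dict).
Workbook = Dict[str, List[dict]]


def batch_schema(files: Dict[str, Workbook]) -> Dict[str, str]:
    """Union the columns of every row in every sheet of every file."""
    schema: Dict[str, str] = {}
    for file_name in sorted(files):
        for sheet_rows in files[file_name].values():
            for row in sheet_rows:
                for column, value in row.items():
                    seen = type(value).__name__
                    previous = schema.setdefault(column, seen)
                    if previous != seen:
                        # Assumed rule: conflicting types widen to str.
                        schema[column] = "str"
    return schema


files = {
    "sales_01.xlsx": {"Sheet1": [{"id": 1, "amount": 10.5}]},
    "sales_02.xlsx": {
        "Sheet1": [{"id": 2, "amount": 7.0, "region": "EU"}],
        "Sheet2": [{"id": "A-3", "amount": 1.0}],
    },
}

full = batch_schema(files)
print(full)  # {'id': 'str', 'amount': 'float', 'region': 'str'}
```

Here the full scan surfaces two things a first-file preview would miss: the extra `region` column and the non-numeric `id` value in a later sheet.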
Key principle
Preview validates structure.
Batch validates truth.