Feature transformation is a function that maps features from one representation to another. Feature transformation techniques are used for several reasons:
- Some data types, such as text or categories, are not suitable to be fed directly into a machine learning algorithm
- Feature values may cause problems during the learning process, e.g. when features are represented on different scales
- We want to reduce the number of features in order to plot and visualize data, speed up training, or improve the accuracy of a specific model
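The first two reasons above can be illustrated with a minimal sketch in plain Python. This is not the Sparkflows or Spark ML API; the function names `index_categories` and `min_max_scale` and the sample data are hypothetical and chosen only for illustration. Spark ML offers transformers in the same spirit, such as `StringIndexer` and `MinMaxScaler`.

```python
from collections import Counter


def index_categories(values):
    """Map each category to an integer index, most frequent first."""
    counts = Counter(values)
    # Sort by descending frequency, then alphabetically to break ties.
    ordered = sorted(counts, key=lambda v: (-counts[v], v))
    mapping = {label: i for i, label in enumerate(ordered)}
    return [mapping[v] for v in values], mapping


def min_max_scale(values):
    """Rescale numeric values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical sample data: one categorical and one numeric feature.
colors = ["red", "blue", "red", "green"]
incomes = [20000.0, 50000.0, 80000.0, 35000.0]

color_idx, color_map = index_categories(colors)
income_scaled = min_max_scale(incomes)

print(color_idx)      # [0, 1, 0, 2]
print(income_scaled)  # [0.0, 0.5, 1.0, 0.25]
```

After the transformation, the categorical feature is numeric and the income feature lies in the same 0 to 1 range as any other scaled feature, so neither dominates a distance-based or gradient-based learner purely because of its units.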
Sparkflows provides many feature transformation nodes through Spark ML. For the full list and more details, see the Feature Transformation page of the Sparkflows 3.0 documentation.