The workflow reads a dataset of houses listed for sale and uses K-Means clustering from Apache Spark ML to group the listings.
Below is the workflow for creating a K-Means model that clusters the houses. It does the following:
- Reads data from a sample dataset.
- Prints the data that was read.
- Assembles the features for prediction.
- Splits the dataset.
- Performs K-Means clustering.
- Runs prediction with the resulting model.
- Prints the prediction result.
