Big-Data Pipelines with SparkML
Pipelines are a simple way to keep your data preprocessing and modeling code organized. Specifically, a pipeline bundles preprocessing and modeling steps so you can use the whole bundle as if it were a single step. This makes a pipeline a convenient way to design the data preprocessing and machine learning flow. Certain steps must be carried out before the actual ML begins; these steps are called data preprocessing and/or feature engineering.
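To make the "bundle as a single step" idea concrete, here is a minimal plain-Python sketch. It is not Spark code: the classes `StandardScaler`, `LeastSquaresLine`, and `SimplePipeline` are hypothetical names invented for illustration, and the four-point dataset is made up. Spark ML's own `Pipeline` follows the same general pattern of chaining stages behind one `fit` and one `transform` call, but its API works on DataFrames and differs in detail.

```python
from statistics import mean, pstdev


class StandardScaler:
    """Hypothetical preprocessing step: learns mean and std, then rescales."""

    def fit(self, xs, ys=None):
        self.mean_ = mean(xs)
        # Fall back to 1.0 so constant input does not cause division by zero.
        self.std_ = pstdev(xs) or 1.0
        return self

    def transform(self, xs):
        return [(x - self.mean_) / self.std_ for x in xs]


class LeastSquaresLine:
    """Hypothetical modeling step: fits y = slope * x + intercept."""

    def fit(self, xs, ys):
        x_bar, y_bar = mean(xs), mean(ys)
        sxx = sum((x - x_bar) ** 2 for x in xs)
        sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        self.slope_ = sxy / sxx
        self.intercept_ = y_bar - self.slope_ * x_bar
        return self

    def transform(self, xs):
        # For the model stage, "transform" means "predict".
        return [self.slope_ * x + self.intercept_ for x in xs]


class SimplePipeline:
    """Chains stages so the whole bundle behaves like one step."""

    def __init__(self, stages):
        self.stages = stages

    def fit(self, xs, ys):
        data = xs
        # Fit each preprocessing stage, then pass its output to the next one.
        for stage in self.stages[:-1]:
            data = stage.fit(data, ys).transform(data)
        # The last stage (the model) is fitted on the fully preprocessed data.
        self.stages[-1].fit(data, ys)
        return self

    def transform(self, xs):
        data = xs
        for stage in self.stages:
            data = stage.transform(data)
        return data


# Made-up training data that follow y = 2x + 1.
pipeline = SimplePipeline([StandardScaler(), LeastSquaresLine()])
pipeline.fit([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
predictions = pipeline.transform([5.0, 6.0])
print(predictions)
```

Because the scaler is fitted inside the pipeline, new inputs are rescaled with the statistics learned from the training data before they reach the model. The caller only ever calls `fit` and `transform` on the pipeline itself, which is the organizational benefit described above. Since the toy data lie on the line y = 2x + 1, the two predictions should come out close to 11 and 13.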
Dec-6-2020, 08:05:44 GMT