Data Preprocessing vs. Data Wrangling in Machine Learning Projects
Machine learning and deep learning projects are becoming increasingly important in most enterprises. The complete process includes data preparation, building an analytic model, and deploying it to production. This is an insights-action loop that continuously improves the analytic models. Forrester calls the complete process and the platform behind it the Insights Platform. A key task when building an appropriate analytic model with machine learning or deep learning techniques is the integration and preparation of data sets from various sources such as files, databases, big data storage, sensors, or social networks. This step can take up to 80 percent of the whole analytics project. This article compares alternative techniques for preparing data, including extract-transform-load (ETL) batch processing, streaming ingestion, and data wrangling.
Mar-6-2017, 11:45:27 GMT