Data Prep for Machine Learning: Missing Data -- Visual Studio Magazine
Turning his attention to the extremely time-consuming task of machine learning data preparation, Dr. James McCaffrey of Microsoft Research explains how to examine data files and how to identify and deal with missing data. Preparing data for use in a machine learning (ML) system is time consuming, tedious, and error prone. A reasonable rule of thumb is that data preparation requires at least 80 percent of the total time needed to create an ML system. There are three main phases of data preparation: cleaning, normalizing and encoding, and splitting. Each of the three main phases has several steps.
Jul-6-2020, 23:48:27 GMT