ML Work-Flow (Part 5) – Feature Preprocessing - A Blog From Human-engineer-being
We have already discussed the first four steps of the ML work-flow. So far, we preprocessed the crude data with DICTR (Discretization, Integration, Cleaning, Transformation, Reduction), then applied a feature extraction procedure to convert the data into a machine-understandable representation, and finally divided the data into different bunches such as train and test sets. Now it is time to preprocess the feature values and make them ready for the state-of-the-art ML model ;). You may ask, "Why are we so concerned about these?" Okay, I hope now we are clear why we are concerned about these. Henceforth, I'll try to emphasize some basic stuff in our toolkit for feature preprocessing. Caveat 1: One common problem with Scaling and Standardization is that you need to keep the min and max values (for Scaling) and the mean and variance values (for Standardization) computed on the training data, so that the same transformation can be applied to novel data and at test time.
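To make Caveat 1 concrete, here is a minimal sketch in plain Python. The class names `MinMaxScaler1D` and `Standardizer1D` and the toy numbers are hypothetical, made up for illustration only. The idea is that the statistics are learned once from the train set, stored on the object, and then reused unchanged for test or novel data.

```python
from statistics import fmean, pstdev


class MinMaxScaler1D:
    """Hypothetical helper: Scaling with min/max learned from the train set."""

    def fit(self, values):
        # Keep these two numbers; they are needed again at test time.
        self.min_ = min(values)
        self.max_ = max(values)
        return self

    def transform(self, values):
        span = self.max_ - self.min_
        if span == 0:
            # Constant feature: avoid division by zero.
            return [0.0 for _ in values]
        return [(v - self.min_) / span for v in values]


class Standardizer1D:
    """Hypothetical helper: Standardization with mean/std learned from the train set."""

    def fit(self, values):
        # Keep the mean and (population) standard deviation for later reuse.
        self.mean_ = fmean(values)
        self.std_ = pstdev(values)
        return self

    def transform(self, values):
        if self.std_ == 0:
            # Constant feature: avoid division by zero.
            return [0.0 for _ in values]
        return [(v - self.mean_) / self.std_ for v in values]


# Toy feature values (made up for illustration).
train = [2.0, 4.0, 6.0, 8.0]
test = [5.0, 10.0]

# Fit on train only, then apply the stored statistics to both sets.
scaler = MinMaxScaler1D().fit(train)
train_scaled = scaler.transform(train)
test_scaled = scaler.transform(test)

standardizer = Standardizer1D().fit(train)
train_std = standardizer.transform(train)
test_std = standardizer.transform(test)
```

Note that a test value outside the training range (here 10.0, above the train max of 8.0) lands outside [0, 1] after Scaling. That is the expected result of reusing the train statistics rather than recomputing them on the test data.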
Sep-3-2016, 13:21:03 GMT